Update our documentation concerning where to create data directories.

author Tom Lane <tgl@sss.pgh.pa.us>

Tue, 28 Jul 2015 22:42:59 +0000 (18:42 -0400)

committer Tom Lane <tgl@sss.pgh.pa.us>

Tue, 28 Jul 2015 22:42:59 +0000 (18:42 -0400)
diff --git a/doc/src/sgml/runtime.sgml b/doc/src/sgml/runtime.sgml

index 51cd3083ab3ece0942a427c69370074753e5b07a..06f9555ec061457519df39d86e58a3b6cbd44c06 100644
--- a/doc/src/sgml/runtime.sgml
+++ b/doc/src/sgml/runtime.sgml
@@ -49,7 +49,7 @@
    <para>
     Before you can do anything, you must initialize a database storage
     area on disk. We call this a <firstterm>database cluster</firstterm>.
-   (<acronym>SQL</acronym> uses the term catalog cluster.) A
+   (The <acronym>SQL</acronym> standard uses the term catalog cluster.) A
     database cluster is a collection of databases that is managed by a
     single instance of a running database server. After initialization, a
     database cluster will contain a database named <literal>postgres</literal>,
@@ -65,7 +65,7 @@
    </para>
  
    <para>
-   In file system terms, a database cluster will be a single directory
+   In file system terms, a database cluster is a single directory
     under which all data will be stored. We call this the <firstterm>data
     directory</firstterm> or <firstterm>data area</firstterm>. It is
     completely up to you where you choose to store your data.  There is no
@@ -109,15 +109,18 @@
  
    <para>
     <command>initdb</command> will attempt to create the directory you
-   specify if it does not already exist. It is likely that it will not
-   have the permission to do so (if you followed our advice and created
-   an unprivileged account). In that case you should create the
-   directory yourself (as root) and change the owner to be the
-   <productname>PostgreSQL</productname> user. Here is how this might
-   be done:
+   specify if it does not already exist.  Of course, this will fail if
+   <command>initdb</command> does not have permissions to write in the
+   parent directory.  It's generally recommendable that the
+   <productname>PostgreSQL</productname> user own not just the data
+   directory but its parent directory as well, so that this should not
+   be a problem.  If the desired parent directory doesn't exist either,
+   you will need to create it first, using root privileges if the
+   grandparent directory isn't writable.  So the process might look
+   like this:
  <screen>
-root# <userinput>mkdir /usr/local/pgsql/data</userinput>
-root# <userinput>chown postgres /usr/local/pgsql/data</userinput>
+root# <userinput>mkdir /usr/local/pgsql</userinput>
+root# <userinput>chown postgres /usr/local/pgsql</userinput>
  root# <userinput>su postgres</userinput>
  postgres$ <userinput>initdb -D /usr/local/pgsql/data</userinput>
  </screen>
@@ -125,7 +128,9 @@ postgres$ <userinput>initdb -D /usr/local/pgsql/data</userinput>
  
    <para>
     <command>initdb</command> will refuse to run if the data directory
-   looks like it has already been initialized.</para>
+   exists and already contains files; this is to prevent accidentally
+   overwriting an existing installation.
+  </para>
  
    <para>
     Because the data directory contains all the data stored in the
@@ -175,8 +180,30 @@ postgres$ <userinput>initdb -D /usr/local/pgsql/data</userinput>
     locale setting.  For details see <xref linkend="multibyte">.
    </para>
  
+  <sect2 id="creating-cluster-mount-points">
+   <title>Use of Secondary File Systems</title>
+
+   <indexterm zone="creating-cluster-mount-points">
+    <primary>file system mount points</primary>
+   </indexterm>
+
+   <para>
+    Many installations create their database clusters on file systems
+    (volumes) other than the machine's <quote>root</> volume.  If you
+    choose to do this, it is not advisable to try to use the secondary
+    volume's topmost directory (mount point) as the data directory.
+    Best practice is to create a directory within the mount-point
+    directory that is owned by the <productname>PostgreSQL</productname>
+    user, and then create the data directory within that.  This avoids
+    permissions problems, particularly for operations such
+    as <application>pg_upgrade</>, and it also ensures clean failures if
+    the secondary volume is taken offline.
+   </para>
+
+  </sect2>
+
    <sect2 id="creating-cluster-nfs">
-   <title>Network File Systems</title>
+   <title>Use of Network File Systems</title>
  
     <indexterm zone="creating-cluster-nfs">
      <primary>Network File Systems</primary>
@@ -185,22 +212,30 @@ postgres$ <userinput>initdb -D /usr/local/pgsql/data</userinput>
     <indexterm><primary>Network Attached Storage (<acronym>NAS</>)</><see>Network File Systems</></>
  
     <para>
-    Many installations create database clusters on network file systems.
-    Sometimes this is done directly via <acronym>NFS</>, or by using a
+    Many installations create their database clusters on network file
+    systems.  Sometimes this is done via <acronym>NFS</>, or by using a
      Network Attached Storage (<acronym>NAS</>) device that uses
      <acronym>NFS</> internally.  <productname>PostgreSQL</> does nothing
      special for <acronym>NFS</> file systems, meaning it assumes
-    <acronym>NFS</> behaves exactly like locally-connected drives
-    (<acronym>DAS</>, Direct Attached Storage).  If client and server
-    <acronym>NFS</> implementations have non-standard semantics, this can
+    <acronym>NFS</> behaves exactly like locally-connected drives.
+    If the client or server <acronym>NFS</> implementation does not
+    provide standard file system semantics, this can
      cause reliability problems (see <ulink
      url="http://www.time-travellers.org/shane/papers/NFS_considered_harmful.html"></ulink>).
      Specifically, delayed (asynchronous) writes to the <acronym>NFS</>
-    server can cause reliability problems;   if possible, mount
-    <acronym>NFS</> file systems synchronously (without caching) to avoid
-    this.  Also, soft-mounting <acronym>NFS</> is not recommended.
-    (Storage Area Networks (<acronym>SAN</>) use a low-level
-    communication protocol rather than <acronym>NFS</>.)
+    server can cause data corruption problems.  If possible, mount the
+    <acronym>NFS</> file system synchronously (without caching) to avoid
+    this hazard.  Also, soft-mounting the <acronym>NFS</> file system is
+    not recommended.
+   </para>
+
+   <para>
+    Storage Area Networks (<acronym>SAN</>) typically use communication
+    protocols other than <acronym>NFS</>, and may or may not be subject
+    to hazards of this sort.  It's advisable to consult the vendor's
+    documentation concerning data consistency guarantees.
+    <productname>PostgreSQL</productname> cannot be more reliable than
+    the file system it's using.
     </para>
  
    </sect2>
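
The directory-placement advice in this patch can be sketched as a small pre-flight check to run before `initdb`. This is only an illustration and is not part of the commit or of PostgreSQL: the function name `data_dir_warnings` and the warning strings are hypothetical, and the checks are a rough approximation of the documented advice (do not use a mount point as the data directory, make sure the parent directory exists and is writable, and expect `initdb` to refuse a directory that already contains files).

```python
import os
import tempfile


def data_dir_warnings(data_dir):
    """Return a list of warnings for a proposed data directory.

    Hypothetical helper: it only approximates the advice in the
    documentation and does not reproduce initdb's own checks.
    """
    warnings = []
    data_dir = os.path.abspath(data_dir)
    parent = os.path.dirname(data_dir)

    # Using a volume's mount point directly as the data directory is discouraged.
    if os.path.ismount(data_dir):
        warnings.append("data directory is a mount point; use a subdirectory")

    # Per the patch, initdb refuses to run if the directory exists and contains files.
    if os.path.isdir(data_dir) and os.listdir(data_dir):
        warnings.append("data directory exists and is not empty")

    # To create a missing data directory, the parent must exist and be writable.
    if not os.path.isdir(parent):
        warnings.append("parent directory does not exist")
    elif not os.path.exists(data_dir) and not os.access(parent, os.W_OK):
        warnings.append("parent directory is not writable")

    return warnings


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        # A fresh path under a writable parent should produce no warnings.
        print(data_dir_warnings(os.path.join(tmp, "data")))
```

Run as the account that will own the cluster, an empty result suggests that `initdb -D` on that path is at least not blocked by the problems described above. It says nothing about NFS or SAN semantics, which need to be checked against the mount options and the vendor's documentation.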