<para>
Before you can do anything, you must initialize a database storage
area on disk. We call this a <firstterm>database cluster</firstterm>.
- (<acronym>SQL</acronym> uses the term catalog cluster.) A
+ (The <acronym>SQL</acronym> standard uses the term catalog cluster.) A
database cluster is a collection of databases that is managed by a
single instance of a running database server. After initialization, a
database cluster will contain a database named <literal>postgres</literal>,
</para>
<para>
- In file system terms, a database cluster will be a single directory
+ In file system terms, a database cluster is a single directory
under which all data will be stored. We call this the <firstterm>data
directory</firstterm> or <firstterm>data area</firstterm>. It is
completely up to you where you choose to store your data. There is no
<para>
<command>initdb</command> will attempt to create the directory you
- specify if it does not already exist. It is likely that it will not
- have the permission to do so (if you followed our advice and created
- an unprivileged account). In that case you should create the
- directory yourself (as root) and change the owner to be the
- <productname>PostgreSQL</productname> user. Here is how this might
- be done:
+ specify if it does not already exist. Of course, this will fail if
+ <command>initdb</command> does not have permission to write in the
+ parent directory. It is generally recommended that the
+ <productname>PostgreSQL</productname> user own not just the data
+ directory but its parent directory as well, so that this is not
+ a problem. If the desired parent directory does not exist either,
+ you will need to create it first, using root privileges if the
+ grandparent directory is not writable. So the process might look
+ like this:
<screen>
-root# <userinput>mkdir /usr/local/pgsql/data</userinput>
-root# <userinput>chown postgres /usr/local/pgsql/data</userinput>
+root# <userinput>mkdir /usr/local/pgsql</userinput>
+root# <userinput>chown postgres /usr/local/pgsql</userinput>
root# <userinput>su postgres</userinput>
postgres$ <userinput>initdb -D /usr/local/pgsql/data</userinput>
</screen>
<para>
<command>initdb</command> will refuse to run if the data directory
- looks like it has already been initialized.</para>
+ exists and already contains files; this is to prevent accidentally
+ overwriting an existing installation.
+ </para>
<para>
Because the data directory contains all the data stored in the
locale setting. For details see <xref linkend="multibyte">.
</para>
+ <sect2 id="creating-cluster-mount-points">
+ <title>Use of Secondary File Systems</title>
+
+ <indexterm zone="creating-cluster-mount-points">
+ <primary>file system mount points</primary>
+ </indexterm>
+
+ <para>
+ Many installations create their database clusters on file systems
+ (volumes) other than the machine's <quote>root</> volume. If you
+ choose to do this, it is not advisable to use the secondary
+ volume's topmost directory (mount point) as the data directory.
+ Best practice is to create a directory within the mount-point
+ directory that is owned by the <productname>PostgreSQL</productname>
+ user, and then create the data directory within that. This avoids
+ permissions problems, particularly for operations such
+ as <application>pg_upgrade</>, and it also ensures clean failures if
+ the secondary volume is taken offline.
+ </para>
+
+ </sect2>
+
<sect2 id="creating-cluster-nfs">
- <title>Network File Systems</title>
+ <title>Use of Network File Systems</title>
<indexterm zone="creating-cluster-nfs">
<primary>Network File Systems</primary>
<indexterm><primary>Network Attached Storage (<acronym>NAS</>)</><see>Network File Systems</></>
<para>
- Many installations create database clusters on network file systems.
- Sometimes this is done directly via <acronym>NFS</>, or by using a
+ Many installations create their database clusters on network file
+ systems. Sometimes this is done via <acronym>NFS</>, or by using a
Network Attached Storage (<acronym>NAS</>) device that uses
<acronym>NFS</> internally. <productname>PostgreSQL</> does nothing
special for <acronym>NFS</> file systems, meaning it assumes
- <acronym>NFS</> behaves exactly like locally-connected drives
- (<acronym>DAS</>, Direct Attached Storage). If client and server
- <acronym>NFS</> implementations have non-standard semantics, this can
+ <acronym>NFS</> behaves exactly like locally connected drives.
+ If the client or server <acronym>NFS</> implementation does not
+ provide standard file system semantics, this can
cause reliability problems (see <ulink
url="http://www.time-travellers.org/shane/papers/NFS_considered_harmful.html"></ulink>).
Specifically, delayed (asynchronous) writes to the <acronym>NFS</>
- server can cause reliability problems; if possible, mount
- <acronym>NFS</> file systems synchronously (without caching) to avoid
- this. Also, soft-mounting <acronym>NFS</> is not recommended.
- (Storage Area Networks (<acronym>SAN</>) use a low-level
- communication protocol rather than <acronym>NFS</>.)
+ server can cause data corruption. If possible, mount the
+ <acronym>NFS</> file system synchronously (without caching) to avoid
+ this hazard. Also, soft-mounting the <acronym>NFS</> file system is
+ not recommended.
+ </para>
+
+ <para>
+ Storage Area Networks (<acronym>SAN</>) typically use communication
+ protocols other than <acronym>NFS</>, and may or may not be subject
+ to hazards of this sort. It's advisable to consult the vendor's
+ documentation concerning data consistency guarantees.
+ <productname>PostgreSQL</productname> cannot be more reliable than
+ the file system it's using.
</para>
</sect2>