granicus.if.org Git - postgresql/blob - doc/src/sgml/backup.sgml

   1 <!-- doc/src/sgml/backup.sgml -->
   2
   3 <chapter id="backup">
   4  <title>Backup and Restore</title>
   5
   6  <indexterm zone="backup"><primary>backup</primary></indexterm>
   7
   8  <para>
   9   As with everything that contains valuable data, <productname>PostgreSQL</productname>
  10   databases should be backed up regularly. While the procedure is
  11   essentially simple, it is important to have a clear understanding of
  12   the underlying techniques and assumptions.
  13  </para>
  14
  15  <para>
  16   There are three fundamentally different approaches to backing up
  17   <productname>PostgreSQL</productname> data:
  18   <itemizedlist>
  19    <listitem><para><acronym>SQL</acronym> dump</para></listitem>
  20    <listitem><para>File system level backup</para></listitem>
  21    <listitem><para>Continuous archiving</para></listitem>
  22   </itemizedlist>
  23   Each has its own strengths and weaknesses; each is discussed in turn
  24   in the following sections.
  25  </para>
  26
  27  <sect1 id="backup-dump">
  28   <title><acronym>SQL</acronym> Dump</title>
  29
  30   <para>
  31    The idea behind this dump method is to generate a file with SQL
  32    commands that, when fed back to the server, will recreate the
  33    database in the same state as it was at the time of the dump.
  34    <productname>PostgreSQL</productname> provides the utility program
  35    <xref linkend="app-pgdump"/> for this purpose. The basic usage of this
  36    command is:
  37 <synopsis>
  38 pg_dump <replaceable class="parameter">dbname</replaceable> &gt; <replaceable class="parameter">dumpfile</replaceable>
  39 </synopsis>
  40    As you see, <application>pg_dump</application> writes its result to the
  41    standard output. We will see below how this can be useful.
  42    While the above command creates a text file, <application>pg_dump</application>
  43    can create files in other formats that allow for parallelism and more
  44    fine-grained control of object restoration.
  45   </para>
  46
  47   <para>
  48    <application>pg_dump</application> is a regular <productname>PostgreSQL</productname>
  49    client application (albeit a particularly clever one). This means
  50    that you can perform this backup procedure from any remote host that has
  51    access to the database. But remember that <application>pg_dump</application>
  52    does not operate with special permissions. In particular, it must
  53    have read access to all tables that you want to back up, so in order
  54    to back up the entire database you almost always have to run it as a
  55    database superuser.  (If you do not have sufficient privileges to back up
  56    the entire database, you can still back up portions of the database to which
  57    you do have access using options such as
  58    <option>-n <replaceable>schema</replaceable></option>
  59    or <option>-t <replaceable>table</replaceable></option>.)
  60   </para>
  61
  62   <para>
  63    To specify which database server <application>pg_dump</application> should
  64    contact, use the command line options <option>-h
  65    <replaceable>host</replaceable></option> and <option>-p <replaceable>port</replaceable></option>. The
  66    default host is the local host or whatever your
  67    <envar>PGHOST</envar> environment variable specifies. Similarly,
  68    the default port is indicated by the <envar>PGPORT</envar>
  69    environment variable or, failing that, by the compiled-in default.
  70    (Conveniently, the server will normally have the same compiled-in
  71    default.)
  72   </para>
  73
  74   <para>
  75    Like any other <productname>PostgreSQL</productname> client application,
  76    <application>pg_dump</application> will by default connect with the database
  77    user name that is equal to the current operating system user name. To override
  78    this, either specify the <option>-U</option> option or set the
  79    environment variable <envar>PGUSER</envar>. Remember that
  80    <application>pg_dump</application> connections are subject to the normal
  81    client authentication mechanisms (which are described in <xref
  82    linkend="client-authentication"/>).
  83   </para>
  84
  85   <para>
  86    An important advantage of <application>pg_dump</application> over the other backup
  87    methods described later is that <application>pg_dump</application>'s output can
  88    generally be re-loaded into newer versions of <productname>PostgreSQL</productname>,
  89    whereas file-level backups and continuous archiving are both extremely
  90    server-version-specific.  <application>pg_dump</application> is also the only method
  91    that will work when transferring a database to a different machine
  92    architecture, such as going from a 32-bit to a 64-bit server.
  93   </para>
  94
  95   <para>
  96    Dumps created by <application>pg_dump</application> are internally consistent,
  97    meaning, the dump represents a snapshot of the database at the time
  98    <application>pg_dump</application> began running. <application>pg_dump</application> does not
  99    block other operations on the database while it is working.
 100    (Exceptions are those operations that need to operate with an
 101    exclusive lock, such as most forms of <command>ALTER TABLE</command>.)
 102   </para>
 103
 104   <sect2 id="backup-dump-restore">
 105    <title>Restoring the Dump</title>
 106
 107    <para>
 108     Text files created by <application>pg_dump</application> are intended to
 109     be read in by the <application>psql</application> program. The
 110     general command form to restore a dump is
 111 <synopsis>
 112 psql <replaceable class="parameter">dbname</replaceable> &lt; <replaceable class="parameter">dumpfile</replaceable>
 113 </synopsis>
 114     where <replaceable class="parameter">dumpfile</replaceable> is the
 115     file output by the <application>pg_dump</application> command. The database <replaceable
 116     class="parameter">dbname</replaceable> will not be created by this
 117     command, so you must create it yourself from <literal>template0</literal>
 118     before executing <application>psql</application> (e.g., with
 119     <literal>createdb -T template0 <replaceable
 120     class="parameter">dbname</replaceable></literal>).  <application>psql</application>
 121     supports options similar to <application>pg_dump</application> for specifying
 122     the database server to connect to and the user name to use. See
 123     the <xref linkend="app-psql"/> reference page for more information.
 124     Non-text file dumps are restored using the <xref
 125     linkend="app-pgrestore"/> utility.
 126    </para>
 127
 128    <para>
 129     Before restoring an SQL dump, all the users who own objects or were
 130     granted permissions on objects in the dumped database must already
 131     exist. If they do not, the restore will fail to recreate the
 132     objects with the original ownership and/or permissions.
 133     (Sometimes this is what you want, but usually it is not.)
 134    </para>
 135
 136    <para>
 137     By default, the <application>psql</application> script will continue to
 138     execute after an SQL error is encountered. You might wish to run
 139     <application>psql</application> with
 140     the <literal>ON_ERROR_STOP</literal> variable set to alter that
 141     behavior and have <application>psql</application> exit with an
 142     exit status of 3 if an SQL error occurs:
 143 <programlisting>
 144 psql --set ON_ERROR_STOP=on <replaceable>dbname</replaceable> &lt; <replaceable>dumpfile</replaceable>
 145 </programlisting>
 146     Either way, you will only have a partially restored database.
 147     Alternatively, you can specify that the whole dump should be
 148     restored as a single transaction, so the restore is either fully
 149     completed or fully rolled back. This mode can be specified by
 150     passing the <option>-1</option> or <option>--single-transaction</option>
 151     command-line options to <application>psql</application>. When using this
 152     mode, be aware that even a minor error can rollback a
 153     restore that has already run for many hours. However, that might
 154     still be preferable to manually cleaning up a complex database
 155     after a partially restored dump.
 156    </para>
 157
 158    <para>
 159     The ability of <application>pg_dump</application> and <application>psql</application> to
 160     write to or read from pipes makes it possible to dump a database
 161     directly from one server to another, for example:
 162 <programlisting>
 163 pg_dump -h <replaceable>host1</replaceable> <replaceable>dbname</replaceable> | psql -h <replaceable>host2</replaceable> <replaceable>dbname</replaceable>
 164 </programlisting>
 165    </para>
 166
 167    <important>
 168     <para>
 169      The dumps produced by <application>pg_dump</application> are relative to
 170      <literal>template0</literal>. This means that any languages, procedures,
 171      etc. added via <literal>template1</literal> will also be dumped by
 172      <application>pg_dump</application>. As a result, when restoring, if you are
 173      using a customized <literal>template1</literal>, you must create the
 174      empty database from <literal>template0</literal>, as in the example
 175      above.
 176     </para>
 177    </important>
 178
 179    <para>
 180     After restoring a backup, it is wise to run <xref
 181     linkend="sql-analyze"/> on each
 182     database so the query optimizer has useful statistics;
 183     see <xref linkend="vacuum-for-statistics"/>
 184     and <xref linkend="autovacuum"/> for more information.
 185     For more advice on how to load large amounts of data
 186     into <productname>PostgreSQL</productname> efficiently, refer to <xref
 187     linkend="populate"/>.
 188    </para>
 189   </sect2>
 190
 191   <sect2 id="backup-dump-all">
 192    <title>Using <application>pg_dumpall</application></title>
 193
 194    <para>
 195     <application>pg_dump</application> dumps only a single database at a time,
 196     and it does not dump information about roles or tablespaces
 197     (because those are cluster-wide rather than per-database).
 198     To support convenient dumping of the entire contents of a database
 199     cluster, the <xref linkend="app-pg-dumpall"/> program is provided.
 200     <application>pg_dumpall</application> backs up each database in a given
 201     cluster, and also preserves cluster-wide data such as role and
 202     tablespace definitions. The basic usage of this command is:
 203 <synopsis>
 204 pg_dumpall &gt; <replaceable>dumpfile</replaceable>
 205 </synopsis>
 206     The resulting dump can be restored with <application>psql</application>:
 207 <synopsis>
 208 psql -f <replaceable class="parameter">dumpfile</replaceable> postgres
 209 </synopsis>
 210     (Actually, you can specify any existing database name to start from,
 211     but if you are loading into an empty cluster then <literal>postgres</literal>
 212     should usually be used.)  It is always necessary to have
 213     database superuser access when restoring a <application>pg_dumpall</application>
 214     dump, as that is required to restore the role and tablespace information.
 215     If you use tablespaces, make sure that the tablespace paths in the
 216     dump are appropriate for the new installation.
 217    </para>
 218
 219    <para>
 220     <application>pg_dumpall</application> works by emitting commands to re-create
 221     roles, tablespaces, and empty databases, then invoking
 222     <application>pg_dump</application> for each database.  This means that while
 223     each database will be internally consistent, the snapshots of
 224     different databases are not synchronized.
 225    </para>
 226
 227    <para>
 228     Cluster-wide data can be dumped alone using the
 229     <application>pg_dumpall</application> <option>--globals-only</option> option.
 230     This is necessary to fully backup the cluster if running the
 231     <application>pg_dump</application> command on individual databases.
 232    </para>
 233   </sect2>
 234
 235   <sect2 id="backup-dump-large">
 236    <title>Handling Large Databases</title>
 237
 238    <para>
 239     Some operating systems have maximum file size limits that cause
 240     problems when creating large <application>pg_dump</application> output files.
 241     Fortunately, <application>pg_dump</application> can write to the standard
 242     output, so you can use standard Unix tools to work around this
 243     potential problem.  There are several possible methods:
 244    </para>
 245
 246    <formalpara>
 247     <title>Use compressed dumps.</title>
 248     <para>
 249      You can use your favorite compression program, for example
 250      <application>gzip</application>:
 251
 252 <programlisting>
 253 pg_dump <replaceable class="parameter">dbname</replaceable> | gzip &gt; <replaceable class="parameter">filename</replaceable>.gz
 254 </programlisting>
 255
 256      Reload with:
 257
 258 <programlisting>
 259 gunzip -c <replaceable class="parameter">filename</replaceable>.gz | psql <replaceable class="parameter">dbname</replaceable>
 260 </programlisting>
 261
 262      or:
 263
 264 <programlisting>
 265 cat <replaceable class="parameter">filename</replaceable>.gz | gunzip | psql <replaceable class="parameter">dbname</replaceable>
 266 </programlisting>
 267     </para>
 268    </formalpara>
 269
 270    <formalpara>
 271     <title>Use <command>split</command>.</title>
 272     <para>
 273      The <command>split</command> command
 274      allows you to split the output into smaller files that are
 275      acceptable in size to the underlying file system. For example, to
 276      make chunks of 1 megabyte:
 277
 278 <programlisting>
 279 pg_dump <replaceable class="parameter">dbname</replaceable> | split -b 1m - <replaceable class="parameter">filename</replaceable>
 280 </programlisting>
 281
 282      Reload with:
 283
 284 <programlisting>
 285 cat <replaceable class="parameter">filename</replaceable>* | psql <replaceable class="parameter">dbname</replaceable>
 286 </programlisting>
 287     </para>
 288    </formalpara>
 289
 290    <formalpara>
 291     <title>Use <application>pg_dump</application>'s custom dump format.</title>
 292     <para>
 293      If <productname>PostgreSQL</productname> was built on a system with the
 294      <application>zlib</application> compression library installed, the custom dump
 295      format will compress data as it writes it to the output file. This will
 296      produce dump file sizes similar to using <command>gzip</command>, but it
 297      has the added advantage that tables can be restored selectively. The
 298      following command dumps a database using the custom dump format:
 299
 300 <programlisting>
 301 pg_dump -Fc <replaceable class="parameter">dbname</replaceable> &gt; <replaceable class="parameter">filename</replaceable>
 302 </programlisting>
 303
 304      A custom-format dump is not a script for <application>psql</application>, but
 305      instead must be restored with <application>pg_restore</application>, for example:
 306
 307 <programlisting>
 308 pg_restore -d <replaceable class="parameter">dbname</replaceable> <replaceable class="parameter">filename</replaceable>
 309 </programlisting>
 310
 311      See the <xref linkend="app-pgdump"/> and <xref
 312      linkend="app-pgrestore"/> reference pages for details.
 313     </para>
 314    </formalpara>
 315
 316    <para>
 317     For very large databases, you might need to combine <command>split</command>
 318     with one of the other two approaches.
 319    </para>
 320
 321    <formalpara>
 322     <title>Use <application>pg_dump</application>'s parallel dump feature.</title>
 323     <para>
 324      To speed up the dump of a large database, you can use
 325      <application>pg_dump</application>'s parallel mode. This will dump
 326      multiple tables at the same time. You can control the degree of
 327      parallelism with the <command>-j</command> parameter. Parallel dumps
 328      are only supported for the "directory" archive format.
 329
 330 <programlisting>
 331 pg_dump -j <replaceable class="parameter">num</replaceable> -F d -f <replaceable class="parameter">out.dir</replaceable> <replaceable class="parameter">dbname</replaceable>
 332 </programlisting>
 333
 334      You can use <command>pg_restore -j</command> to restore a dump in parallel.
 335      This will work for any archive of either the "custom" or the "directory"
 336      archive mode, whether or not it has been created with <command>pg_dump -j</command>.
 337     </para>
 338    </formalpara>
 339   </sect2>
 340  </sect1>
 341
 342  <sect1 id="backup-file">
 343   <title>File System Level Backup</title>
 344
 345   <para>
 346    An alternative backup strategy is to directly copy the files that
 347    <productname>PostgreSQL</productname> uses to store the data in the database;
 348    <xref linkend="creating-cluster"/> explains where these files
 349    are located.  You can use whatever method you prefer
 350    for doing file system backups; for example:
 351
 352 <programlisting>
 353 tar -cf backup.tar /usr/local/pgsql/data
 354 </programlisting>
 355   </para>
 356
 357   <para>
 358    There are two restrictions, however, which make this method
 359    impractical, or at least inferior to the <application>pg_dump</application>
 360    method:
 361
 362    <orderedlist>
 363     <listitem>
 364      <para>
 365       The database server <emphasis>must</emphasis> be shut down in order to
 366       get a usable backup. Half-way measures such as disallowing all
 367       connections will <emphasis>not</emphasis> work
 368       (in part because <command>tar</command> and similar tools do not take
 369       an atomic snapshot of the state of the file system,
 370       but also because of internal buffering within the server).
 371       Information about stopping the server can be found in
 372       <xref linkend="server-shutdown"/>.  Needless to say, you
 373       also need to shut down the server before restoring the data.
 374      </para>
 375     </listitem>
 376
 377     <listitem>
 378      <para>
 379       If you have dug into the details of the file system layout of the
 380       database, you might be tempted to try to back up or restore only certain
 381       individual tables or databases from their respective files or
 382       directories. This will <emphasis>not</emphasis> work because the
 383       information contained in these files is not usable without
 384       the commit log files,
 385       <filename>pg_xact/*</filename>, which contain the commit status of
 386       all transactions. A table file is only usable with this
 387       information. Of course it is also impossible to restore only a
 388       table and the associated <filename>pg_xact</filename> data
 389       because that would render all other tables in the database
 390       cluster useless.  So file system backups only work for complete
 391       backup and restoration of an entire database cluster.
 392      </para>
 393     </listitem>
 394    </orderedlist>
 395   </para>
 396
 397   <para>
 398    An alternative file-system backup approach is to make a
 399    <quote>consistent snapshot</quote> of the data directory, if the
 400    file system supports that functionality (and you are willing to
 401    trust that it is implemented correctly).  The typical procedure is
 402    to make a <quote>frozen snapshot</quote> of the volume containing the
 403    database, then copy the whole data directory (not just parts, see
 404    above) from the snapshot to a backup device, then release the frozen
 405    snapshot.  This will work even while the database server is running.
 406    However, a backup created in this way saves
 407    the database files in a state as if the database server was not
 408    properly shut down; therefore, when you start the database server
 409    on the backed-up data, it will think the previous server instance
 410    crashed and will replay the WAL log.  This is not a problem; just
 411    be aware of it (and be sure to include the WAL files in your backup).
 412    You can perform a <command>CHECKPOINT</command> before taking the
 413    snapshot to reduce recovery time.
 414   </para>
 415
 416   <para>
 417    If your database is spread across multiple file systems, there might not
 418    be any way to obtain exactly-simultaneous frozen snapshots of all
 419    the volumes.  For example, if your data files and WAL log are on different
 420    disks, or if tablespaces are on different file systems, it might
 421    not be possible to use snapshot backup because the snapshots
 422    <emphasis>must</emphasis> be simultaneous.
 423    Read your file system documentation very carefully before trusting
 424    the consistent-snapshot technique in such situations.
 425   </para>
 426
 427   <para>
 428    If simultaneous snapshots are not possible, one option is to shut down
 429    the database server long enough to establish all the frozen snapshots.
 430    Another option is to perform a continuous archiving base backup (<xref
 431    linkend="backup-base-backup"/>) because such backups are immune to file
 432    system changes during the backup.  This requires enabling continuous
 433    archiving just during the backup process; restore is done using
 434    continuous archive recovery (<xref linkend="backup-pitr-recovery"/>).
 435   </para>
 436
 437   <para>
 438    Another option is to use <application>rsync</application> to perform a file
 439    system backup.  This is done by first running <application>rsync</application>
 440    while the database server is running, then shutting down the database
 441    server long enough to do an <command>rsync --checksum</command>.
 442    (<option>--checksum</option> is necessary because <command>rsync</command> only
 443    has file modification-time granularity of one second.)  The
 444    second <application>rsync</application> will be quicker than the first,
 445    because it has relatively little data to transfer, and the end result
 446    will be consistent because the server was down.  This method
 447    allows a file system backup to be performed with minimal downtime.
 448   </para>
 449
 450   <para>
 451    Note that a file system backup will typically be larger
 452    than an SQL dump. (<application>pg_dump</application> does not need to dump
 453    the contents of indexes for example, just the commands to recreate
 454    them.)  However, taking a file system backup might be faster.
 455   </para>
 456  </sect1>
 457
 458  <sect1 id="continuous-archiving">
 459   <title>Continuous Archiving and Point-in-Time Recovery (PITR)</title>
 460
 461   <indexterm zone="backup">
 462    <primary>continuous archiving</primary>
 463   </indexterm>
 464
 465   <indexterm zone="backup">
 466    <primary>point-in-time recovery</primary>
 467   </indexterm>
 468
 469   <indexterm zone="backup">
 470    <primary>PITR</primary>
 471   </indexterm>
 472
 473   <para>
 474    At all times, <productname>PostgreSQL</productname> maintains a
 475    <firstterm>write ahead log</firstterm> (WAL) in the <filename>pg_wal/</filename>
 476    subdirectory of the cluster's data directory. The log records
 477    every change made to the database's data files.  This log exists
 478    primarily for crash-safety purposes: if the system crashes, the
 479    database can be restored to consistency by <quote>replaying</quote> the
 480    log entries made since the last checkpoint.  However, the existence
 481    of the log makes it possible to use a third strategy for backing up
 482    databases: we can combine a file-system-level backup with backup of
 483    the WAL files.  If recovery is needed, we restore the file system backup and
 484    then replay from the backed-up WAL files to bring the system to a
 485    current state.  This approach is more complex to administer than
 486    either of the previous approaches, but it has some significant
 487    benefits:
 488   <itemizedlist>
 489    <listitem>
 490     <para>
 491      We do not need a perfectly consistent file system backup as the starting point.
 492      Any internal inconsistency in the backup will be corrected by log
 493      replay (this is not significantly different from what happens during
 494      crash recovery).  So we do not need a file system snapshot capability,
 495      just <application>tar</application> or a similar archiving tool.
 496     </para>
 497    </listitem>
 498    <listitem>
 499     <para>
 500      Since we can combine an indefinitely long sequence of WAL files
 501      for replay, continuous backup can be achieved simply by continuing to archive
 502      the WAL files.  This is particularly valuable for large databases, where
 503      it might not be convenient to take a full backup frequently.
 504     </para>
 505    </listitem>
 506    <listitem>
 507     <para>
 508      It is not necessary to replay the WAL entries all the
 509      way to the end.  We could stop the replay at any point and have a
 510      consistent snapshot of the database as it was at that time.  Thus,
 511      this technique supports <firstterm>point-in-time recovery</firstterm>: it is
 512      possible to restore the database to its state at any time since your base
 513      backup was taken.
 514     </para>
 515    </listitem>
 516    <listitem>
 517     <para>
 518      If we continuously feed the series of WAL files to another
 519      machine that has been loaded with the same base backup file, we
 520      have a <firstterm>warm standby</firstterm> system: at any point we can bring up
 521      the second machine and it will have a nearly-current copy of the
 522      database.
 523     </para>
 524    </listitem>
 525   </itemizedlist>
 526   </para>
 527
 528   <note>
 529    <para>
 530     <application>pg_dump</application> and
 531     <application>pg_dumpall</application> do not produce file-system-level
 532     backups and cannot be used as part of a continuous-archiving solution.
 533     Such dumps are <emphasis>logical</emphasis> and do not contain enough
 534     information to be used by WAL replay.
 535    </para>
 536   </note>
 537
 538   <para>
 539    As with the plain file-system-backup technique, this method can only
 540    support restoration of an entire database cluster, not a subset.
 541    Also, it requires a lot of archival storage: the base backup might be bulky,
 542    and a busy system will generate many megabytes of WAL traffic that
 543    have to be archived.  Still, it is the preferred backup technique in
 544    many situations where high reliability is needed.
 545   </para>
 546
 547   <para>
 548    To recover successfully using continuous archiving (also called
 549    <quote>online backup</quote> by many database vendors), you need a continuous
 550    sequence of archived WAL files that extends back at least as far as the
 551    start time of your backup.  So to get started, you should set up and test
 552    your procedure for archiving WAL files <emphasis>before</emphasis> you take your
 553    first base backup.  Accordingly, we first discuss the mechanics of
 554    archiving WAL files.
 555   </para>
 556
 557   <sect2 id="backup-archiving-wal">
 558    <title>Setting Up WAL Archiving</title>
 559
 560    <para>
 561     In an abstract sense, a running <productname>PostgreSQL</productname> system
 562     produces an indefinitely long sequence of WAL records.  The system
 563     physically divides this sequence into WAL <firstterm>segment
 564     files</firstterm>, which are normally 16MB apiece (although the segment size
 565     can be altered during <application>initdb</application>).  The segment
 566     files are given numeric names that reflect their position in the
 567     abstract WAL sequence.  When not using WAL archiving, the system
 568     normally creates just a few segment files and then
 569     <quote>recycles</quote> them by renaming no-longer-needed segment files
 570     to higher segment numbers.  It's assumed that segment files whose
 571     contents precede the last checkpoint are no longer of
 572     interest and can be recycled.
 573    </para>
 574
 575    <para>
 576     When archiving WAL data, we need to capture the contents of each segment
 577     file once it is filled, and save that data somewhere before the segment
 578     file is recycled for reuse.  Depending on the application and the
 579     available hardware, there could be many different ways of <quote>saving
 580     the data somewhere</quote>: we could copy the segment files to an NFS-mounted
 581     directory on another machine, write them onto a tape drive (ensuring that
 582     you have a way of identifying the original name of each file), or batch
 583     them together and burn them onto CDs, or something else entirely.  To
 584     provide the database administrator with flexibility,
 585     <productname>PostgreSQL</productname> tries not to make any assumptions about how
 586     the archiving will be done.  Instead, <productname>PostgreSQL</productname> lets
 587     the administrator specify a shell command to be executed to copy a
 588     completed segment file to wherever it needs to go.  The command could be
 589     as simple as a <literal>cp</literal>, or it could invoke a complex shell
 590     script &mdash; it's all up to you.
 591    </para>
 592
 593    <para>
 594     To enable WAL archiving, set the <xref linkend="guc-wal-level"/>
 595     configuration parameter to <literal>replica</literal> or higher,
 596     <xref linkend="guc-archive-mode"/> to <literal>on</literal>,
 597     and specify the shell command to use in the <xref
 598     linkend="guc-archive-command"/> configuration parameter.  In practice
 599     these settings will always be placed in the
 600     <filename>postgresql.conf</filename> file.
 601     In <varname>archive_command</varname>,
 602     <literal>%p</literal> is replaced by the path name of the file to
 603     archive, while <literal>%f</literal> is replaced by only the file name.
 604     (The path name is relative to the current working directory,
 605     i.e., the cluster's data directory.)
 606     Use <literal>%%</literal> if you need to embed an actual <literal>%</literal>
 607     character in the command.  The simplest useful command is something
 608     like:
 609 <programlisting>
 610 archive_command = 'test ! -f /mnt/server/archivedir/%f &amp;&amp; cp %p /mnt/server/archivedir/%f'  # Unix
 611 archive_command = 'copy "%p" "C:\\server\\archivedir\\%f"'  # Windows
 612 </programlisting>
 613     which will copy archivable WAL segments to the directory
 614     <filename>/mnt/server/archivedir</filename>.  (This is an example, not a
 615     recommendation, and might not work on all platforms.)  After the
 616     <literal>%p</literal> and <literal>%f</literal> parameters have been replaced,
 617     the actual command executed might look like this:
 618 <programlisting>
 619 test ! -f /mnt/server/archivedir/00000001000000A900000065 &amp;&amp; cp pg_wal/00000001000000A900000065 /mnt/server/archivedir/00000001000000A900000065
 620 </programlisting>
 621     A similar command will be generated for each new file to be archived.
 622    </para>
 623
 624    <para>
 625     The archive command will be executed under the ownership of the same
 626     user that the <productname>PostgreSQL</productname> server is running as.  Since
 627     the series of WAL files being archived contains effectively everything
 628     in your database, you will want to be sure that the archived data is
 629     protected from prying eyes; for example, archive into a directory that
 630     does not have group or world read access.
 631    </para>
 632
 633    <para>
 634     It is important that the archive command return zero exit status if and
 635     only if it succeeds.  Upon getting a zero result,
 636     <productname>PostgreSQL</productname> will assume that the file has been
 637     successfully archived, and will remove or recycle it.  However, a nonzero
 638     status tells <productname>PostgreSQL</productname> that the file was not archived;
 639     it will try again periodically until it succeeds.
 640    </para>
 641
 642    <para>
 643     The archive command should generally be designed to refuse to overwrite
 644     any pre-existing archive file.  This is an important safety feature to
 645     preserve the integrity of your archive in case of administrator error
 646     (such as sending the output of two different servers to the same archive
 647     directory).
 648    </para>
 649
 650    <para>
 651     It is advisable to test your proposed archive command to ensure that it
 652     indeed does not overwrite an existing file, <emphasis>and that it returns
 653     nonzero status in this case</emphasis>.
 654     The example command above for Unix ensures this by including a separate
 655     <command>test</command> step.  On some Unix platforms, <command>cp</command> has
 656     switches such as <option>-i</option> that can be used to do the same thing
 657     less verbosely, but you should not rely on these without verifying that
 658     the right exit status is returned.  (In particular, GNU <command>cp</command>
 659     will return status zero when <option>-i</option> is used and the target file
 660     already exists, which is <emphasis>not</emphasis> the desired behavior.)
 661    </para>
 662
 663    <para>
 664     While designing your archiving setup, consider what will happen if
 665     the archive command fails repeatedly because some aspect requires
 666     operator intervention or the archive runs out of space. For example, this
 667     could occur if you write to tape without an autochanger; when the tape
 668     fills, nothing further can be archived until the tape is swapped.
 669     You should ensure that any error condition or request to a human operator
 670     is reported appropriately so that the situation can be
 671     resolved reasonably quickly. The <filename>pg_wal/</filename> directory will
 672     continue to fill with WAL segment files until the situation is resolved.
 673     (If the file system containing <filename>pg_wal/</filename> fills up,
 674     <productname>PostgreSQL</productname> will do a PANIC shutdown.  No committed
 675     transactions will be lost, but the database will remain offline until
 676     you free some space.)
 677    </para>
 678
 679    <para>
 680     The speed of the archiving command is unimportant as long as it can keep up
 681     with the average rate at which your server generates WAL data.  Normal
 682     operation continues even if the archiving process falls a little behind.
 683     If archiving falls significantly behind, this will increase the amount of
 684     data that would be lost in the event of a disaster. It will also mean that
 685     the <filename>pg_wal/</filename> directory will contain large numbers of
 686     not-yet-archived segment files, which could eventually exceed available
 687     disk space. You are advised to monitor the archiving process to ensure that
 688     it is working as you intend.
 689    </para>
 690
 691    <para>
 692     In writing your archive command, you should assume that the file names to
 693     be archived can be up to 64 characters long and can contain any
 694     combination of ASCII letters, digits, and dots.  It is not necessary to
 695     preserve the original relative path (<literal>%p</literal>) but it is necessary to
 696     preserve the file name (<literal>%f</literal>).
 697    </para>
 698
 699    <para>
 700     Note that although WAL archiving will allow you to restore any
 701     modifications made to the data in your <productname>PostgreSQL</productname> database,
 702     it will not restore changes made to configuration files (that is,
 703     <filename>postgresql.conf</filename>, <filename>pg_hba.conf</filename> and
 704     <filename>pg_ident.conf</filename>), since those are edited manually rather
 705     than through SQL operations.
 706     You might wish to keep the configuration files in a location that will
 707     be backed up by your regular file system backup procedures.  See
 708     <xref linkend="runtime-config-file-locations"/> for how to relocate the
 709     configuration files.
 710    </para>
 711
 712    <para>
 713     The archive command is only invoked on completed WAL segments.  Hence,
 714     if your server generates only little WAL traffic (or has slack periods
 715     where it does so), there could be a long delay between the completion
 716     of a transaction and its safe recording in archive storage.  To put
 717     a limit on how old unarchived data can be, you can set
 718     <xref linkend="guc-archive-timeout"/> to force the server to switch
 719     to a new WAL segment file at least that often.  Note that archived
 720     files that are archived early due to a forced switch are still the same
 721     length as completely full files.  It is therefore unwise to set a very
 722     short <varname>archive_timeout</varname> &mdash; it will bloat your archive
 723     storage.  <varname>archive_timeout</varname> settings of a minute or so are
 724     usually reasonable.
 725    </para>
 726
 727    <para>
 728     Also, you can force a segment switch manually with
 729     <function>pg_switch_wal</function> if you want to ensure that a
 730     just-finished transaction is archived as soon as possible.  Other utility
 731     functions related to WAL management are listed in <xref
 732     linkend="functions-admin-backup-table"/>.
 733    </para>
 734
 735    <para>
 736     When <varname>wal_level</varname> is <literal>minimal</literal> some SQL commands
 737     are optimized to avoid WAL logging, as described in <xref
 738     linkend="populate-pitr"/>.  If archiving or streaming replication were
 739     turned on during execution of one of these statements, WAL would not
 740     contain enough information for archive recovery.  (Crash recovery is
 741     unaffected.)  For this reason, <varname>wal_level</varname> can only be changed at
 742     server start.  However, <varname>archive_command</varname> can be changed with a
 743     configuration file reload.  If you wish to temporarily stop archiving,
 744     one way to do it is to set <varname>archive_command</varname> to the empty
 745     string (<literal>''</literal>).
 746     This will cause WAL files to accumulate in <filename>pg_wal/</filename> until a
 747     working <varname>archive_command</varname> is re-established.
 748    </para>
 749   </sect2>
 750
 751   <sect2 id="backup-base-backup">
 752    <title>Making a Base Backup</title>
 753
 754    <para>
 755     The easiest way to perform a base backup is to use the
 756     <xref linkend="app-pgbasebackup"/> tool. It can create
 757     a base backup either as regular files or as a tar archive. If more
 758     flexibility than <xref linkend="app-pgbasebackup"/> can provide is
 759     required, you can also make a base backup using the low level API
 760     (see <xref linkend="backup-lowlevel-base-backup"/>).
 761    </para>
 762
 763    <para>
 764     It is not necessary to be concerned about the amount of time it takes
 765     to make a base backup. However, if you normally run the
 766     server with <varname>full_page_writes</varname> disabled, you might notice a drop
 767     in performance while the backup runs since <varname>full_page_writes</varname> is
 768     effectively forced on during backup mode.
 769    </para>
 770
 771    <para>
 772     To make use of the backup, you will need to keep all the WAL
 773     segment files generated during and after the file system backup.
 774     To aid you in doing this, the base backup process
 775     creates a <firstterm>backup history file</firstterm> that is immediately
 776     stored into the WAL archive area. This file is named after the first
 777     WAL segment file that you need for the file system backup.
 778     For example, if the starting WAL file is
 779     <literal>0000000100001234000055CD</literal> the backup history file will be
 780     named something like
 781     <literal>0000000100001234000055CD.007C9330.backup</literal>. (The second
 782     part of the file name stands for an exact position within the WAL
 783     file, and can ordinarily be ignored.) Once you have safely archived
 784     the file system backup and the WAL segment files used during the
 785     backup (as specified in the backup history file), all archived WAL
 786     segments with names numerically less are no longer needed to recover
 787     the file system backup and can be deleted. However, you should
 788     consider keeping several backup sets to be absolutely certain that
 789     you can recover your data.
 790    </para>
 791
 792    <para>
 793     The backup history file is just a small text file. It contains the
 794     label string you gave to <xref linkend="app-pgbasebackup"/>, as well as
 795     the starting and ending times and WAL segments of the backup.
 796     If you used the label to identify the associated dump file,
 797     then the archived history file is enough to tell you which dump file to
 798     restore.
 799    </para>
 800
 801    <para>
 802     Since you have to keep around all the archived WAL files back to your
 803     last base backup, the interval between base backups should usually be
 804     chosen based on how much storage you want to expend on archived WAL
 805     files.  You should also consider how long you are prepared to spend
 806     recovering, if recovery should be necessary &mdash; the system will have to
 807     replay all those WAL segments, and that could take awhile if it has
 808     been a long time since the last base backup.
 809    </para>
 810   </sect2>
 811
 812   <sect2 id="backup-lowlevel-base-backup">
 813    <title>Making a Base Backup Using the Low Level API</title>
 814    <para>
 815     The procedure for making a base backup using the low level
 816     APIs contains a few more steps than
 817     the <xref linkend="app-pgbasebackup"/> method, but is relatively
 818     simple. It is very important that these steps are executed in
 819     sequence, and that the success of a step is verified before
 820     proceeding to the next step.
 821    </para>
 822    <para>
 823     Low level base backups can be made in a non-exclusive or an exclusive
 824     way. The non-exclusive method is recommended and the exclusive one is
 825     deprecated and will eventually be removed.
 826    </para>
 827
 828    <sect3 id="backup-lowlevel-base-backup-nonexclusive">
 829     <title>Making a Non-Exclusive Low-Level Backup</title>
 830     <para>
 831      A non-exclusive low level backup is one that allows other
 832      concurrent backups to be running (both those started using
 833      the same backup API and those started using
 834      <xref linkend="app-pgbasebackup"/>).
 835     </para>
 836     <para>
 837   <orderedlist>
 838    <listitem>
 839     <para>
 840      Ensure that WAL archiving is enabled and working.
 841     </para>
 842    </listitem>
 843    <listitem>
 844     <para>
 845      Connect to the server (it does not matter which database) as a user with
 846      rights to run pg_start_backup (superuser, or a user who has been granted
 847      EXECUTE on the function) and issue the command:
 848 <programlisting>
 849 SELECT pg_start_backup('label', false, false);
 850 </programlisting>
 851      where <literal>label</literal> is any string you want to use to uniquely
 852      identify this backup operation. The connection
 853      calling <function>pg_start_backup</function> must be maintained until the end of
 854      the backup, or the backup will be automatically aborted.
 855     </para>
 856
 857     <para>
 858      By default, <function>pg_start_backup</function> can take a long time to finish.
 859      This is because it performs a checkpoint, and the I/O
 860      required for the checkpoint will be spread out over a significant
 861      period of time, by default half your inter-checkpoint interval
 862      (see the configuration parameter
 863      <xref linkend="guc-checkpoint-completion-target"/>).  This is
 864      usually what you want, because it minimizes the impact on query
 865      processing.  If you want to start the backup as soon as
 866      possible, change the second parameter to <literal>true</literal>, which will
 867      issue an immediate checkpoint using as much I/O as available.
 868     </para>
 869
 870     <para>
 871      The third parameter being <literal>false</literal> tells
 872      <function>pg_start_backup</function> to initiate a non-exclusive base backup.
 873     </para>
 874    </listitem>
 875    <listitem>
 876     <para>
 877      Perform the backup, using any convenient file-system-backup tool
 878      such as <application>tar</application> or <application>cpio</application> (not
 879      <application>pg_dump</application> or
 880      <application>pg_dumpall</application>).  It is neither
 881      necessary nor desirable to stop normal operation of the database
 882      while you do this. See
 883      <xref linkend="backup-lowlevel-base-backup-data"/> for things to
 884      consider during this backup.
 885     </para>
 886    </listitem>
 887    <listitem>
 888     <para>
 889      In the same connection as before, issue the command:
 890 <programlisting>
 891 SELECT * FROM pg_stop_backup(false, true);
 892 </programlisting>
 893      This terminates backup mode. On a primary, it also performs an automatic
 894      switch to the next WAL segment.  On a standby, it is not possible to
 895      automatically switch WAL segments, so you may wish to run
 896      <function>pg_switch_wal</function> on the primary to perform a manual
 897      switch. The reason for the switch is to arrange for
 898      the last WAL segment file written during the backup interval to be
 899      ready to archive.
 900     </para>
 901     <para>
 902      The <function>pg_stop_backup</function> will return one row with three
 903      values. The second of these fields should be written to a file named
 904      <filename>backup_label</filename> in the root directory of the backup. The
 905      third field should be written to a file named
 906      <filename>tablespace_map</filename> unless the field is empty. These files are
 907      vital to the backup working, and must be written without modification.
 908     </para>
 909    </listitem>
 910    <listitem>
 911     <para>
 912      Once the WAL segment files active during the backup are archived, you are
 913      done.  The file identified by <function>pg_stop_backup</function>'s first return
 914      value is the last segment that is required to form a complete set of
 915      backup files.  On a primary, if <varname>archive_mode</varname> is enabled and the
 916      <literal>wait_for_archive</literal> parameter is <literal>true</literal>,
 917      <function>pg_stop_backup</function> does not return until the last segment has
 918      been archived.
 919      On a standby, <varname>archive_mode</varname> must be <literal>always</literal> in order
 920      for <function>pg_stop_backup</function> to wait.
 921      Archiving of these files happens automatically since you have
 922      already configured <varname>archive_command</varname>. In most cases this
 923      happens quickly, but you are advised to monitor your archive
 924      system to ensure there are no delays.
 925      If the archive process has fallen behind
 926      because of failures of the archive command, it will keep retrying
 927      until the archive succeeds and the backup is complete.
 928      If you wish to place a time limit on the execution of
 929      <function>pg_stop_backup</function>, set an appropriate
 930      <varname>statement_timeout</varname> value, but make note that if
 931      <function>pg_stop_backup</function> terminates because of this your backup
 932      may not be valid.
 933     </para>
 934     <para>
 935      If the backup process monitors and ensures that all WAL segment files
 936      required for the backup are successfully archived then the
 937      <literal>wait_for_archive</literal> parameter (which defaults to true) can be set
 938      to false to have
 939      <function>pg_stop_backup</function> return as soon as the stop backup record is
 940      written to the WAL.  By default, <function>pg_stop_backup</function> will wait
 941      until all WAL has been archived, which can take some time.  This option
 942      must be used with caution: if WAL archiving is not monitored correctly
 943      then the backup might not include all of the WAL files and will
 944      therefore be incomplete and not able to be restored.
 945     </para>
 946    </listitem>
 947   </orderedlist>
 948     </para>
 949    </sect3>
 950    <sect3 id="backup-lowlevel-base-backup-exclusive">
 951     <title>Making an Exclusive Low-Level Backup</title>
 952
 953     <note>
 954      <para>
 955       The exclusive backup method is deprecated and should be avoided.
 956       Prior to <productname>PostgreSQL</productname> 9.6, this was the only
 957       low-level method available, but it is now recommended that all users
 958       upgrade their scripts to use non-exclusive backups.
 959      </para>
 960     </note>
 961
 962     <para>
 963      The process for an exclusive backup is mostly the same as for a
 964      non-exclusive one, but it differs in a few key steps. This type of
 965      backup can only be taken on a primary and does not allow concurrent
 966      backups.  Moreover, because it creates a backup label file, as
 967      described below, it can block automatic restart of the master server
 968      after a crash.  On the other hand, the erroneous removal of this
 969      file from a backup or standby is a common mistake, which can result
 970      in serious data corruption.  If it is necessary to use this method,
 971      the following steps may be used.
 972     </para>
 973     <para>
 974   <orderedlist>
 975    <listitem>
 976     <para>
 977      Ensure that WAL archiving is enabled and working.
 978     </para>
 979    </listitem>
 980    <listitem>
 981     <para>
 982      Connect to the server (it does not matter which database) as a user with
 983      rights to run pg_start_backup (superuser, or a user who has been granted
 984      EXECUTE on the function) and issue the command:
 985 <programlisting>
 986 SELECT pg_start_backup('label');
 987 </programlisting>
 988      where <literal>label</literal> is any string you want to use to uniquely
 989      identify this backup operation.
 990      <function>pg_start_backup</function> creates a <firstterm>backup label</firstterm> file,
 991      called <filename>backup_label</filename>, in the cluster directory with
 992      information about your backup, including the start time and label string.
 993      The function also creates a <firstterm>tablespace map</firstterm> file,
 994      called <filename>tablespace_map</filename>, in the cluster directory with
 995      information about tablespace symbolic links in <filename>pg_tblspc/</filename> if
 996      one or more such link is present.  Both files are critical to the
 997      integrity of the backup, should you need to restore from it.
 998     </para>
 999
1000     <para>
1001      By default, <function>pg_start_backup</function> can take a long time to finish.
1002      This is because it performs a checkpoint, and the I/O
1003      required for the checkpoint will be spread out over a significant
1004      period of time, by default half your inter-checkpoint interval
1005      (see the configuration parameter
1006      <xref linkend="guc-checkpoint-completion-target"/>).  This is
1007      usually what you want, because it minimizes the impact on query
1008      processing.  If you want to start the backup as soon as
1009      possible, use:
1010 <programlisting>
1011 SELECT pg_start_backup('label', true);
1012 </programlisting>
1013      This forces the checkpoint to be done as quickly as possible.
1014     </para>
1015    </listitem>
1016    <listitem>
1017     <para>
1018      Perform the backup, using any convenient file-system-backup tool
1019      such as <application>tar</application> or <application>cpio</application> (not
1020      <application>pg_dump</application> or
1021      <application>pg_dumpall</application>).  It is neither
1022      necessary nor desirable to stop normal operation of the database
1023      while you do this. See
1024      <xref linkend="backup-lowlevel-base-backup-data"/> for things to
1025      consider during this backup.
1026     </para>
1027     <para>
1028      As noted above, if the server crashes during the backup it may not be
1029      possible to restart until the <filename>backup_label</filename> file has
1030      been manually deleted from the <envar>PGDATA</envar> directory.  Note
1031      that it is very important to never remove the
1032      <filename>backup_label</filename> file when restoring a backup, because
1033      this will result in corruption.  Confusion about when it is appropriate
1034      to remove this file is a common cause of data corruption when using this
1035      method; be very certain that you remove the file only on an existing
1036      master and never when building a standby or restoring a backup, even if
1037      you are building a standby that will subsequently be promoted to a new
1038      master.
1039     </para>
1040    </listitem>
1041    <listitem>
1042     <para>
1043      Again connect to the database as a user with rights to run
1044      pg_stop_backup (superuser, or a user who has been granted EXECUTE on
1045      the function), and issue the command:
1046 <programlisting>
1047 SELECT pg_stop_backup();
1048 </programlisting>
1049      This function terminates backup mode and
1050      performs an automatic switch to the next WAL segment. The reason for the
1051      switch is to arrange for the last WAL segment written during the backup
1052      interval to be ready to archive.
1053     </para>
1054    </listitem>
1055    <listitem>
1056     <para>
1057      Once the WAL segment files active during the backup are archived, you are
1058      done.  The file identified by <function>pg_stop_backup</function>'s result is
1059      the last segment that is required to form a complete set of backup files.
1060      If <varname>archive_mode</varname> is enabled,
1061      <function>pg_stop_backup</function> does not return until the last segment has
1062      been archived.
1063      Archiving of these files happens automatically since you have
1064      already configured <varname>archive_command</varname>. In most cases this
1065      happens quickly, but you are advised to monitor your archive
1066      system to ensure there are no delays.
1067      If the archive process has fallen behind
1068      because of failures of the archive command, it will keep retrying
1069      until the archive succeeds and the backup is complete.
1070     </para>
1071
1072     <para>
1073      When using exclusive backup mode, it is absolutely imperative to ensure
1074      that <function>pg_stop_backup</function> completes successfully at the
1075      end of the backup.  Even if the backup itself fails, for example due to
1076      lack of disk space, failure to call <function>pg_stop_backup</function>
1077      will leave the server in backup mode indefinitely, causing future backups
1078      to fail and increasing the risk of a restart failure during the time that
1079      <filename>backup_label</filename> exists.
1080     </para>
1081    </listitem>
1082   </orderedlist>
1083     </para>
1084    </sect3>
1085    <sect3 id="backup-lowlevel-base-backup-data">
1086    <title>Backing Up the Data Directory</title>
1087    <para>
1088     Some file system backup tools emit warnings or errors
1089     if the files they are trying to copy change while the copy proceeds.
1090     When taking a base backup of an active database, this situation is normal
1091     and not an error.  However, you need to ensure that you can distinguish
1092     complaints of this sort from real errors.  For example, some versions
1093     of <application>rsync</application> return a separate exit code for
1094     <quote>vanished source files</quote>, and you can write a driver script to
1095     accept this exit code as a non-error case.  Also, some versions of
1096     GNU <application>tar</application> return an error code indistinguishable from
1097     a fatal error if a file was truncated while <application>tar</application> was
1098     copying it.  Fortunately, GNU <application>tar</application> versions 1.16 and
1099     later exit with 1 if a file was changed during the backup,
1100     and 2 for other errors.  With GNU <application>tar</application> version 1.23 and
1101     later, you can use the warning options <literal>--warning=no-file-changed
1102     --warning=no-file-removed</literal> to hide the related warning messages.
1103    </para>
1104
1105    <para>
1106     Be certain that your backup includes all of the files under
1107     the database cluster directory (e.g., <filename>/usr/local/pgsql/data</filename>).
1108     If you are using tablespaces that do not reside underneath this directory,
1109     be careful to include them as well (and be sure that your backup
1110     archives symbolic links as links, otherwise the restore will corrupt
1111     your tablespaces).
1112    </para>
1113
1114    <para>
1115     You should, however, omit from the backup the files within the
1116     cluster's <filename>pg_wal/</filename> subdirectory.  This
1117     slight adjustment is worthwhile because it reduces the risk
1118     of mistakes when restoring.  This is easy to arrange if
1119     <filename>pg_wal/</filename> is a symbolic link pointing to someplace outside
1120     the cluster directory, which is a common setup anyway for performance
1121     reasons.  You might also want to exclude <filename>postmaster.pid</filename>
1122     and <filename>postmaster.opts</filename>, which record information
1123     about the running <application>postmaster</application>, not about the
1124     <application>postmaster</application> which will eventually use this backup.
1125     (These files can confuse <application>pg_ctl</application>.)
1126    </para>
1127
1128    <para>
1129     It is often a good idea to also omit from the backup the files
1130     within the cluster's <filename>pg_replslot/</filename> directory, so that
1131     replication slots that exist on the master do not become part of the
1132     backup.  Otherwise, the subsequent use of the backup to create a standby
1133     may result in indefinite retention of WAL files on the standby, and
1134     possibly bloat on the master if hot standby feedback is enabled, because
1135     the clients that are using those replication slots will still be connecting
1136     to and updating the slots on the master, not the standby.  Even if the
1137     backup is only intended for use in creating a new master, copying the
1138     replication slots isn't expected to be particularly useful, since the
1139     contents of those slots will likely be badly out of date by the time
1140     the new master comes on line.
1141    </para>
1142
1143    <para>
1144     The contents of the directories <filename>pg_dynshmem/</filename>,
1145     <filename>pg_notify/</filename>, <filename>pg_serial/</filename>,
1146     <filename>pg_snapshots/</filename>, <filename>pg_stat_tmp/</filename>,
1147     and <filename>pg_subtrans/</filename> (but not the directories themselves) can be
1148     omitted from the backup as they will be initialized on postmaster startup.
1149     If <xref linkend="guc-stats-temp-directory"/> is set and is under the data
1150     directory then the contents of that directory can also be omitted.
1151    </para>
1152
1153    <para>
1154     Any file or directory beginning with <filename>pgsql_tmp</filename> can be
1155     omitted from the backup.  These files are removed on postmaster start and
1156     the directories will be recreated as needed.
1157    </para>
1158
1159    <para>
1160     <filename>pg_internal.init</filename> files can be omitted from the
1161     backup whenever a file of that name is found.  These files contain
1162     relation cache data that is always rebuilt when recovering.
1163    </para>
1164
1165    <para>
1166     The backup label
1167     file includes the label string you gave to <function>pg_start_backup</function>,
1168     as well as the time at which <function>pg_start_backup</function> was run, and
1169     the name of the starting WAL file.  In case of confusion it is therefore
1170     possible to look inside a backup file and determine exactly which
1171     backup session the dump file came from.  The tablespace map file includes
1172     the symbolic link names as they exist in the directory
1173     <filename>pg_tblspc/</filename> and the full path of each symbolic link.
1174     These files are not merely for your information; their presence and
1175     contents are critical to the proper operation of the system's recovery
1176     process.
1177    </para>
1178
1179    <para>
1180     It is also possible to make a backup while the server is
1181     stopped.  In this case, you obviously cannot use
1182     <function>pg_start_backup</function> or <function>pg_stop_backup</function>, and
1183     you will therefore be left to your own devices to keep track of which
1184     backup is which and how far back the associated WAL files go.
1185     It is generally better to follow the continuous archiving procedure above.
1186    </para>
1187    </sect3>
1188   </sect2>
1189
1190   <sect2 id="backup-pitr-recovery">
1191    <title>Recovering Using a Continuous Archive Backup</title>
1192
1193    <para>
1194     Okay, the worst has happened and you need to recover from your backup.
1195     Here is the procedure:
1196   <orderedlist>
1197    <listitem>
1198     <para>
1199      Stop the server, if it's running.
1200     </para>
1201    </listitem>
1202    <listitem>
1203     <para>
1204      If you have the space to do so,
1205      copy the whole cluster data directory and any tablespaces to a temporary
1206      location in case you need them later. Note that this precaution will
1207      require that you have enough free space on your system to hold two
1208      copies of your existing database. If you do not have enough space,
1209      you should at least save the contents of the cluster's <filename>pg_wal</filename>
1210      subdirectory, as it might contain logs which
1211      were not archived before the system went down.
1212     </para>
1213    </listitem>
1214    <listitem>
1215     <para>
1216      Remove all existing files and subdirectories under the cluster data
1217      directory and under the root directories of any tablespaces you are using.
1218     </para>
1219    </listitem>
1220    <listitem>
1221     <para>
1222      Restore the database files from your file system backup.  Be sure that they
1223      are restored with the right ownership (the database system user, not
1224      <literal>root</literal>!) and with the right permissions.  If you are using
1225      tablespaces,
1226      you should verify that the symbolic links in <filename>pg_tblspc/</filename>
1227      were correctly restored.
1228     </para>
1229    </listitem>
1230    <listitem>
1231     <para>
1232      Remove any files present in <filename>pg_wal/</filename>; these came from the
1233      file system backup and are therefore probably obsolete rather than current.
1234      If you didn't archive <filename>pg_wal/</filename> at all, then recreate
1235      it with proper permissions,
1236      being careful to ensure that you re-establish it as a symbolic link
1237      if you had it set up that way before.
1238     </para>
1239    </listitem>
1240    <listitem>
1241     <para>
1242      If you have unarchived WAL segment files that you saved in step 2,
1243      copy them into <filename>pg_wal/</filename>.  (It is best to copy them,
1244      not move them, so you still have the unmodified files if a
1245      problem occurs and you have to start over.)
1246     </para>
1247    </listitem>
1248    <listitem>
1249     <para>
1250      Set recovery configuration settings in
1251      <filename>postgresql.conf</filename> (see <xref
1252      linkend="runtime-config-wal-archive-recovery"/>) and create a file
1253      <filename>recovery.signal</filename> in the cluster
1254      data directory. You might
1255      also want to temporarily modify <filename>pg_hba.conf</filename> to prevent
1256      ordinary users from connecting until you are sure the recovery was successful.
1257     </para>
1258    </listitem>
1259    <listitem>
1260     <para>
1261      Start the server.  The server will go into recovery mode and
1262      proceed to read through the archived WAL files it needs.  Should the
1263      recovery be terminated because of an external error, the server can
1264      simply be restarted and it will continue recovery.  Upon completion
1265      of the recovery process, the server will remove
1266      <filename>recovery.signal</filename> (to prevent
1267      accidentally re-entering recovery mode later) and then
1268      commence normal database operations.
1269     </para>
1270    </listitem>
1271    <listitem>
1272     <para>
1273      Inspect the contents of the database to ensure you have recovered to
1274      the desired state.  If not, return to step 1.  If all is well,
1275      allow your users to connect by restoring <filename>pg_hba.conf</filename> to normal.
1276     </para>
1277    </listitem>
1278   </orderedlist>
1279    </para>
1280
1281    <para>
1282     The key part of all this is to set up a recovery configuration that
1283     describes how you want to recover and how far the recovery should
1284     run.  The one thing that you absolutely must specify is the <varname>restore_command</varname>,
1285     which tells <productname>PostgreSQL</productname> how to retrieve archived
1286     WAL file segments.  Like the <varname>archive_command</varname>, this is
1287     a shell command string.  It can contain <literal>%f</literal>, which is
1288     replaced by the name of the desired log file, and <literal>%p</literal>,
1289     which is replaced by the path name to copy the log file to.
1290     (The path name is relative to the current working directory,
1291     i.e., the cluster's data directory.)
1292     Write <literal>%%</literal> if you need to embed an actual <literal>%</literal>
1293     character in the command.  The simplest useful command is
1294     something like:
1295 <programlisting>
1296 restore_command = 'cp /mnt/server/archivedir/%f %p'
1297 </programlisting>
1298     which will copy previously archived WAL segments from the directory
1299     <filename>/mnt/server/archivedir</filename>.  Of course, you can use something
1300     much more complicated, perhaps even a shell script that requests the
1301     operator to mount an appropriate tape.
1302    </para>
1303
1304    <para>
1305     It is important that the command return nonzero exit status on failure.
1306     The command <emphasis>will</emphasis> be called requesting files that are not
1307     present in the archive; it must return nonzero when so asked.  This is not
1308     an error condition.  An exception is that if the command was terminated by
1309     a signal (other than <systemitem>SIGTERM</systemitem>, which is used as
1310     part of a database server shutdown) or an error by the shell (such as
1311     command not found), then recovery will abort and the server will not start
1312     up.
1313    </para>
1314
1315    <para>
1316     Not all of the requested files will be WAL segment
1317     files; you should also expect requests for files with a suffix of
1318     <literal>.history</literal>. Also be aware that
1319     the base name of the <literal>%p</literal> path will be different from
1320     <literal>%f</literal>; do not expect them to be interchangeable.
1321    </para>
1322
1323    <para>
1324     WAL segments that cannot be found in the archive will be sought in
1325     <filename>pg_wal/</filename>; this allows use of recent un-archived segments.
1326     However, segments that are available from the archive will be used in
1327     preference to files in <filename>pg_wal/</filename>.
1328    </para>
1329
1330    <para>
1331     Normally, recovery will proceed through all available WAL segments,
1332     thereby restoring the database to the current point in time (or as
1333     close as possible given the available WAL segments).  Therefore, a normal
1334     recovery will end with a <quote>file not found</quote> message, the exact text
1335     of the error message depending upon your choice of
1336     <varname>restore_command</varname>.  You may also see an error message
1337     at the start of recovery for a file named something like
1338     <filename>00000001.history</filename>.  This is also normal and does not
1339     indicate a problem in simple recovery situations; see
1340     <xref linkend="backup-timelines"/> for discussion.
1341    </para>
1342
1343    <para>
1344     If you want to recover to some previous point in time (say, right before
1345     the junior DBA dropped your main transaction table), just specify the
1346     required <link linkend="runtime-config-wal-recovery-target">stopping point</link>.  You can specify
1347     the stop point, known as the <quote>recovery target</quote>, either by
1348     date/time, named restore point or by completion of a specific transaction
1349     ID.  As of this writing only the date/time and named restore point options
1350     are very usable, since there are no tools to help you identify with any
1351     accuracy which transaction ID to use.
1352    </para>
1353
1354    <note>
1355      <para>
1356       The stop point must be after the ending time of the base backup, i.e.,
1357       the end time of <function>pg_stop_backup</function>.  You cannot use a base backup
1358       to recover to a time when that backup was in progress.  (To
1359       recover to such a time, you must go back to your previous base backup
1360       and roll forward from there.)
1361      </para>
1362    </note>
1363
1364    <para>
1365     If recovery finds corrupted WAL data, recovery will
1366     halt at that point and the server will not start. In such a case the
1367     recovery process could be re-run from the beginning, specifying a
1368     <quote>recovery target</quote> before the point of corruption so that recovery
1369     can complete normally.
1370     If recovery fails for an external reason, such as a system crash or
1371     if the WAL archive has become inaccessible, then the recovery can simply
1372     be restarted and it will restart almost from where it failed.
1373     Recovery restart works much like checkpointing in normal operation:
1374     the server periodically forces all its state to disk, and then updates
1375     the <filename>pg_control</filename> file to indicate that the already-processed
1376     WAL data need not be scanned again.
1377    </para>
1378
1379   </sect2>
1380
1381   <sect2 id="backup-timelines">
1382    <title>Timelines</title>
1383
1384   <indexterm zone="backup">
1385    <primary>timelines</primary>
1386   </indexterm>
1387
1388    <para>
1389     The ability to restore the database to a previous point in time creates
1390     some complexities that are akin to science-fiction stories about time
1391     travel and parallel universes.  For example, in the original history of the database,
1392     suppose you dropped a critical table at 5:15PM on Tuesday evening, but
1393     didn't realize your mistake until Wednesday noon.
1394     Unfazed, you get out your backup, restore to the point-in-time 5:14PM
1395     Tuesday evening, and are up and running.  In <emphasis>this</emphasis> history of
1396     the database universe, you never dropped the table.  But suppose
1397     you later realize this wasn't such a great idea, and would like
1398     to return to sometime Wednesday morning in the original history.
1399     You won't be able
1400     to if, while your database was up-and-running, it overwrote some of the
1401     WAL segment files that led up to the time you now wish you
1402     could get back to.  Thus, to avoid this, you need to distinguish the series of
1403     WAL records generated after you've done a point-in-time recovery from
1404     those that were generated in the original database history.
1405    </para>
1406
1407    <para>
1408     To deal with this problem, <productname>PostgreSQL</productname> has a notion
1409     of <firstterm>timelines</firstterm>.  Whenever an archive recovery completes,
1410     a new timeline is created to identify the series of WAL records
1411     generated after that recovery.  The timeline
1412     ID number is part of WAL segment file names so a new timeline does
1413     not overwrite the WAL data generated by previous timelines.  It is
1414     in fact possible to archive many different timelines.  While that might
1415     seem like a useless feature, it's often a lifesaver.  Consider the
1416     situation where you aren't quite sure what point-in-time to recover to,
1417     and so have to do several point-in-time recoveries by trial and error
1418     until you find the best place to branch off from the old history.  Without
1419     timelines this process would soon generate an unmanageable mess.  With
1420     timelines, you can recover to <emphasis>any</emphasis> prior state, including
1421     states in timeline branches that you abandoned earlier.
1422    </para>
1423
1424    <para>
1425     Every time a new timeline is created, <productname>PostgreSQL</productname> creates
1426     a <quote>timeline history</quote> file that shows which timeline it branched
1427     off from and when.  These history files are necessary to allow the system
1428     to pick the right WAL segment files when recovering from an archive that
1429     contains multiple timelines.  Therefore, they are archived into the WAL
1430     archive area just like WAL segment files.  The history files are just
1431     small text files, so it's cheap and appropriate to keep them around
1432     indefinitely (unlike the segment files which are large).  You can, if
1433     you like, add comments to a history file to record your own notes about
1434     how and why this particular timeline was created.  Such comments will be
1435     especially valuable when you have a thicket of different timelines as
1436     a result of experimentation.
1437    </para>
1438
1439    <para>
1440     The default behavior of recovery is to recover along the same timeline
1441     that was current when the base backup was taken.  If you wish to recover
1442     into some child timeline (that is, you want to return to some state that
1443     was itself generated after a recovery attempt), you need to specify the
1444     target timeline ID in <xref linkend="guc-recovery-target-timeline"/>. You cannot recover into
1445     timelines that branched off earlier than the base backup.
1446    </para>
1447   </sect2>
1448
1449   <sect2 id="backup-tips">
1450    <title>Tips and Examples</title>
1451
1452    <para>
1453     Some tips for configuring continuous archiving are given here.
1454    </para>
1455
1456     <sect3 id="backup-standalone">
1457      <title>Standalone Hot Backups</title>
1458
1459      <para>
1460       It is possible to use <productname>PostgreSQL</productname>'s backup facilities to
1461       produce standalone hot backups. These are backups that cannot be used
1462       for point-in-time recovery, yet are typically much faster to backup and
1463       restore than <application>pg_dump</application> dumps.  (They are also much larger
1464       than <application>pg_dump</application> dumps, so in some cases the speed advantage
1465       might be negated.)
1466      </para>
1467
1468      <para>
1469       As with base backups, the easiest way to produce a standalone
1470       hot backup is to use the <xref linkend="app-pgbasebackup"/>
1471       tool. If you include the <literal>-X</literal> parameter when calling
1472       it, all the write-ahead log required to use the backup will be
1473       included in the backup automatically, and no special action is
1474       required to restore the backup.
1475      </para>
1476
1477      <para>
1478       If more flexibility in copying the backup files is needed, a lower
1479       level process can be used for standalone hot backups as well.
1480       To prepare for low level standalone hot backups, make sure
1481       <varname>wal_level</varname> is set to
1482       <literal>replica</literal> or higher, <varname>archive_mode</varname> to
1483       <literal>on</literal>, and set up an <varname>archive_command</varname> that performs
1484       archiving only when a <emphasis>switch file</emphasis> exists.  For example:
1485 <programlisting>
1486 archive_command = 'test ! -f /var/lib/pgsql/backup_in_progress || (test ! -f /var/lib/pgsql/archive/%f &amp;&amp; cp %p /var/lib/pgsql/archive/%f)'
1487 </programlisting>
1488       This command will perform archiving when
1489       <filename>/var/lib/pgsql/backup_in_progress</filename> exists, and otherwise
1490       silently return zero exit status (allowing <productname>PostgreSQL</productname>
1491       to recycle the unwanted WAL file).
1492      </para>
1493
1494      <para>
1495       With this preparation, a backup can be taken using a script like the
1496       following:
1497 <programlisting>
1498 touch /var/lib/pgsql/backup_in_progress
1499 psql -c "select pg_start_backup('hot_backup');"
1500 tar -cf /var/lib/pgsql/backup.tar /var/lib/pgsql/data/
1501 psql -c "select pg_stop_backup();"
1502 rm /var/lib/pgsql/backup_in_progress
1503 tar -rf /var/lib/pgsql/backup.tar /var/lib/pgsql/archive/
1504 </programlisting>
1505       The switch file <filename>/var/lib/pgsql/backup_in_progress</filename> is
1506       created first, enabling archiving of completed WAL files to occur.
1507       After the backup the switch file is removed. Archived WAL files are
1508       then added to the backup so that both base backup and all required
1509       WAL files are part of the same <application>tar</application> file.
1510       Please remember to add error handling to your backup scripts.
1511      </para>
1512
1513     </sect3>
1514
1515     <sect3 id="compressed-archive-logs">
1516      <title>Compressed Archive Logs</title>
1517
1518      <para>
1519       If archive storage size is a concern, you can use
1520       <application>gzip</application> to compress the archive files:
1521 <programlisting>
1522 archive_command = 'gzip &lt; %p &gt; /var/lib/pgsql/archive/%f'
1523 </programlisting>
1524       You will then need to use <application>gunzip</application> during recovery:
1525 <programlisting>
1526 restore_command = 'gunzip &lt; /mnt/server/archivedir/%f &gt; %p'
1527 </programlisting>
1528      </para>
1529     </sect3>
1530
1531     <sect3 id="backup-scripts">
1532      <title><varname>archive_command</varname> Scripts</title>
1533
1534      <para>
1535       Many people choose to use scripts to define their
1536       <varname>archive_command</varname>, so that their
1537       <filename>postgresql.conf</filename> entry looks very simple:
1538 <programlisting>
1539 archive_command = 'local_backup_script.sh "%p" "%f"'
1540 </programlisting>
1541       Using a separate script file is advisable any time you want to use
1542       more than a single command in the archiving process.
1543       This allows all complexity to be managed within the script, which
1544       can be written in a popular scripting language such as
1545       <application>bash</application> or <application>perl</application>.
1546      </para>
1547
1548      <para>
1549       Examples of requirements that might be solved within a script include:
1550       <itemizedlist>
1551        <listitem>
1552         <para>
1553          Copying data to secure off-site data storage
1554         </para>
1555        </listitem>
1556        <listitem>
1557         <para>
1558          Batching WAL files so that they are transferred every three hours,
1559          rather than one at a time
1560         </para>
1561        </listitem>
1562        <listitem>
1563         <para>
1564          Interfacing with other backup and recovery software
1565         </para>
1566        </listitem>
1567        <listitem>
1568         <para>
1569          Interfacing with monitoring software to report errors
1570         </para>
1571        </listitem>
1572       </itemizedlist>
1573      </para>
1574
1575      <tip>
1576       <para>
1577        When using an <varname>archive_command</varname> script, it's desirable
1578        to enable <xref linkend="guc-logging-collector"/>.
1579        Any messages written to <systemitem>stderr</systemitem> from the script will then
1580        appear in the database server log, allowing complex configurations to
1581        be diagnosed easily if they fail.
1582       </para>
1583      </tip>
1584     </sect3>
1585   </sect2>
1586
1587   <sect2 id="continuous-archiving-caveats">
1588    <title>Caveats</title>
1589
1590    <para>
1591     At this writing, there are several limitations of the continuous archiving
1592     technique.  These will probably be fixed in future releases:
1593
1594   <itemizedlist>
1595    <listitem>
1596     <para>
1597      If a <xref linkend="sql-createdatabase"/>
1598      command is executed while a base backup is being taken, and then
1599      the template database that the <command>CREATE DATABASE</command> copied
1600      is modified while the base backup is still in progress, it is
1601      possible that recovery will cause those modifications to be
1602      propagated into the created database as well.  This is of course
1603      undesirable.  To avoid this risk, it is best not to modify any
1604      template databases while taking a base backup.
1605     </para>
1606    </listitem>
1607
1608    <listitem>
1609     <para>
1610      <xref linkend="sql-createtablespace"/>
1611      commands are WAL-logged with the literal absolute path, and will
1612      therefore be replayed as tablespace creations with the same
1613      absolute path.  This might be undesirable if the log is being
1614      replayed on a different machine.  It can be dangerous even if the
1615      log is being replayed on the same machine, but into a new data
1616      directory: the replay will still overwrite the contents of the
1617      original tablespace.  To avoid potential gotchas of this sort,
1618      the best practice is to take a new base backup after creating or
1619      dropping tablespaces.
1620     </para>
1621    </listitem>
1622   </itemizedlist>
1623    </para>
1624
1625    <para>
1626     It should also be noted that the default <acronym>WAL</acronym>
1627     format is fairly bulky since it includes many disk page snapshots.
1628     These page snapshots are designed to support crash recovery, since
1629     we might need to fix partially-written disk pages.  Depending on
1630     your system hardware and software, the risk of partial writes might
1631     be small enough to ignore, in which case you can significantly
1632     reduce the total volume of archived logs by turning off page
1633     snapshots using the <xref linkend="guc-full-page-writes"/>
1634     parameter.  (Read the notes and warnings in <xref linkend="wal"/>
1635     before you do so.)  Turning off page snapshots does not prevent
1636     use of the logs for PITR operations.  An area for future
1637     development is to compress archived WAL data by removing
1638     unnecessary page copies even when <varname>full_page_writes</varname> is
1639     on.  In the meantime, administrators might wish to reduce the number
1640     of page snapshots included in WAL by increasing the checkpoint
1641     interval parameters as much as feasible.
1642    </para>
1643   </sect2>
1644  </sect1>
1645
1646 </chapter>