1 <!-- doc/src/sgml/backup.sgml -->
4 <title>Backup and Restore</title>
6 <indexterm zone="backup"><primary>backup</></>
9 As with everything that contains valuable data, <productname>PostgreSQL</>
10 databases should be backed up regularly. While the procedure is
11 essentially simple, it is important to have a clear understanding of
12 the underlying techniques and assumptions.
16 There are three fundamentally different approaches to backing up
17 <productname>PostgreSQL</> data:
19 <listitem><para><acronym>SQL</> dump</para></listitem>
20 <listitem><para>File system level backup</para></listitem>
21 <listitem><para>Continuous archiving</para></listitem>
23 Each has its own strengths and weaknesses; each is discussed in turn
24 in the following sections.
27 <sect1 id="backup-dump">
28 <title><acronym>SQL</> Dump</title>
31 The idea behind this dump method is to generate a text file with SQL
32 commands that, when fed back to the server, will recreate the
33 database in the same state as it was at the time of the dump.
34 <productname>PostgreSQL</> provides the utility program
35 <xref linkend="app-pgdump"> for this purpose. The basic usage of this
36 command is:
38 pg_dump <replaceable class="parameter">dbname</replaceable> > <replaceable class="parameter">outfile</replaceable>
40 As you see, <application>pg_dump</> writes its result to the
41 standard output. We will see below how this can be useful.
45 <application>pg_dump</> is a regular <productname>PostgreSQL</>
46 client application (albeit a particularly clever one). This means
47 that you can perform this backup procedure from any remote host that has
48 access to the database. But remember that <application>pg_dump</>
49 does not operate with special permissions. In particular, it must
50 have read access to all tables that you want to back up, so in
51 practice you almost always have to run it as a database superuser.
55 To specify which database server <application>pg_dump</> should
56 contact, use the command line options <option>-h
57 <replaceable>host</></> and <option>-p <replaceable>port</></>. The
58 default host is the local host or whatever your
59 <envar>PGHOST</envar> environment variable specifies. Similarly,
60 the default port is indicated by the <envar>PGPORT</envar>
61 environment variable or, failing that, by the compiled-in default.
62 (Conveniently, the server will normally have the same compiled-in
63 default.)
67 Like any other <productname>PostgreSQL</> client application,
68 <application>pg_dump</> will by default connect with the database
69 user name that is equal to the current operating system user name. To override
70 this, either specify the <option>-U</option> option or set the
71 environment variable <envar>PGUSER</envar>. Remember that
72 <application>pg_dump</> connections are subject to the normal
73 client authentication mechanisms (which are described in <xref
74 linkend="client-authentication">).
78 An important advantage of <application>pg_dump</> over the other backup
79 methods described later is that <application>pg_dump</>'s output can
80 generally be re-loaded into newer versions of <productname>PostgreSQL</>,
81 whereas file-level backups and continuous archiving are both extremely
82 server-version-specific. <application>pg_dump</> is also the only method
83 that will work when transferring a database to a different machine
84 architecture, such as going from a 32-bit to a 64-bit server.
88 Dumps created by <application>pg_dump</> are internally consistent,
89 meaning that the dump represents a snapshot of the database at the time
90 <application>pg_dump</> began running. <application>pg_dump</> does not
91 block other operations on the database while it is working.
92 (Exceptions are those operations that need to operate with an
93 exclusive lock, such as most forms of <command>ALTER TABLE</command>.)
98 If your database schema relies on OIDs (for instance, as foreign
99 keys) you must instruct <application>pg_dump</> to dump the OIDs
100 as well. To do this, use the <option>-o</option> command-line
101 option.
105 <sect2 id="backup-dump-restore">
106 <title>Restoring the dump</title>
109 The text files created by <application>pg_dump</> are intended to
110 be read in by the <application>psql</application> program. The
111 general command form to restore a dump is
113 psql <replaceable class="parameter">dbname</replaceable> < <replaceable class="parameter">infile</replaceable>
115 where <replaceable class="parameter">infile</replaceable> is the
116 file output by the <application>pg_dump</> command. The database <replaceable
117 class="parameter">dbname</replaceable> will not be created by this
118 command, so you must create it yourself from <literal>template0</>
119 before executing <application>psql</> (e.g., with
120 <literal>createdb -T template0 <replaceable
121 class="parameter">dbname</></literal>). <application>psql</>
122 supports options similar to <application>pg_dump</> for specifying
123 the database server to connect to and the user name to use. See
124 the <xref linkend="app-psql"> reference page for more information.
128 Before restoring an SQL dump, all the users who own objects or were
129 granted permissions on objects in the dumped database must already
130 exist. If they do not, the restore will fail to recreate the
131 objects with the original ownership and/or permissions.
132 (Sometimes this is what you want, but usually it is not.)
136 By default, the <application>psql</> script will continue to
137 execute after an SQL error is encountered. You might wish to run
138 <application>psql</application> with
139 the <literal>ON_ERROR_STOP</> variable set to alter that
140 behavior and have <application>psql</application> exit with an
141 exit status of 3 if an SQL error occurs:
143 psql --set ON_ERROR_STOP=on dbname < infile
145 Either way, you will only have a partially restored database.
146 Alternatively, you can specify that the whole dump should be
147 restored as a single transaction, so the restore is either fully
148 completed or fully rolled back. This mode can be specified by
149 passing the <option>-1</> or <option>--single-transaction</>
150 command-line options to <application>psql</>. When using this
151 mode, be aware that even a minor error can cause rollback of a
152 restore that has already run for many hours. However, that might
153 still be preferable to manually cleaning up a complex database
154 after a partially restored dump.
158 The ability of <application>pg_dump</> and <application>psql</> to
159 write to or read from pipes makes it possible to dump a database
160 directly from one server to another, for example:
162 pg_dump -h <replaceable>host1</> <replaceable>dbname</> | psql -h <replaceable>host2</> <replaceable>dbname</>
168 The dumps produced by <application>pg_dump</> are relative to
169 <literal>template0</>. This means that any languages, procedures,
170 etc. added via <literal>template1</> will also be dumped by
171 <application>pg_dump</>. As a result, when restoring, if you are
172 using a customized <literal>template1</>, you must create the
173 empty database from <literal>template0</>, as in the example
174 above.
179 After restoring a backup, it is wise to run <xref
180 linkend="sql-analyze"> on each
181 database so the query optimizer has useful statistics;
182 see <xref linkend="vacuum-for-statistics">
183 and <xref linkend="autovacuum"> for more information.
184 For more advice on how to load large amounts of data
185 into <productname>PostgreSQL</> efficiently, refer to <xref
186 linkend="populate">.
190 <sect2 id="backup-dump-all">
191 <title>Using <application>pg_dumpall</></title>
194 <application>pg_dump</> dumps only a single database at a time,
195 and it does not dump information about roles or tablespaces
196 (because those are cluster-wide rather than per-database).
197 To support convenient dumping of the entire contents of a database
198 cluster, the <xref linkend="app-pg-dumpall"> program is provided.
199 <application>pg_dumpall</> backs up each database in a given
200 cluster, and also preserves cluster-wide data such as role and
201 tablespace definitions. The basic usage of this command is:
203 pg_dumpall > <replaceable>outfile</>
205 The resulting dump can be restored with <application>psql</>:
207 psql -f <replaceable class="parameter">infile</replaceable> postgres
209 (Actually, you can specify any existing database name to start from,
210 but if you are loading into an empty cluster then <literal>postgres</>
211 should usually be used.) It is always necessary to have
212 database superuser access when restoring a <application>pg_dumpall</>
213 dump, as that is required to restore the role and tablespace information.
214 If you use tablespaces, make sure that the tablespace paths in the
215 dump are appropriate for the new installation.
219 <application>pg_dumpall</> works by emitting commands to re-create
220 roles, tablespaces, and empty databases, then invoking
221 <application>pg_dump</> for each database. This means that while
222 each database will be internally consistent, the snapshots of
223 different databases might not be exactly in-sync.
227 <sect2 id="backup-dump-large">
228 <title>Handling large databases</title>
231 Some operating systems have maximum file size limits that cause
232 problems when creating large <application>pg_dump</> output files.
233 Fortunately, <application>pg_dump</> can write to the standard
234 output, so you can use standard Unix tools to work around this
235 potential problem. There are several possible methods:
239 <title>Use compressed dumps.</title>
241 You can use your favorite compression program, for example
242 <application>gzip</application>:
245 pg_dump <replaceable class="parameter">dbname</replaceable> | gzip > <replaceable class="parameter">filename</replaceable>.gz
248 Reload with:
251 gunzip -c <replaceable class="parameter">filename</replaceable>.gz | psql <replaceable class="parameter">dbname</replaceable>
254 or:
257 cat <replaceable class="parameter">filename</replaceable>.gz | gunzip | psql <replaceable class="parameter">dbname</replaceable>
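The shape of these compress-while-dumping pipelines can be rehearsed without a running server. In the sketch below, a small SQL text file stands in for the output of <application>pg_dump</>; all file names are illustrative.

```shell
# A plain file stands in for the output of "pg_dump dbname";
# no PostgreSQL server is assumed here.
cd "$(mktemp -d)"
printf 'CREATE TABLE t (id int);\nINSERT INTO t VALUES (1);\n' > dump.sql

# Dump side: compress while writing, as in "pg_dump dbname | gzip > filename.gz".
gzip -c dump.sql > dump.sql.gz

# Restore side: "gunzip -c" decompresses to standard output, which psql
# would read; here it is captured to a file to show the round trip is lossless.
gunzip -c dump.sql.gz > restored.sql
cmp -s dump.sql restored.sql && echo "round trip OK"
```

The same pattern scales to a real dump of any size, since nothing in the pipeline ever holds the whole uncompressed stream on disk.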
263 <title>Use <command>split</>.</title>
265 The <command>split</command> command
266 allows you to split the output into smaller files that are
267 acceptable in size to the underlying file system. For example, to
268 make chunks of 1 megabyte:
271 pg_dump <replaceable class="parameter">dbname</replaceable> | split -b 1m - <replaceable class="parameter">filename</replaceable>
274 Reload with:
277 cat <replaceable class="parameter">filename</replaceable>* | psql <replaceable class="parameter">dbname</replaceable>
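The split-and-reassemble cycle can likewise be tried out on an ordinary file; in this sketch a generated file stands in for the dump, and the chunk size is smaller than the 1 megabyte used above so the example stays quick.

```shell
# A generated file stands in for "pg_dump dbname" output; no server needed.
cd "$(mktemp -d)"
seq 1 20000 > dump.sql

# Dump side: cut the stream into 100 kB pieces named chunk_aa, chunk_ab, ...
split -b 100k dump.sql chunk_

# Restore side: concatenating the chunks in name order recreates the exact
# byte stream that psql would read during a real restore.
cat chunk_* > reassembled.sql
cmp -s dump.sql reassembled.sql && echo "chunks reassemble losslessly"
```

Because <command>split</> names its output files in sorted order, a simple shell glob is sufficient to restore the original ordering.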
283 <title>Use <application>pg_dump</>'s custom dump format.</title>
285 If <productname>PostgreSQL</productname> was built on a system with the
286 <application>zlib</> compression library installed, the custom dump
287 format will compress data as it writes it to the output file. This will
288 produce dump file sizes similar to using <command>gzip</command>, but it
289 has the added advantage that tables can be restored selectively. The
290 following command dumps a database using the custom dump format:
293 pg_dump -Fc <replaceable class="parameter">dbname</replaceable> > <replaceable class="parameter">filename</replaceable>
296 A custom-format dump is not a script for <application>psql</>, but
297 instead must be restored with <application>pg_restore</>, for example:
300 pg_restore -d <replaceable class="parameter">dbname</replaceable> <replaceable class="parameter">filename</replaceable>
303 See the <xref linkend="app-pgdump"> and <xref
304 linkend="app-pgrestore"> reference pages for details.
309 For very large databases, you might need to combine <command>split</>
310 with one of the other two approaches.
316 <sect1 id="backup-file">
317 <title>File System Level Backup</title>
320 An alternative backup strategy is to directly copy the files that
321 <productname>PostgreSQL</> uses to store the data in the database;
322 <xref linkend="creating-cluster"> explains where these files
323 are located. You can use whatever method you prefer
324 for doing file system backups; for example:
327 tar -cf backup.tar /usr/local/pgsql/data
332 There are two restrictions, however, which make this method
333 impractical, or at least inferior to the <application>pg_dump</>
334 method:
339 The database server <emphasis>must</> be shut down in order to
340 get a usable backup. Half-way measures such as disallowing all
341 connections will <emphasis>not</emphasis> work
342 (in part because <command>tar</command> and similar tools do not take
343 an atomic snapshot of the state of the file system,
344 but also because of internal buffering within the server).
345 Information about stopping the server can be found in
346 <xref linkend="server-shutdown">. Needless to say, you
347 also need to shut down the server before restoring the data.
353 If you have dug into the details of the file system layout of the
354 database, you might be tempted to try to back up or restore only certain
355 individual tables or databases from their respective files or
356 directories. This will <emphasis>not</> work because the
357 information contained in these files is not usable without
358 the commit log files,
359 <filename>pg_clog/*</filename>, which contain the commit status of
360 all transactions. A table file is only usable with this
361 information. Of course it is also impossible to restore only a
362 table and the associated <filename>pg_clog</filename> data
363 because that would render all other tables in the database
364 cluster useless. So file system backups only work for complete
365 backup and restoration of an entire database cluster.
372 An alternative file-system backup approach is to make a
373 <quote>consistent snapshot</quote> of the data directory, if the
374 file system supports that functionality (and you are willing to
375 trust that it is implemented correctly). The typical procedure is
376 to make a <quote>frozen snapshot</> of the volume containing the
377 database, then copy the whole data directory (not just parts, see
378 above) from the snapshot to a backup device, then release the frozen
379 snapshot. This will work even while the database server is running.
380 However, a backup created in this way saves
381 the database files in a state as if the database server was not
382 properly shut down; therefore, when you start the database server
383 on the backed-up data, it will think the previous server instance
384 crashed and will replay the WAL log. This is not a problem; just
385 be aware of it (and be sure to include the WAL files in your backup).
386 You can perform a <command>CHECKPOINT</command> before taking the
387 snapshot to reduce recovery time.
391 If your database is spread across multiple file systems, there might not
392 be any way to obtain exactly-simultaneous frozen snapshots of all
393 the volumes. For example, if your data files and WAL log are on different
394 disks, or if tablespaces are on different file systems, it might
395 not be possible to use snapshot backup because the snapshots
396 <emphasis>must</> be simultaneous.
397 Read your file system documentation very carefully before trusting
398 the consistent-snapshot technique in such situations.
402 If simultaneous snapshots are not possible, one option is to shut down
403 the database server long enough to establish all the frozen snapshots.
404 Another option is to perform a continuous archiving base backup (<xref
405 linkend="backup-base-backup">) because such backups are immune to file
406 system changes during the backup. This requires enabling continuous
407 archiving just during the backup process; restore is done using
408 continuous archive recovery (<xref linkend="backup-pitr-recovery">).
412 Another option is to use <application>rsync</> to perform a file
413 system backup. This is done by first running <application>rsync</>
414 while the database server is running, then shutting down the database
415 server just long enough to do a second <application>rsync</>. The
416 second <application>rsync</> will be much quicker than the first,
417 because it has relatively little data to transfer, and the end result
418 will be consistent because the server was down. This method
419 allows a file system backup to be performed with minimal downtime.
423 Note that a file system backup will typically be larger
424 than an SQL dump. (<application>pg_dump</application> does not need to dump
425 the contents of indexes, for example, just the commands to recreate
426 them.) However, taking a file system backup might be faster.
430 <sect1 id="continuous-archiving">
431 <title>Continuous Archiving and Point-In-Time Recovery (PITR)</title>
433 <indexterm zone="backup">
434 <primary>continuous archiving</primary>
437 <indexterm zone="backup">
438 <primary>point-in-time recovery</primary>
441 <indexterm zone="backup">
442 <primary>PITR</primary>
446 At all times, <productname>PostgreSQL</> maintains a
447 <firstterm>write ahead log</> (WAL) in the <filename>pg_xlog/</>
448 subdirectory of the cluster's data directory. The log records
449 every change made to the database's data files. This log exists
450 primarily for crash-safety purposes: if the system crashes, the
451 database can be restored to consistency by <quote>replaying</> the
452 log entries made since the last checkpoint. However, the existence
453 of the log makes it possible to use a third strategy for backing up
454 databases: we can combine a file-system-level backup with backup of
455 the WAL files. If recovery is needed, we restore the file system backup and
456 then replay from the backed-up WAL files to bring the system to a
457 current state. This approach is more complex to administer than
458 either of the previous approaches, but it has some significant
459 advantages:
463 We do not need a perfectly consistent file system backup as the starting point.
464 Any internal inconsistency in the backup will be corrected by log
465 replay (this is not significantly different from what happens during
466 crash recovery). So we do not need a file system snapshot capability,
467 just <application>tar</> or a similar archiving tool.
472 Since we can combine an indefinitely long sequence of WAL files
473 for replay, continuous backup can be achieved simply by continuing to archive
474 the WAL files. This is particularly valuable for large databases, where
475 it might not be convenient to take a full backup frequently.
480 It is not necessary to replay the WAL entries all the
481 way to the end. We could stop the replay at any point and have a
482 consistent snapshot of the database as it was at that time. Thus,
483 this technique supports <firstterm>point-in-time recovery</>: it is
484 possible to restore the database to its state at any time since your base
485 backup was taken.
490 If we continuously feed the series of WAL files to another
491 machine that has been loaded with the same base backup file, we
492 have a <firstterm>warm standby</> system: at any point we can bring up
493 the second machine and it will have a nearly-current copy of the
494 database.
502 <application>pg_dump</application> and
503 <application>pg_dumpall</application> do not produce file-system-level
504 backups and cannot be used as part of a continuous-archiving solution.
505 Such dumps are <emphasis>logical</> and do not contain enough
506 information to be used by WAL replay.
511 As with the plain file-system-backup technique, this method can only
512 support restoration of an entire database cluster, not a subset.
513 Also, it requires a lot of archival storage: the base backup might be bulky,
514 and a busy system will generate many megabytes of WAL traffic that
515 have to be archived. Still, it is the preferred backup technique in
516 many situations where high reliability is needed.
520 To recover successfully using continuous archiving (also called
521 <quote>online backup</> by many database vendors), you need a continuous
522 sequence of archived WAL files that extends back at least as far as the
523 start time of your backup. So to get started, you should set up and test
524 your procedure for archiving WAL files <emphasis>before</> you take your
525 first base backup. Accordingly, we first discuss the mechanics of
526 archiving WAL files.
529 <sect2 id="backup-archiving-wal">
530 <title>Setting up WAL archiving</title>
533 In an abstract sense, a running <productname>PostgreSQL</> system
534 produces an indefinitely long sequence of WAL records. The system
535 physically divides this sequence into WAL <firstterm>segment
536 files</>, which are normally 16MB apiece (although the segment size
537 can be altered when building <productname>PostgreSQL</>). The segment
538 files are given numeric names that reflect their position in the
539 abstract WAL sequence. When not using WAL archiving, the system
540 normally creates just a few segment files and then
541 <quote>recycles</> them by renaming no-longer-needed segment files
542 to higher segment numbers. It's assumed that segment files whose
543 contents precede the checkpoint-before-last are no longer of
544 interest and can be recycled.
548 When archiving WAL data, we need to capture the contents of each segment
549 file once it is filled, and save that data somewhere before the segment
550 file is recycled for reuse. Depending on the application and the
551 available hardware, there could be many different ways of <quote>saving
552 the data somewhere</>: we could copy the segment files to an NFS-mounted
553 directory on another machine, write them onto a tape drive (ensuring that
554 you have a way of identifying the original name of each file), or batch
555 them together and burn them onto CDs, or something else entirely. To
556 provide the database administrator with flexibility,
557 <productname>PostgreSQL</> tries not to make any assumptions about how
558 the archiving will be done. Instead, <productname>PostgreSQL</> lets
559 the administrator specify a shell command to be executed to copy a
560 completed segment file to wherever it needs to go. The command could be
561 as simple as a <literal>cp</>, or it could invoke a complex shell
562 script — it's all up to you.
566 To enable WAL archiving, set the <xref linkend="guc-wal-level">
567 configuration parameter to <literal>archive</> (or <literal>hot_standby</>),
568 <xref linkend="guc-archive-mode"> to <literal>on</>,
569 and specify the shell command to use in the <xref
570 linkend="guc-archive-command"> configuration parameter. In practice
571 these settings will always be placed in the
572 <filename>postgresql.conf</filename> file.
573 In <varname>archive_command</>,
574 <literal>%p</> is replaced by the path name of the file to
575 archive, while <literal>%f</> is replaced by only the file name.
576 (The path name is relative to the current working directory,
577 i.e., the cluster's data directory.)
578 Use <literal>%%</> if you need to embed an actual <literal>%</>
579 character in the command. The simplest useful command is something
580 like:
582 archive_command = 'cp -i %p /mnt/server/archivedir/%f </dev/null' # Unix
583 archive_command = 'copy "%p" "C:\\server\\archivedir\\%f"' # Windows
585 which will copy archivable WAL segments to the directory
586 <filename>/mnt/server/archivedir</>. (This is an example, not a
587 recommendation, and might not work on all platforms.) After the
588 <literal>%p</> and <literal>%f</> parameters have been replaced,
589 the actual command executed might look like this:
591 cp -i pg_xlog/00000001000000A900000065 /mnt/server/archivedir/00000001000000A900000065 </dev/null
593 A similar command will be generated for each new file to be archived.
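Putting the pieces together, a minimal archiving setup in <filename>postgresql.conf</> might look like the fragment below. As above, the directory is an example, not a recommendation, and the exact command should be adapted to your platform and storage.

```
wal_level = archive              # or hot_standby for a standby-capable server
archive_mode = on
archive_command = 'cp -i %p /mnt/server/archivedir/%f </dev/null'   # Unix
```

A server restart is needed for <varname>wal_level</> and <varname>archive_mode</> to take effect, while <varname>archive_command</> can later be changed with a configuration reload.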
597 The archive command will be executed under the ownership of the same
598 user that the <productname>PostgreSQL</> server is running as. Since
599 the series of WAL files being archived contains effectively everything
600 in your database, you will want to be sure that the archived data is
601 protected from prying eyes; for example, archive into a directory that
602 does not have group or world read access.
606 It is important that the archive command return zero exit status if and
607 only if it succeeds. Upon getting a zero result,
608 <productname>PostgreSQL</> will assume that the file has been
609 successfully archived, and will remove or recycle it. However, a nonzero
610 status tells <productname>PostgreSQL</> that the file was not archived;
611 it will try again periodically until it succeeds.
615 The archive command should generally be designed to refuse to overwrite
616 any pre-existing archive file. This is an important safety feature to
617 preserve the integrity of your archive in case of administrator error
618 (such as sending the output of two different servers to the same archive
619 directory).
620 It is advisable to test your proposed archive command to ensure that it
621 indeed does not overwrite an existing file, <emphasis>and that it returns
622 nonzero status in this case</>. On many Unix platforms, <command>cp
623 -i</> causes copy to prompt before overwriting a file, and
624 <literal>< /dev/null</> causes the prompt (and overwriting) to
625 fail. If your platform does not support this behavior, you should
626 add a command to test for the existence of the archive file. For
627 example, something like:
629 archive_command = 'test ! -f /mnt/server/archivedir/%f && cp %p /mnt/server/archivedir/%f'
631 works correctly on most Unix variants.
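The overwrite-refusal behavior of this command is easy to rehearse outside the server. In the sketch below, <literal>%p</> and <literal>%f</> are substituted by hand and a local directory stands in for <filename>/mnt/server/archivedir</>; all names are illustrative.

```shell
cd "$(mktemp -d)"
mkdir archivedir
echo 'WAL segment contents' > 00000001000000A900000065    # stand-in for %p

# Same shape as the archive_command above, with %p and %f already substituted.
archive() {
    test ! -f "archivedir/$1" && cp "$1" "archivedir/$1"
}

archive 00000001000000A900000065 && echo "first attempt archived the file (exit 0)"
archive 00000001000000A900000065 || echo "second attempt refused to overwrite (nonzero exit)"
```

The second invocation returns nonzero without touching the existing archive copy, which is exactly the behavior <productname>PostgreSQL</> needs in order to retry rather than silently clobber a file.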
635 While designing your archiving setup, consider what will happen if
636 the archive command fails repeatedly because some aspect requires
637 operator intervention or the archive runs out of space. For example, this
638 could occur if you write to tape without an autochanger; when the tape
639 fills, nothing further can be archived until the tape is swapped.
640 You should ensure that any error condition or request to a human operator
641 is reported appropriately so that the situation can be
642 resolved reasonably quickly. The <filename>pg_xlog/</> directory will
643 continue to fill with WAL segment files until the situation is resolved.
644 (If the file system containing <filename>pg_xlog/</> fills up,
645 <productname>PostgreSQL</> will do a PANIC shutdown. No committed
646 transactions will be lost, but the database will remain offline until
647 you free some space.)
651 The speed of the archiving command is unimportant as long as it can keep up
652 with the average rate at which your server generates WAL data. Normal
653 operation continues even if the archiving process falls a little behind.
654 If archiving falls significantly behind, this will increase the amount of
655 data that would be lost in the event of a disaster. It will also mean that
656 the <filename>pg_xlog/</> directory will contain large numbers of
657 not-yet-archived segment files, which could eventually exceed available
658 disk space. You are advised to monitor the archiving process to ensure that
659 it is working as you intend.
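One simple quantity to monitor is the number of completed segments still waiting to be archived: the server flags each pending segment with a <literal>.ready</> file under <filename>pg_xlog/archive_status/</>. The sketch below counts those flags against a fabricated data directory; on a real system you would point it at the cluster's actual data directory instead.

```shell
# A fabricated data directory stands in for a live cluster ($PGDATA here
# is just a local variable for this sketch).
cd "$(mktemp -d)"
PGDATA=$PWD
mkdir -p "$PGDATA/pg_xlog/archive_status"
touch "$PGDATA/pg_xlog/archive_status/00000001000000A900000065.ready" \
      "$PGDATA/pg_xlog/archive_status/00000001000000A900000066.ready"

# Each *.ready flag marks one completed segment not yet archived.
pending=$(ls "$PGDATA/pg_xlog/archive_status" | grep -c '\.ready$')
echo "segments waiting to be archived: $pending"
```

A steadily growing count is an early warning that the archive command is failing or falling behind, well before <filename>pg_xlog/</> itself fills up.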
663 In writing your archive command, you should assume that the file names to
664 be archived can be up to 64 characters long and can contain any
665 combination of ASCII letters, digits, and dots. It is not necessary to
666 preserve the original relative path (<literal>%p</>) but it is necessary to
667 preserve the file name (<literal>%f</>).
671 Note that although WAL archiving will allow you to restore any
672 modifications made to the data in your <productname>PostgreSQL</> database,
673 it will not restore changes made to configuration files (that is,
674 <filename>postgresql.conf</>, <filename>pg_hba.conf</> and
675 <filename>pg_ident.conf</>), since those are edited manually rather
676 than through SQL operations.
677 You might wish to keep the configuration files in a location that will
678 be backed up by your regular file system backup procedures. See
679 <xref linkend="runtime-config-file-locations"> for how to relocate the
680 configuration files.
684 The archive command is only invoked on completed WAL segments. Hence,
685 if your server generates little WAL traffic (or has slack periods
686 where it does so), there could be a long delay between the completion
687 of a transaction and its safe recording in archive storage. To put
688 a limit on how old unarchived data can be, you can set
689 <xref linkend="guc-archive-timeout"> to force the server to switch
690 to a new WAL segment file at least that often. Note that WAL files
691 archived early due to a forced switch are still the same
692 length as completely full files. It is therefore unwise to set a very
693 short <varname>archive_timeout</> — it will bloat your archive
694 storage. <varname>archive_timeout</> settings of a minute or so are
695 usually reasonable.
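For example, to cap the archiving delay at about a minute, one might add the following illustrative <filename>postgresql.conf</> fragment (the value is in seconds):

```
archive_timeout = 60    # force a segment switch at least once a minute
```

Remember that every forced switch produces a full-size segment in the archive, so lower values trade storage for a tighter bound on potential data loss.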
699 Also, you can force a segment switch manually with
700 <function>pg_switch_xlog</> if you want to ensure that a
701 just-finished transaction is archived as soon as possible. Other utility
702 functions related to WAL management are listed in <xref
703 linkend="functions-admin-backup-table">.
707 When <varname>wal_level</> is <literal>minimal</> some SQL commands
708 are optimized to avoid WAL logging, as described in <xref
709 linkend="populate-pitr">. If archiving or streaming replication were
710 turned on during execution of one of these statements, WAL would not
711 contain enough information for archive recovery. (Crash recovery is
712 unaffected.) For this reason, <varname>wal_level</> can only be changed at
713 server start. However, <varname>archive_command</> can be changed with a
714 configuration file reload. If you wish to temporarily stop archiving,
715 one way to do it is to set <varname>archive_command</> to the empty
716 string (<literal>''</>).
717 This will cause WAL files to accumulate in <filename>pg_xlog/</> until a
718 working <varname>archive_command</> is re-established.
722 <sect2 id="backup-base-backup">
723 <title>Making a Base Backup</title>
726 The procedure for making a base backup is relatively simple:
730 Ensure that WAL archiving is enabled and working.
735 Connect to the database as a superuser and issue the command:
737 SELECT pg_start_backup('label');
739 where <literal>label</> is any string you want to use to uniquely
740 identify this backup operation. (One good practice is to use the
741 full path where you intend to put the backup dump file.)
742 <function>pg_start_backup</> creates a <firstterm>backup label</> file,
743 called <filename>backup_label</>, in the cluster directory with
744 information about your backup, including the start time and label
745 string.
749 It does not matter which database within the cluster you connect to in order to
750 issue this command. You can ignore the result returned by the function;
751 but if it reports an error, deal with that before proceeding.
755 By default, <function>pg_start_backup</> can take a long time to finish.
756 This is because it performs a checkpoint, and the I/O
757 required for the checkpoint will be spread out over a significant
758 period of time, by default half your inter-checkpoint interval
759 (see the configuration parameter
760 <xref linkend="guc-checkpoint-completion-target">). This is
761 usually what you want, because it minimizes the impact on query
762 processing. If you want to start the backup as soon as
763 possible, use:
765 SELECT pg_start_backup('label', true);
767 This forces the checkpoint to be done as quickly as possible.
772 Perform the backup, using any convenient file-system-backup tool
773 such as <application>tar</> or <application>cpio</> (not
774 <application>pg_dump</application> or
775 <application>pg_dumpall</application>). It is neither
776 necessary nor desirable to stop normal operation of the database
777 while you do this.
782 Again connect to the database as a superuser, and issue the command:
784 SELECT pg_stop_backup();
786 This terminates the backup mode and performs an automatic switch to
787 the next WAL segment. The reason for the switch is to arrange for
788 the last WAL segment file written during the backup interval to be
789 ready to archive.
794 Once the WAL segment files active during the backup are archived, you are
795 done. The file identified by <function>pg_stop_backup</>'s result is
796 the last segment that is required to form a complete set of backup files.
797 If <varname>archive_mode</> is enabled,
798 <function>pg_stop_backup</> does not return until the last segment has
Archiving of these files happens automatically since you have
already configured <varname>archive_command</>. In most cases this
happens quickly, but you are advised to monitor your archive
system to ensure there are no delays.
If the archive process has fallen behind
because of failures of the archive command, it will keep retrying
until the archive succeeds and the backup is complete.
If you wish to place a time limit on the execution of
<function>pg_stop_backup</>, set an appropriate
<varname>statement_timeout</varname> value.
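For example, such a limit might be set for just this one session (a sketch only: the 300-second limit, user, and database names are illustrative, and the command of course requires a running server, so it cannot be checked mechanically here):

```shell
# Hypothetical example: give up on pg_stop_backup if it has not
# returned within 300 seconds, e.g. because archiving has stalled.
psql -U postgres -d postgres \
     -c "SET statement_timeout = '300s'; SELECT pg_stop_backup();"
```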
You can also use the <xref linkend="app-pgbasebackup"> tool to take
the backup, instead of manually copying the files. This tool takes
care of the <function>pg_start_backup()</>, copy and
<function>pg_stop_backup()</> steps automatically, and transfers the
backup over a regular <productname>PostgreSQL</productname> connection
using the replication protocol, instead of requiring file system level
access.
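An invocation might look like the following (a sketch: the host name and target directory are hypothetical, the option choices are illustrative, and the command requires a running server configured to accept replication connections, so it is not machine-checkable here):

```shell
# Hypothetical invocation: take a compressed, tar-format base backup
# over a replication connection, reporting progress as it runs.
pg_basebackup -h db.example.com -U postgres \
              -D /var/lib/pgsql/basebackup -F tar -z -P
```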
Some file system backup tools emit warnings or errors
if the files they are trying to copy change while the copy proceeds.
When taking a base backup of an active database, this situation is normal
and not an error. However, you need to ensure that you can distinguish
complaints of this sort from real errors. For example, some versions
of <application>rsync</> return a separate exit code for
<quote>vanished source files</>, and you can write a driver script to
accept this exit code as a non-error case. Also, some versions of
GNU <application>tar</> return an error code indistinguishable from
a fatal error if a file was truncated while <application>tar</> was
copying it. Fortunately, GNU <application>tar</> versions 1.16 and
later exit with <literal>1</> if a file was changed during the backup,
and <literal>2</> for other errors.
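Such a driver script might map the benign exit codes back to success. The sketch below assumes rsync's vanished-source-files code is 24 and GNU tar's file-changed code is 1; verify both against the tool versions you actually run:

```shell
#!/bin/sh
# Map "benign" exit codes from backup tools to success (0), while
# letting genuine failures through as failures.

# rsync: code 24 means "some source files vanished during transfer",
# which is expected when copying a live database cluster.
rsync_backup_status() {
    case "$1" in
        0|24) return 0 ;;
        *)    return 1 ;;
    esac
}

# GNU tar >= 1.16: code 1 means "some files changed while being read",
# also expected for a live cluster; 2 and above are real errors.
gnu_tar_backup_status() {
    case "$1" in
        0|1) return 0 ;;
        *)   return 1 ;;
    esac
}
```

A wrapper would run the copy, capture `$?`, and pass it through the appropriate function before reporting success or failure.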
It is not necessary to be concerned about the amount of time elapsed
between <function>pg_start_backup</> and the start of the actual backup,
nor between the end of the backup and <function>pg_stop_backup</>; a
few minutes' delay won't hurt anything. (However, if you normally run the
server with <varname>full_page_writes</> disabled, you might notice a drop
in performance between <function>pg_start_backup</> and
<function>pg_stop_backup</>, since <varname>full_page_writes</> is
effectively forced on during backup mode.) You must ensure that these
steps are carried out in sequence, without any possible
overlap, or you will invalidate the backup.
Be certain that your backup dump includes all of the files under
the database cluster directory (e.g., <filename>/usr/local/pgsql/data</>).
If you are using tablespaces that do not reside underneath this directory,
be careful to include them as well (and be sure that your backup dump
archives symbolic links as links, otherwise the restore will corrupt
your tablespaces).

You can, however, omit from the backup dump the files within the
cluster's <filename>pg_xlog/</> subdirectory. This
slight adjustment is worthwhile because it reduces the risk
of mistakes when restoring. This is easy to arrange if
<filename>pg_xlog/</> is a symbolic link pointing to someplace outside
the cluster directory, which is a common setup anyway for performance
reasons.
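The copy step can then be sketched as a small helper that skips <filename>pg_xlog</> (the function name and paths here are hypothetical, and GNU <application>tar</> is assumed):

```shell
#!/bin/sh
# base_backup_tar <cluster-data-directory> <output-tar-file>
# Archive the cluster directory while skipping the pg_xlog subdirectory;
# GNU tar's --exclude matches the directory name anywhere in the tree.
base_backup_tar() {
    datadir=$1
    outfile=$2
    tar --exclude=pg_xlog -cf "$outfile" \
        -C "$(dirname "$datadir")" "$(basename "$datadir")"
}
```

Remember that tablespace directories outside the data directory must be archived separately, and that symbolic links must be preserved as links (GNU tar does this by default).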
To make use of the backup, you will need to keep all the WAL
segment files generated during and after the file system backup.
To aid you in doing this, the <function>pg_stop_backup</> function
creates a <firstterm>backup history file</> that is immediately
stored into the WAL archive area. This file is named after the first
WAL segment file that you need for the file system backup.
For example, if the starting WAL file is
<literal>0000000100001234000055CD</> the backup history file will be
named something like
<literal>0000000100001234000055CD.007C9330.backup</>. (The second
part of the file name stands for an exact position within the WAL
file, and can ordinarily be ignored.) Once you have safely archived
the file system backup and the WAL segment files used during the
backup (as specified in the backup history file), all archived WAL
segments with numerically lesser names are no longer needed to recover
the file system backup and can be deleted. However, you should
consider keeping several backup sets to be absolutely certain that
you can recover your data.
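The <application>pg_archivecleanup</> contrib program can automate this pruning. For example (a sketch: the archive path is hypothetical, the program requires the binary to be installed, and you should verify the option syntax against your version):

```shell
# Remove archived WAL segments older than the backup identified by the
# given backup history file; -d prints each file as it is removed.
pg_archivecleanup -d /mnt/server/archivedir \
    0000000100001234000055CD.007C9330.backup
```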
The backup history file is just a small text file. It contains the
label string you gave to <function>pg_start_backup</>, as well as
the starting and ending times and WAL segments of the backup.
If you used the label to identify the associated dump file,
then the archived history file is enough to tell you which dump file to
restore.

Since you have to keep around all the archived WAL files back to your
last base backup, the interval between base backups should usually be
chosen based on how much storage you want to expend on archived WAL
files. You should also consider how long you are prepared to spend
recovering, if recovery should be necessary &mdash; the system will have to
replay all those WAL segments, and that could take a while if it has
been a long time since the last base backup.
It's also worth noting that the <function>pg_start_backup</> function
makes a file named <filename>backup_label</> in the database cluster
directory, which is removed by <function>pg_stop_backup</>.
This file will of course be archived as a part of your backup dump file.
The backup label file includes the label string you gave to
<function>pg_start_backup</>, as well as the time at which
<function>pg_start_backup</> was run, and the name of the starting WAL
file. In case of confusion it is
therefore possible to look inside a backup dump file and determine
exactly which backup session the dump file came from.
It is also possible to make a backup dump while the server is
stopped. In this case, you obviously cannot use
<function>pg_start_backup</> or <function>pg_stop_backup</>, and
you will therefore be left to your own devices to keep track of which
backup dump is which and how far back the associated WAL files go.
It is generally better to follow the continuous archiving procedure above.
<sect2 id="backup-pitr-recovery">
<title>Recovering using a Continuous Archive Backup</title>

Okay, the worst has happened and you need to recover from your backup.
Here is the procedure:
Stop the server, if it's running.

If you have the space to do so,
copy the whole cluster data directory and any tablespaces to a temporary
location in case you need them later. Note that this precaution will
require that you have enough free space on your system to hold two
copies of your existing database. If you do not have enough space,
you should at least save the contents of the cluster's <filename>pg_xlog</>
subdirectory, as it might contain logs which
were not archived before the system went down.

Remove all existing files and subdirectories under the cluster data
directory and under the root directories of any tablespaces you are using.

Restore the database files from your file system backup. Be sure that they
are restored with the right ownership (the database system user, not
<literal>root</>!) and with the right permissions. If you are using
tablespaces,
you should verify that the symbolic links in <filename>pg_tblspc/</>
were correctly restored.

Remove any files present in <filename>pg_xlog/</>; these came from the
file system backup and are therefore probably obsolete rather than current.
If you didn't archive <filename>pg_xlog/</> at all, then recreate
it with proper permissions,
being careful to ensure that you re-establish it as a symbolic link
if you had it set up that way before.

If you have unarchived WAL segment files that you saved in step 2,
copy them into <filename>pg_xlog/</>. (It is best to copy them,
not move them, so you still have the unmodified files if a
problem occurs and you have to start over.)

Create a recovery command file <filename>recovery.conf</> in the cluster
data directory (see <xref linkend="recovery-config">). You might
also want to temporarily modify <filename>pg_hba.conf</> to prevent
ordinary users from connecting until you are sure the recovery was successful.

Start the server. The server will go into recovery mode and
proceed to read through the archived WAL files it needs. Should the
recovery be terminated because of an external error, the server can
simply be restarted and it will continue recovery. Upon completion
of the recovery process, the server will rename
<filename>recovery.conf</> to <filename>recovery.done</> (to prevent
accidentally re-entering recovery mode later) and then
commence normal database operations.

Inspect the contents of the database to ensure you have recovered to
the desired state. If not, return to step 1. If all is well,
allow your users to connect by restoring <filename>pg_hba.conf</> to normal.
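The steps above can be sketched as a shell session. This is illustrative only: every path is hypothetical, it assumes the server is already stopped and that the base backup was taken with tar, and a real script needs careful error handling before running anything this destructive:

```shell
#!/bin/sh
# Illustrative restore sequence; do not run as-is.
PGDATA=/usr/local/pgsql/data

# Step 2: save anything that might still be needed, at minimum pg_xlog.
cp -a "$PGDATA/pg_xlog" /tmp/saved_pg_xlog

# Step 3: clear the old cluster; step 4: restore the base backup.
rm -rf "$PGDATA"/*
tar -xf /mnt/server/backup.tar -C "$(dirname "$PGDATA")"

# Steps 5 and 6: start with an empty pg_xlog, then copy in any
# unarchived segments saved earlier.
rm -rf "$PGDATA/pg_xlog"
mkdir -m 700 "$PGDATA/pg_xlog"
cp /tmp/saved_pg_xlog/* "$PGDATA/pg_xlog/"

# Steps 7 and 8: tell the server how to fetch archived WAL, then start.
echo "restore_command = 'cp /mnt/server/archivedir/%f %p'" \
    > "$PGDATA/recovery.conf"
pg_ctl -D "$PGDATA" start
```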
The key part of all this is to set up a recovery configuration file that
describes how you want to recover and how far the recovery should
run. You can use <filename>recovery.conf.sample</> (normally
located in the installation's <filename>share/</> directory) as a
prototype. The one thing that you absolutely must specify in
<filename>recovery.conf</> is the <varname>restore_command</>,
which tells <productname>PostgreSQL</> how to retrieve archived
WAL file segments. Like the <varname>archive_command</>, this is
a shell command string. It can contain <literal>%f</>, which is
replaced by the name of the desired log file, and <literal>%p</>,
which is replaced by the path name to copy the log file to.
(The path name is relative to the current working directory,
i.e., the cluster's data directory.)
Write <literal>%%</> if you need to embed an actual <literal>%</>
character in the command. The simplest useful command is
restore_command = 'cp /mnt/server/archivedir/%f %p'
which will copy previously archived WAL segments from the directory
<filename>/mnt/server/archivedir</>. Of course, you can use something
much more complicated, perhaps even a shell script that requests the
operator to mount an appropriate tape.
It is important that the command return nonzero exit status on failure.
The command <emphasis>will</> be called requesting files that are not present
in the archive; it must return nonzero when so asked. This is not an
error condition. Not all of the requested files will be WAL segment
files; you should also expect requests for files with a suffix of
<literal>.backup</> or <literal>.history</>. Also be aware that
the base name of the <literal>%p</> path will be different from
<literal>%f</>; do not expect them to be interchangeable.
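A slightly more defensive restore command might be a small script along these lines (a sketch: the archive location and function name are hypothetical, and the function would be invoked with the <literal>%f</> and <literal>%p</> values):

```shell
#!/bin/sh
# restore_one_wal <wal-file-name> <destination-path>
# Copy one file from the archive; return nonzero (quietly) when it is
# absent, since the server routinely probes for files that don't exist.
ARCHIVE_DIR=${ARCHIVE_DIR:-/mnt/server/archivedir}

restore_one_wal() {
    if [ -f "$ARCHIVE_DIR/$1" ]; then
        cp "$ARCHIVE_DIR/$1" "$2"
    else
        return 1
    fi
}
```

The quiet failure path matters: a noisy error for every missing file would flood the server log during normal recovery.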
WAL segments that cannot be found in the archive will be sought in
<filename>pg_xlog/</>; this allows use of recent un-archived segments.
However, segments that are available from the archive will be used in
preference to files in <filename>pg_xlog/</>. The system will not
overwrite the existing contents of <filename>pg_xlog/</> when retrieving
archived files.
Normally, recovery will proceed through all available WAL segments,
thereby restoring the database to the current point in time (or as
close as possible given the available WAL segments). Therefore, a normal
recovery will end with a <quote>file not found</> message, the exact text
of the error message depending upon your choice of
<varname>restore_command</>. You may also see an error message
at the start of recovery for a file named something like
<filename>00000001.history</>. This is also normal and does not
indicate a problem in simple recovery situations; see
<xref linkend="backup-timelines"> for discussion.
If you want to recover to some previous point in time (say, right before
the junior DBA dropped your main transaction table), just specify the
required stopping point in <filename>recovery.conf</>. You can specify
the stop point, known as the <quote>recovery target</>, either by
date/time or by completion of a specific transaction ID. As of this
writing only the date/time option is very usable, since there are no tools
to help you identify with any accuracy which transaction ID to use.
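For example, a <filename>recovery.conf</> that stops replay just before a mistake made at 5:15PM might look like this (a sketch; the timestamp, time zone, and archive path are illustrative):

```
restore_command = 'cp /mnt/server/archivedir/%f %p'
# Stop replaying WAL at the last commit before this time.
recovery_target_time = '2011-05-17 17:14:59 EST'
```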
The stop point must be after the ending time of the base backup, i.e.,
the end time of <function>pg_stop_backup</>. You cannot use a base backup
to recover to a time when that backup was in progress. (To
recover to such a time, you must go back to your previous base backup
and roll forward from there.)
If recovery finds corrupted WAL data, recovery will
halt at that point and the server will not start. In such a case the
recovery process can be re-run from the beginning, specifying a
<quote>recovery target</> before the point of corruption so that recovery
can complete normally.
If recovery fails for an external reason, such as a system crash, or
if the WAL archive has become inaccessible, then the recovery can simply
be restarted and it will restart almost from where it failed.
Recovery restart works much like checkpointing in normal operation:
the server periodically forces all its state to disk, and then updates
the <filename>pg_control</> file to indicate that the already-processed
WAL data need not be scanned again.
<sect2 id="backup-timelines">
<title>Timelines</title>

<indexterm zone="backup">
<primary>timelines</primary>
</indexterm>

The ability to restore the database to a previous point in time creates
some complexities that are akin to science-fiction stories about time
travel and parallel universes. For example, in the original history of the database,
suppose you dropped a critical table at 5:15PM on Tuesday evening, but
didn't realize your mistake until Wednesday noon.
Unfazed, you get out your backup, restore to the point-in-time 5:14PM
Tuesday evening, and are up and running. In <emphasis>this</> history of
the database universe, you never dropped the table. But suppose
you later realize this wasn't such a great idea, and would like
to return to sometime Wednesday morning in the original history.
You won't be able
to if, while your database was up and running, it overwrote some of the
WAL segment files that led up to the time you now wish you
could get back to. Thus, to avoid this, you need to distinguish the series of
WAL records generated after you've done a point-in-time recovery from
those that were generated in the original database history.
To deal with this problem, <productname>PostgreSQL</> has a notion
of <firstterm>timelines</>. Whenever an archive recovery completes,
a new timeline is created to identify the series of WAL records
generated after that recovery. The timeline
ID number is part of WAL segment file names so a new timeline does
not overwrite the WAL data generated by previous timelines. It is
in fact possible to archive many different timelines. While that might
seem like a useless feature, it's often a lifesaver. Consider the
situation where you aren't quite sure what point-in-time to recover to,
and so have to do several point-in-time recoveries by trial and error
until you find the best place to branch off from the old history. Without
timelines this process would soon generate an unmanageable mess. With
timelines, you can recover to <emphasis>any</> prior state, including
states in timeline branches that you abandoned earlier.
Every time a new timeline is created, <productname>PostgreSQL</> creates
a <quote>timeline history</> file that shows which timeline it branched
off from and when. These history files are necessary to allow the system
to pick the right WAL segment files when recovering from an archive that
contains multiple timelines. Therefore, they are archived into the WAL
archive area just like WAL segment files. The history files are just
small text files, so it's cheap and appropriate to keep them around
indefinitely (unlike the segment files which are large). You can, if
you like, add comments to a history file to record your own notes about
how and why this particular timeline was created. Such comments will be
especially valuable when you have a thicket of different timelines as
a result of experimentation.
The default behavior of recovery is to recover along the same timeline
that was current when the base backup was taken. If you wish to recover
into some child timeline (that is, you want to return to some state that
was itself generated after a recovery attempt), you need to specify the
target timeline ID in <filename>recovery.conf</>. You cannot recover into
timelines that branched off earlier than the base backup.
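For instance, recovering into a child timeline might add a line like the following to <filename>recovery.conf</> (a sketch; the timeline ID 3 is illustrative, and you would read the correct ID out of the timeline history files in your archive):

```
restore_command = 'cp /mnt/server/archivedir/%f %p'
# Recover along timeline 3 rather than the base backup's own timeline.
recovery_target_timeline = '3'
```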
<sect2 id="backup-tips">
<title>Tips and Examples</title>

Some tips for configuring continuous archiving are given here.

<sect3 id="backup-standalone">
<title>Standalone hot backups</title>

It is possible to use <productname>PostgreSQL</>'s backup facilities to
produce standalone hot backups. These are backups that cannot be used
for point-in-time recovery, yet are typically much faster to back up and
restore than <application>pg_dump</> dumps. (They are also much larger
than <application>pg_dump</> dumps, so in some cases the speed advantage
might be negated.)

To prepare for standalone hot backups, set <varname>wal_level</> to
<literal>archive</> (or <literal>hot_standby</>), <varname>archive_mode</> to
<literal>on</>, and set up an <varname>archive_command</> that performs
archiving only when a <emphasis>switch file</> exists. For example:
archive_command = 'test ! -f /var/lib/pgsql/backup_in_progress || cp -i %p /var/lib/pgsql/archive/%f < /dev/null'
This command will perform archiving when
<filename>/var/lib/pgsql/backup_in_progress</> exists, and otherwise
silently return zero exit status (allowing <productname>PostgreSQL</>
to recycle the unwanted WAL file).
With this preparation, a backup can be taken using a script like the
following:
touch /var/lib/pgsql/backup_in_progress
psql -c "select pg_start_backup('hot_backup');"
tar -cf /var/lib/pgsql/backup.tar /var/lib/pgsql/data/
psql -c "select pg_stop_backup();"
rm /var/lib/pgsql/backup_in_progress
tar -rf /var/lib/pgsql/backup.tar /var/lib/pgsql/archive/
The switch file <filename>/var/lib/pgsql/backup_in_progress</> is
created first, enabling archiving of completed WAL files to occur.
After the backup the switch file is removed. Archived WAL files are
then added to the backup so that both base backup and all required
WAL files are part of the same <application>tar</> file.
Please remember to add error handling to your backup scripts.
If archive storage size is a concern, use <application>pg_compresslog</>,
<ulink url="http://pglesslog.projects.postgresql.org"></ulink>, to
remove unnecessary <xref linkend="guc-full-page-writes"> and trailing
space from the WAL files. You can then use
<application>gzip</application> to further compress the output of
<application>pg_compresslog</>:
archive_command = 'pg_compresslog %p - | gzip > /var/lib/pgsql/archive/%f'
You will then need to use <application>gunzip</> and
<application>pg_decompresslog</> during recovery:
restore_command = 'gunzip < /mnt/server/archivedir/%f | pg_decompresslog - %p'
<sect3 id="backup-scripts">
<title><varname>archive_command</varname> scripts</title>

Many people choose to use scripts to define their
<varname>archive_command</varname>, so that their
<filename>postgresql.conf</> entry looks very simple:
archive_command = 'local_backup_script.sh'
Using a separate script file is advisable any time you want to use
more than a single command in the archiving process.
This allows all complexity to be managed within the script, which
can be written in a popular scripting language such as
<application>bash</> or <application>perl</>.
Any messages written to <literal>stderr</> from the script will appear
in the database server log, allowing complex configurations to be
diagnosed easily if they fail.
Examples of requirements that might be solved within a script include:

Copying data to secure off-site data storage

Batching WAL files so that they are transferred every three hours,
rather than one at a time

Interfacing with other backup and recovery software

Interfacing with monitoring software to report errors
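The core of such a script can be sketched as follows. Everything here is hypothetical: the function assumes it is handed the <literal>%p</> and <literal>%f</> values (e.g. <literal>archive_command = 'local_backup_script.sh %p %f'</>), and <literal>ARCHIVE_DIR</> would point at your real archive location:

```shell
#!/bin/sh
# archive_one_wal <source-path> <file-name>
# Copy a completed WAL segment into the archive. Refuses to overwrite
# an existing segment, and copies via a temporary name so that a crash
# cannot leave a truncated file that looks validly archived.
ARCHIVE_DIR=${ARCHIVE_DIR:-/var/lib/pgsql/archive}

archive_one_wal() {
    src=$1
    name=$2
    if [ -f "$ARCHIVE_DIR/$name" ]; then
        echo "WAL segment $name already archived" >&2
        return 1
    fi
    cp "$src" "$ARCHIVE_DIR/$name.tmp" &&
        mv "$ARCHIVE_DIR/$name.tmp" "$ARCHIVE_DIR/$name"
}
```

Note that the error message goes to stderr, so (as described above) it will show up in the database server log.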
<sect2 id="continuous-archiving-caveats">
<title>Caveats</title>

At this writing, there are several limitations of the continuous archiving
technique. These will probably be fixed in future releases:

Operations on hash indexes are not presently WAL-logged, so
replay will not update these indexes. This means that any new inserts
will be ignored by the index, updated rows will apparently disappear, and
deleted rows will still retain pointers. In other words, if you modify a
table with a hash index on it then you will get incorrect query results
on a standby server. After recovery completes, it is recommended that you
manually <xref linkend="sql-reindex">
each such index.
If a <xref linkend="sql-createdatabase">
command is executed while a base backup is being taken, and then
the template database that the <command>CREATE DATABASE</> copied
is modified while the base backup is still in progress, it is
possible that recovery will cause those modifications to be
propagated into the created database as well. This is of course
undesirable. To avoid this risk, it is best not to modify any
template databases while taking a base backup.
<xref linkend="sql-createtablespace">
commands are WAL-logged with the literal absolute path, and will
therefore be replayed as tablespace creations with the same
absolute path. This might be undesirable if the log is being
replayed on a different machine. It can be dangerous even if the
log is being replayed on the same machine, but into a new data
directory: the replay will still overwrite the contents of the
original tablespace. To avoid potential gotchas of this sort,
the best practice is to take a new base backup after creating or
dropping tablespaces.
It should also be noted that the default <acronym>WAL</acronym>
format is fairly bulky since it includes many disk page snapshots.
These page snapshots are designed to support crash recovery, since
we might need to fix partially-written disk pages. Depending on
your system hardware and software, the risk of partial writes might
be small enough to ignore, in which case you can significantly
reduce the total volume of archived logs by turning off page
snapshots using the <xref linkend="guc-full-page-writes">
parameter. (Read the notes and warnings in <xref linkend="wal">
before you do so.) Turning off page snapshots does not prevent
use of the logs for PITR operations. An area for future
development is to compress archived WAL data by removing
unnecessary page copies even when <varname>full_page_writes</> is
on. In the meantime, administrators might wish to reduce the number
of page snapshots included in WAL by increasing the checkpoint
interval parameters as much as feasible.
<sect1 id="migration">
<title>Migration Between Releases</title>

<indexterm zone="migration">
<primary>upgrading</primary>
</indexterm>

<indexterm zone="migration">
<primary>version</primary>
<secondary>compatibility</secondary>
</indexterm>

This section discusses how to migrate your database data from one
<productname>PostgreSQL</> release to a newer one.
The software installation procedure <foreignphrase>per se</> is not the
subject of this section; those details are in <xref linkend="installation">.
<productname>PostgreSQL</> major versions are represented by the
first two digit groups of the version number, e.g., 8.4.
<productname>PostgreSQL</> minor versions are represented by the
third group of version digits, e.g., 8.4.2 is the second minor
release of 8.4. Minor releases never change the internal storage
format and are always compatible with earlier and later minor
releases of the same major version number, e.g., 8.4.2 is compatible
with 8.4, 8.4.1 and 8.4.6. To update between compatible versions,
you simply replace the executables while the server is down and
restart the server. The data directory remains unchanged &mdash;
minor upgrades are that simple.
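For a source install, the minor-version update just described might look like this (a sketch: paths and the install command depend on your packaging, the version number is illustrative, and the server must be down during the swap):

```shell
# Stop the running server, install the new minor-release binaries over
# the old ones, and restart; the data directory is left untouched.
pg_ctl -D /usr/local/pgsql/data stop
cd ~/postgresql-8.4.2 && gmake install
pg_ctl -D /usr/local/pgsql/data start
```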
For <emphasis>major</> releases of <productname>PostgreSQL</>, the
internal data storage format is subject to change, thus complicating
upgrades. The traditional method for moving data to a new major version
is to dump and reload the database. Other, less-well-tested possibilities
are available, as discussed below.

New major versions also typically introduce some user-visible
incompatibilities, so application programming changes may be required.
Cautious users will want to test their client applications on the new
version before switching over fully; therefore, it's often a good idea to
set up concurrent installations of old and new versions. When
testing a <productname>PostgreSQL</> major upgrade, consider the
following categories of possible changes:
<term>Administration</term>

The capabilities available for administrators to monitor and control
the server often change and improve in each major release.

<term>SQL</term>

Typically this includes new SQL command capabilities and not changes
in behavior, unless specifically mentioned in the release notes.

<term>Library API</term>

Typically libraries like <application>libpq</> only add new
functionality, again unless mentioned in the release notes.

<term>System Catalogs</term>

System catalog changes usually only affect database management tools.

<term>Server C-language API</term>

This involves changes in the backend function API, which is written
in the C programming language. Such changes affect code that
references backend functions deep inside the server.
<sect2 id="migration-methods-pgdump">
<title>Migrating data via <application>pg_dump</></title>

To dump data from one major version of <productname>PostgreSQL</> and
reload it in another, you must use <application>pg_dump</>; file system
level backup methods will not work. (There are checks in place that prevent
you from using a data directory with an incompatible version of
<productname>PostgreSQL</productname>, so no great harm can be done by
trying to start the wrong server version on a data directory.)
It is recommended that you use the <application>pg_dump</> and
<application>pg_dumpall</> programs from the newer version of
<productname>PostgreSQL</>, to take advantage of enhancements
that might have been made in these programs. Current releases of the
dump programs can read data from any server version back to 7.0.
The least downtime can be achieved by installing the new server in
a different directory and running both the old and the new servers
in parallel, on different ports. Then you can use something like:
pg_dumpall -p 5432 | psql -d postgres -p 6543
to transfer your data. Or you can use an intermediate file if you wish.
Then you can shut down the old server and start the new server using
the port the old one was running on. You should make sure that the
old database is not updated after you begin to run
<application>pg_dumpall</>, otherwise you will lose those updates. See
<xref linkend="client-authentication"> for information on how to prohibit
access.
If you cannot or do not want to run two servers in parallel, you can
do the backup step before installing the new version, bring down
the old server, move the old version out of the way, install the new
version, start the new server, and restore the data. For example:
pg_dumpall > backup
pg_ctl stop
mv /usr/local/pgsql /usr/local/pgsql.old
# Rename any tablespace directories as well
cd ~/postgresql-&version;
gmake install
initdb -D /usr/local/pgsql/data
postgres -D /usr/local/pgsql/data
psql -f backup postgres
See <xref linkend="runtime"> about ways to start and stop the
server and other details. The installation instructions will advise
you of strategic places to perform these steps.
When you <quote>move the old installation out of the way</quote>
it might no longer be perfectly usable. Some of the executable programs
contain absolute paths to various installed programs and data files.
This is usually not a big problem, but if you plan on using two
installations in parallel for a while you should assign them
different installation directories at build time. (This problem
is rectified in <productname>PostgreSQL</> version 8.0 and later, so long
as you move all subdirectories containing installed files together;
for example if <filename>/usr/local/postgres/bin/</> goes to
<filename>/usr/local/postgres.old/bin/</>, then
<filename>/usr/local/postgres/share/</> must go to
<filename>/usr/local/postgres.old/share/</>. In pre-8.0 releases
moving an installation like this will not work.)
<sect2 id="migration-methods-other">
<title>Other data migration methods</title>

The <filename>contrib</> program
<link linkend="pgupgrade"><application>pg_upgrade</application></link>
allows an installation to be migrated in-place from one major
<productname>PostgreSQL</> version to the next. Keep in mind that this
method does not provide any scope for running old and new versions
concurrently. Also, <application>pg_upgrade</application> is much less
battle-tested than <application>pg_dump</application>, so having an
up-to-date backup is strongly recommended in case something goes wrong.
It is also possible to use certain replication methods, such as
<productname>Slony</>, to create a standby server with the updated version of
<productname>PostgreSQL</>. The standby can be on the same computer or
a different computer. Once it has synced up with the master server
(running the older version of <productname>PostgreSQL</>), you can
switch over, making the standby the new master, and shut down the older
database instance. Such a switch-over results in only several seconds
of downtime for an upgrade.