From 958db06181a41ac3d63bb55478fbad9b55b58e6a Mon Sep 17 00:00:00 2001
From: Heikki Linnakangas <heikki.linnakangas@iki.fi>
Date: Fri, 28 Mar 2008 15:00:28 +0000
Subject: [PATCH] Clarify documentation on PITR and warm standby on the fact
 that the standby restore_command should report failure on non-existent
 .backup and .history files. Tidy up some related text along the way.

Patch by Markus Bertheau, with some editing by Simon Riggs and myself.
---
 doc/src/sgml/backup.sgml | 50 ++++++++++++++++++++++------------------
 1 file changed, 27 insertions(+), 23 deletions(-)

diff --git a/doc/src/sgml/backup.sgml b/doc/src/sgml/backup.sgml
index 8131bcdf84..9f369f8e6d 100644
--- a/doc/src/sgml/backup.sgml
+++ b/doc/src/sgml/backup.sgml
@@ -1,4 +1,4 @@
-<!-- $PostgreSQL: pgsql/doc/src/sgml/backup.sgml,v 2.115 2008/03/07 01:46:41 momjian Exp $ -->
+<!-- $PostgreSQL: pgsql/doc/src/sgml/backup.sgml,v 2.116 2008/03/28 15:00:28 heikki Exp $ -->
 
 <chapter id="backup">
  <title>Backup and Restore</title>
@@ -577,11 +577,10 @@ cp -i pg_xlog/00000001000000A900000065 /mnt/server/archivedir/00000001000000A900
    <para>
     It is important that the archive command return zero exit status if and
     only if it succeeded. Upon getting a zero result,
-    <productname>PostgreSQL</> will assume that the WAL segment file has been
-    successfully archived, and will remove or recycle it.
-    However, a nonzero status tells
-    <productname>PostgreSQL</> that the file was not archived; it will try
-    again periodically until it succeeds.
+    <productname>PostgreSQL</> will assume that the file has been
+    successfully archived, and will remove or recycle it. However, a nonzero
+    status tells <productname>PostgreSQL</> that the file was not archived;
+    it will try again periodically until it succeeds.
    </para>
 
    <para>
@@ -1001,11 +1000,13 @@ restore_command = 'cp /mnt/server/archivedir/%f %p'
 
    <para>
     It is important that the command return nonzero exit status on failure.
-    The command <emphasis>will</> be asked for log files that are not present
+    The command <emphasis>will</> be asked for files that are not present
     in the archive; it must return nonzero when so asked. This is not an
-    error condition. Be aware also that the base name of the <literal>%p</>
-    path will be different from <literal>%f</>; do not expect them to be
-    interchangeable.
+    error condition. Not all of the requested files will be WAL segment
+    files; you should also expect requests for files with a suffix of
+    <literal>.backup</> or <literal>.history</>. Also be aware that
+    the base name of the <literal>%p</> path will be different from
+    <literal>%f</>; do not expect them to be interchangeable.
    </para>
 
    <para>
@@ -1576,19 +1577,21 @@ archive_command = 'local_backup_script.sh'
 
    <para>
     The magic that makes the two loosely coupled servers work together is
-    simply a <varname>restore_command</> used on the standby that waits
-    for the next WAL file to become available from the primary. The
-    <varname>restore_command</> is specified in the
+    simply a <varname>restore_command</> used on the standby that,
+    when asked for the next WAL file, waits for it to become available from
+    the primary. The <varname>restore_command</> is specified in the
     <filename>recovery.conf</> file on the standby server. Normal recovery
     processing would request a file from the WAL archive, reporting failure
     if the file was unavailable. For standby processing it is normal for
-    the next file to be unavailable, so we must be patient and wait for
-    it to appear. A waiting <varname>restore_command</> can be written as
-    a custom script that loops after polling for the existence of the next
-    WAL file. There must also be some way to trigger failover, which should
-    interrupt the <varname>restore_command</>, break the loop and return
-    a file-not-found error to the standby server. This ends recovery and
-    the standby will then come up as a normal server.
+    the next WAL file to be unavailable, so we must be patient and wait for
+    it to appear. For files ending in <literal>.backup</> or
+    <literal>.history</> there is no need to wait, and a non-zero return
+    code must be returned. A waiting <varname>restore_command</> can be
+    written as a custom script that loops after polling for the existence of
+    the next WAL file. There must also be some way to trigger failover, which
+    should interrupt the <varname>restore_command</>, break the loop and
+    return a file-not-found error to the standby server. This ends recovery
+    and the standby will then come up as a normal server.
    </para>
 
    <para>
@@ -1608,9 +1611,10 @@ if (!triggered)
 
    <para>
     A working example of a waiting <varname>restore_command</> is provided
-    as a <filename>contrib</> module named <application>pg_standby</>. This
-    example can be extended as needed to support specific configurations or
-    environments.
+    as a <filename>contrib</> module named <application>pg_standby</>. It
+    should be used as a reference on how to correctly implement the logic
+    described above. It can also be extended as needed to support specific
+    configurations or environments.
    </para>
 
    <para>
-- 
2.49.0