granicus.if.org Git - postgresql/commitdiff
author Heikki Linnakangas <heikki.linnakangas@iki.fi>
Mon, 14 Jun 2010 06:04:21 +0000 (06:04 +0000)
committer Heikki Linnakangas <heikki.linnakangas@iki.fi>
Mon, 14 Jun 2010 06:04:21 +0000 (06:04 +0000)

If a corrupt WAL record is received by streaming replication, disconnect
and retry. If the record is genuinely corrupt in the master database,
there's little hope of recovering, but it's better than simply retrying
to apply the corrupt WAL record in a tight loop without even trying to
retransmit it, which is what we used to do.
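
The decision described above can be sketched in isolation as a small state function. This is only an illustrative model, not the code in xlog.c: the `SRC_*` flag values, the `StandbySketch` struct, the `NextAction` enum and `next_action()` are all hypothetical names introduced here, and the non-streaming branch is reduced to a single return value because the diff excerpt does not show it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical source flags, modeled on the XLOG_FROM_* bits used in the
 * diff; the real values in xlog.c may differ. */
#define SRC_ARCHIVE 0x01
#define SRC_PG_XLOG 0x02
#define SRC_STREAM  0x04

/* Toy standby state (an assumption for illustration, not a PostgreSQL struct). */
typedef struct
{
    bool receiver_running;  /* stands in for WalRcvInProgress() */
    int  failed_sources;    /* bitmask of sources that produced an invalid record */
    int  disconnects;       /* number of times the receiver was shut down */
    int  waits;             /* number of times we chose to wait for more WAL */
} StandbySketch;

typedef enum
{
    ACTION_WAIT_FOR_WAL,         /* keep streaming, wait for new WAL to arrive */
    ACTION_DISCONNECT_AND_RETRY, /* drop the connection, go around the loop again */
    ACTION_TRY_OTHER_SOURCES     /* receiver not running: archive/pg_xlog path */
} NextAction;

/* Decide what one pass of the retry loop should do. */
static NextAction
next_action(StandbySketch *s)
{
    assert(s != NULL);

    if (s->receiver_running)
    {
        /* An invalid record arrived over the stream: disconnect instead of
         * re-applying the same bad record in a tight loop. */
        if (s->failed_sources & SRC_STREAM)
        {
            s->receiver_running = false;  /* stands in for ShutdownWalRcv() */
            s->disconnects++;
            return ACTION_DISCONNECT_AND_RETRY;
        }

        /* Stream is healthy so far: wait for more WAL from the primary. */
        s->waits++;
        return ACTION_WAIT_FOR_WAL;
    }

    /* Simplification: everything the non-streaming branch does is folded
     * into this one outcome. */
    return ACTION_TRY_OTHER_SOURCES;
}
```

In this model, a state with `receiver_running = true` and `SRC_STREAM` set in `failed_sources` yields `ACTION_DISCONNECT_AND_RETRY` once, and the next call yields `ACTION_TRY_OTHER_SOURCES`, which mirrors the "disconnect, then retry from archive/pg_xlog" behavior the commit message describes.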

src/backend/access/transam/xlog.c

index a72d7f24da03d1ce33ec64685227bce968dfe8df..5787b3d164c95bba1732d9d416fc65e0c6a044de 100644
@@ -7,7 +7,7 @@
  * Portions Copyright (c) 1996-2010, PostgreSQL Global Development Group
  * Portions Copyright (c) 1994, Regents of the University of California
  *
- * $PostgreSQL: pgsql/src/backend/access/transam/xlog.c,v 1.423 2010/06/12 09:14:52 petere Exp $
+ * $PostgreSQL: pgsql/src/backend/access/transam/xlog.c,v 1.424 2010/06/14 06:04:21 heikki Exp $
  *
  *-------------------------------------------------------------------------
  */
@@ -9270,6 +9270,22 @@ retry:
                        {
                                if (WalRcvInProgress())
                                {
+                                       /*
+                                        * If we find an invalid record in the WAL streamed from
+                                        * master, something is seriously wrong. There's little
+                                        * chance that the problem will just go away, but PANIC
+                                        * is not good for availability either, especially in
+                                        * hot standby mode. Disconnect, and retry from
+                                        * archive/pg_xlog again. The WAL in the archive should
+                                        * be identical to what was streamed, so it's unlikely
+                                        * that it helps, but one can hope...
+                                        */
+                                       if (failedSources & XLOG_FROM_STREAM)
+                                       {
+                                               ShutdownWalRcv();
+                                               continue;
+                                       }
+
                                        /*
                                         * While walreceiver is active, wait for new WAL to arrive
                                         * from primary.