Fix postmaster to attempt restart after a hot-standby crash.
author Tom Lane <tgl@sss.pgh.pa.us>
Mon, 6 Feb 2012 20:29:26 +0000 (15:29 -0500)
committer Tom Lane <tgl@sss.pgh.pa.us>
Mon, 6 Feb 2012 20:29:41 +0000 (15:29 -0500)
The postmaster was coded to treat any unexpected exit of the startup
process (i.e., the WAL replay process) as a catastrophic crash, and not try
to restart it. This was OK so long as the startup process could not have
any sibling postmaster children.  However, if a hot-standby backend
crashes, we SIGQUIT the startup process along with everything else, and the
resulting exit is hardly "unexpected".  Treating it as such meant we failed
to restart a standby server after any child crash at all, not only a crash
of the WAL replay process as intended.  Adjust that.  Back-patch to 9.0
where hot standby was introduced.
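
The decision described above can be modeled in isolation. The following is only a minimal sketch, not postmaster code: the struct PostmasterFlags and the function startup_exited are hypothetical names, the two fields stand in for the postmaster's FatalError and RecoveryError flags, and a plain integer comparison stands in for EXIT_STATUS_0. The real reaper() also calls HandleChildCrash, which is omitted here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the postmaster's FatalError and RecoveryError. */
typedef struct
{
	bool		fatal_error;	/* a child crash is already being handled */
	bool		recovery_error; /* do not reinitialize once children are gone */
} PostmasterFlags;

/*
 * Model of the patched branch for a startup-process exit.
 * Returns true if the exit is handled as a crash (nonzero exit status).
 */
static bool
startup_exited(PostmasterFlags *pm, int exitstatus)
{
	assert(pm != NULL);

	if (exitstatus == 0)
		return false;			/* clean exit, not a crash */

	/*
	 * If a crash is already being handled, the startup process most likely
	 * died from the SIGQUIT we sent it, so leave recovery_error alone and
	 * allow a restart attempt.
	 */
	if (!pm->fatal_error)
		pm->recovery_error = true;

	return true;
}
```

Under this model, a nonzero exit with fatal_error already set leaves recovery_error false, which corresponds to the standby being restarted after a backend crash. The same exit with fatal_error clear sets recovery_error, which corresponds to the old behavior for a genuine WAL replay failure.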

src/backend/postmaster/postmaster.c

index 367aa0ca50649572ccb2795cf31d09994cf0abed..067edcfe290d19f2d9bb24d71cf21504d676e1a4 100644
@@ -2391,13 +2391,18 @@ reaper(SIGNAL_ARGS)
                        }
 
                        /*
-                        * Any unexpected exit (including FATAL exit) of the startup
-                        * process is treated as a crash, except that we don't want to
-                        * reinitialize.
+                        * After PM_STARTUP, any unexpected exit (including FATAL exit) of
+                        * the startup process is catastrophic, so kill other children,
+                        * and set RecoveryError so we don't try to reinitialize after
+                        * they're gone.  Exception: if FatalError is already set, that
+                        * implies we previously sent the startup process a SIGQUIT, so
+                        * that's probably the reason it died, and we do want to try to
+                        * restart in that case.
                         */
                        if (!EXIT_STATUS_0(exitstatus))
                        {
-                               RecoveryError = true;
+                               if (!FatalError)
+                                       RecoveryError = true;
                                HandleChildCrash(pid, exitstatus,
                                                                 _("startup process"));
                                continue;