Fix pg_receivexlog --slot so that it doesn't prevent the server shutdown.
author Fujii Masao <fujii@postgresql.org>
Wed, 19 Nov 2014 05:11:48 +0000 (14:11 +0900)
committer Fujii Masao <fujii@postgresql.org>
Wed, 19 Nov 2014 05:11:48 +0000 (14:11 +0900)
When pg_receivexlog --slot is connected to the server and the server
shuts down, walsender keeps waiting for the last WAL record to be
replicated and flushed by pg_receivexlog. Previously, however,
pg_receivexlog issued a sync only when it switched WAL files. So the
last WAL record might never be flushed, leaving walsender waiting
indefinitely. This caused the server shutdown to get stuck.

pg_recvlogical handles this problem by calling fsync() when it receives
a request for an immediate reply from the server. That is, at shutdown,
walsender sends the request, pg_recvlogical receives it, flushes the last
WAL record, and sends the flush location back to the server. Since
walsender can then see that the last WAL record has been flushed, it can
exit cleanly.
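
The reply-time flush described above can be sketched in C roughly as
follows. This is a minimal illustration, not the PostgreSQL code: the
helper name flush_before_reply and its parameters are hypothetical, and
positions are modeled as plain 64-bit integers rather than XLogRecPtr.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/*
 * Hypothetical helper, not part of PostgreSQL: make everything written so
 * far durable before a status reply is sent to the server.
 *
 * fd          descriptor of the current WAL file, or -1 if none is open
 * last_flush  in/out: last position known to be flushed to disk
 * write_pos   position up to which data has been written
 *
 * Returns 0 on success (including "nothing to do"), -1 if fsync() failed.
 */
static int
flush_before_reply(int fd, uint64_t *last_flush, uint64_t write_pos)
{
    assert(last_flush != NULL);

    /* Nothing to flush: no file is open, or we already flushed this far. */
    if (fd == -1 || *last_flush >= write_pos)
        return 0;

    if (fsync(fd) != 0)
    {
        fprintf(stderr, "could not fsync WAL file: %s\n", strerror(errno));
        return -1;
    }

    /* Only now is it safe to report write_pos as the flush location. */
    *last_flush = write_pos;
    return 0;
}
```

A caller would invoke such a helper when the server requests an immediate
reply and then report *last_flush in the feedback message, so the sender
can confirm that the final record is on disk.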

This commit applies the same logic to pg_receivexlog.

Back-patch to 9.4, where pg_receivexlog gained the ability to use
a replication slot.

Original patch by Michael Paquier, rewritten by me.
Bug report by Furuya Osamu.

src/bin/pg_basebackup/receivelog.c

index 8f360ec3e4607003a85863d974b75832eb5d9bb6..48a34cb5546876236bf026d429d792b3b6222184 100644
@@ -918,6 +918,25 @@ HandleCopyStream(PGconn *conn, XLogRecPtr startpos, uint32 timeline,
                        /* If the server requested an immediate reply, send one. */
                        if (replyRequested && still_sending)
                        {
+                               if (reportFlushPosition && lastFlushPosition < blockpos &&
+                                       walfile != -1)
+                               {
+                                       /*
+                                        * If a valid flush location needs to be reported,
+                                        * flush the current WAL file so that the latest flush
+                                        * location is sent back to the server. This is necessary to
+                                        * see whether the last WAL data has been successfully
+                                        * replicated or not, at the normal shutdown of the server.
+                                        */
+                                       if (fsync(walfile) != 0)
+                                       {
+                                               fprintf(stderr, _("%s: could not fsync file \"%s\": %s\n"),
+                                                               progname, current_walfile_name, strerror(errno));
+                                               goto error;
+                                       }
+                                       lastFlushPosition = blockpos;
+                               }
+
                                now = feGetCurrentTimestamp();
                                if (!sendFeedback(conn, blockpos, now, false))
                                        goto error;