granicus.if.org Git - postgresql/commitdiff
Fix the buffer release order for parallel index scans.
author Amit Kapila <akapila@postgresql.org>
Fri, 27 Jul 2018 05:26:07 +0000 (10:56 +0530)
committer Amit Kapila <akapila@postgresql.org>
Fri, 27 Jul 2018 05:26:07 +0000 (10:56 +0530)
During parallel index scans, if the current page to be read is deleted, we
skip it and try to get the next page for the scan without releasing the buffer
lock on the current page.  To get the next page, a process sometimes needs to
wait for another process to complete its scan and advance it to the next page.
Now, it is quite possible that the master backend has errored out before
advancing the scan and issued a termination signal for all workers.  The
workers failed to notice the termination request during the wait because
interrupts are held due to the buffer lock on the previous page.  This led to
all workers being stuck.

The fix is to release the buffer lock on the current page before trying to get
the next page.  We are already doing the same in backward scans, but missed
it for forward scans.

Reported-by: Victor Yegorov
Bug: 15290
Diagnosed-by: Thomas Munro and Amit Kapila
Author: Amit Kapila
Reviewed-by: Thomas Munro
Tested-By: Thomas Munro and Victor Yegorov
Backpatch-through: 10 where parallel index scans were introduced
Discussion: https://postgr.es/m/153228422922.1395.1746424054206154747@wrigleys.postgresql.org

src/backend/access/nbtree/nbtsearch.c

index 0bcfa10b8647d66845a60d68094482551275aa8a..6831bc8c032e8cb06745eca3220f3f403180aebc 100644
@@ -1497,17 +1497,19 @@ _bt_readnextpage(IndexScanDesc scan, BlockNumber blkno, ScanDirection dir)
                        /* nope, keep going */
                        if (scan->parallel_scan != NULL)
                        {
+                               _bt_relbuf(rel, so->currPos.buf);
                                status = _bt_parallel_seize(scan, &blkno);
                                if (!status)
                                {
-                                       _bt_relbuf(rel, so->currPos.buf);
                                        BTScanPosInvalidate(so->currPos);
                                        return false;
                                }
                        }
                        else
+                       {
                                blkno = opaque->btpo_next;
-                       _bt_relbuf(rel, so->currPos.buf);
+                               _bt_relbuf(rel, so->currPos.buf);
+                       }
                }
        }
        else