granicus.if.org Git - postgresql/commitdiff
Fix the buffer release order for parallel index scans.
author    Amit Kapila <akapila@postgresql.org>
          Fri, 27 Jul 2018 05:35:06 +0000 (11:05 +0530)
committer Amit Kapila <akapila@postgresql.org>
          Fri, 27 Jul 2018 05:35:06 +0000 (11:05 +0530)
During parallel index scans, if the current page to be read is deleted, we
skip it and try to get the next page for the scan without releasing the
buffer lock on the current page.  To get the next page, a worker sometimes
needs to wait for another process to complete its scan and advance the scan
to the next page.  Now, it is quite possible that the master backend has
errored out before advancing the scan and issued a termination signal for
all workers.  The workers failed to notice the termination request during
the wait because interrupts are held off due to the buffer lock on the
previous page.  This led to all workers being stuck.

The fix is to release the buffer lock on the current page before trying to
get the next page.  We already do the same in backward scans, but missed
it for forward scans.
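
The ordering problem described above can be illustrated with a small
standalone model.  This is only a sketch, not PostgreSQL code: every name in
it (toy_lock_buffer, toy_wait_for_next_page, and so on) is hypothetical, the
"buffer lock" is reduced to a counter that holds off interrupts while it is
nonzero, and the wait is a bounded loop so that the stuck case can be
observed instead of hanging.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-ins for backend state; hypothetical names, not PostgreSQL's API. */
static int  holdoff_count = 0;          /* > 0 while a buffer lock is held */
static bool terminate_pending = false;  /* set when the leader asks us to stop */

static void toy_lock_buffer(void)
{
    holdoff_count++;                    /* taking the lock holds off interrupts */
}

static void toy_unlock_buffer(void)
{
    assert(holdoff_count > 0);
    holdoff_count--;                    /* releasing the lock resumes them */
}

/* A pending termination request is only serviced when nothing holds it off. */
static bool toy_check_for_interrupts(void)
{
    return terminate_pending && holdoff_count == 0;
}

/*
 * Wait for another participant to advance the scan.  In this model nobody
 * ever does, so the only way out is noticing the termination request.
 * Returns the iteration at which it was noticed, or -1 if the worker would
 * still be stuck after max_spins iterations.
 */
static int toy_wait_for_next_page(int max_spins)
{
    for (int i = 0; i < max_spins; i++)
    {
        if (toy_check_for_interrupts())
            return i;
    }
    return -1;
}

/* Old forward-scan order: wait first, release the buffer afterwards. */
static int scan_release_after_wait(void)
{
    int result;

    holdoff_count = 0;
    terminate_pending = false;

    toy_lock_buffer();
    terminate_pending = true;           /* leader errored out meanwhile */
    result = toy_wait_for_next_page(1000);
    toy_unlock_buffer();
    return result;
}

/* Fixed order: release the buffer first, then wait. */
static int scan_release_before_wait(void)
{
    holdoff_count = 0;
    terminate_pending = false;

    toy_lock_buffer();
    terminate_pending = true;           /* leader errored out meanwhile */
    toy_unlock_buffer();
    return toy_wait_for_next_page(1000);
}
```

In this model scan_release_after_wait() never sees the termination request
and reports -1, while scan_release_before_wait() notices it on the first
check.  The real wait is not a bounded loop, which is why the workers in the
bug report stayed stuck.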

Reported-by: Victor Yegorov
Bug: 15290
Diagnosed-by: Thomas Munro and Amit Kapila
Author: Amit Kapila
Reviewed-by: Thomas Munro
Tested-By: Thomas Munro and Victor Yegorov
Backpatch-through: 10 where parallel index scans were introduced
Discussion: https://postgr.es/m/153228422922.1395.1746424054206154747@wrigleys.postgresql.org

src/backend/access/nbtree/nbtsearch.c

index 0151f2c91d1897cc677d5867b6f8cef363077982..798ebbfceaecae4d45a98e0ceb1d165e39ba2577 100644 (file)
@@ -1495,17 +1495,19 @@ _bt_readnextpage(IndexScanDesc scan, BlockNumber blkno, ScanDirection dir)
                        /* nope, keep going */
                        if (scan->parallel_scan != NULL)
                        {
+                               _bt_relbuf(rel, so->currPos.buf);
                                status = _bt_parallel_seize(scan, &blkno);
                                if (!status)
                                {
-                                       _bt_relbuf(rel, so->currPos.buf);
                                        BTScanPosInvalidate(so->currPos);
                                        return false;
                                }
                        }
                        else
+                       {
                                blkno = opaque->btpo_next;
-                       _bt_relbuf(rel, so->currPos.buf);
+                               _bt_relbuf(rel, so->currPos.buf);
+                       }
                }
        }
        else