From 09a5be587b6546aa10a71b112683c115d0b11586 Mon Sep 17 00:00:00 2001 From: Amit Kapila Date: Fri, 27 Jul 2018 10:56:07 +0530 Subject: [PATCH] Fix the buffer release order for parallel index scans. During parallel index scans, if the current page to be read is deleted, we skip it and try to get the next page for a scan without releasing the buffer lock on the current page. To get the next page, sometimes it needs to wait for another process to complete its scan and advance it to the next page. Now, it is quite possible that the master backend has errored out before advancing the scan and issued a termination signal for all workers. The workers failed to notice the termination request during wait because the interrupts are held due to buffer lock on the previous page. This lead to all workers being stuck. The fix is to release the buffer lock on current page before trying to get the next page. We are already doing same in backward scans, but missed it for forward scans. Reported-by: Victor Yegorov Bug: 15290 Diagnosed-by: Thomas Munro and Amit Kapila Author: Amit Kapila Reviewed-by: Thomas Munro Tested-By: Thomas Munro and Victor Yegorov Backpatch-through: 10 where parallel index scans were introduced Discussion:https://postgr.es/m/153228422922.1395.1746424054206154747@wrigleys.postgresql.org --- src/backend/access/nbtree/nbtsearch.c | 6 ++++-- 1 file changed, 4 insertions(+), 2 deletions(-) diff --git a/src/backend/access/nbtree/nbtsearch.c b/src/backend/access/nbtree/nbtsearch.c index 0bcfa10b86..6831bc8c03 100644 --- a/src/backend/access/nbtree/nbtsearch.c +++ b/src/backend/access/nbtree/nbtsearch.c @@ -1497,17 +1497,19 @@ _bt_readnextpage(IndexScanDesc scan, BlockNumber blkno, ScanDirection dir) /* nope, keep going */ if (scan->parallel_scan != NULL) { + _bt_relbuf(rel, so->currPos.buf); status = _bt_parallel_seize(scan, &blkno); if (!status) { - _bt_relbuf(rel, so->currPos.buf); BTScanPosInvalidate(so->currPos); return false; } } else + { blkno = opaque->btpo_next; - _bt_relbuf(rel, so->currPos.buf); + _bt_relbuf(rel, so->currPos.buf); + } } } else -- 2.40.0