From: Heikki Linnakangas
Date: Tue, 14 Oct 2014 06:55:26 +0000 (+0300)
Subject: Fix deadlock with LWLockAcquireWithVar and LWLockWaitForVar.
X-Git-Tag: REL9_4_RC1~80
X-Git-Url: https://granicus.if.org/sourcecode?a=commitdiff_plain;h=4dbc7606cfc8188646a2e302ef5e6a5ec3c962af;p=postgresql

Fix deadlock with LWLockAcquireWithVar and LWLockWaitForVar.

LWLockRelease should release all backends waiting with LWLockWaitForVar,
even when another backend has already been woken up to acquire the lock,
i.e. when releaseOK is false. LWLockWaitForVar can return as soon as the
protected value changes, even if the other backend will acquire the lock.
Fix that by resetting releaseOK to true in LWLockWaitForVar, whenever
adding itself to the wait queue.

This should fix the bug reported by MauMau, where the system occasionally
hangs when there is a lot of concurrent WAL activity and a checkpoint.
Backpatch to 9.4, where this code was added.
---

diff --git a/src/backend/storage/lmgr/lwlock.c b/src/backend/storage/lmgr/lwlock.c
index 5453549a79..1607bc9e53 100644
--- a/src/backend/storage/lmgr/lwlock.c
+++ b/src/backend/storage/lmgr/lwlock.c
@@ -976,6 +976,12 @@ LWLockWaitForVar(LWLock *l, uint64 *valptr, uint64 oldval, uint64 *newval)
 			lock->tail = proc;
 		lock->head = proc;
 
+		/*
+		 * Set releaseOK, to make sure we get woken up as soon as the lock is
+		 * released.
+		 */
+		lock->releaseOK = true;
+
 		/* Can release the mutex now */
 		SpinLockRelease(&lock->mutex);
 
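
The wake-up rule described in the commit message can be illustrated with a toy model. This is only a sketch under stated assumptions, not the code in lwlock.c: `ToyLock`, `toy_queue_var_waiter`, `toy_release` and `toy_scenario` are hypothetical names, the model is single-threaded, it assumes every queued waiter is an LWLockWaitForVar-style waiter, and it assumes a release wakes waiters only when releaseOK is true.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model, NOT the PostgreSQL LWLock struct: only the fields needed here. */
typedef struct ToyLock
{
	bool		releaseOK;	/* may a release wake up queued waiters? */
	int			nwaiters;	/* number of queued "wait for var" waiters */
} ToyLock;

/*
 * Hypothetical helper: queue a waiter the way LWLockWaitForVar does.
 * with_fix selects the behaviour after the patch (releaseOK is reset to true).
 */
static void
toy_queue_var_waiter(ToyLock *lock, bool with_fix)
{
	assert(lock != NULL);
	lock->nwaiters++;
	if (with_fix)
		lock->releaseOK = true;
}

/*
 * Hypothetical helper: model a lock release.  Returns the number of waiters
 * woken.  Assumption: nothing is woken while releaseOK is false.
 */
static int
toy_release(ToyLock *lock)
{
	int			woken;

	assert(lock != NULL);
	if (lock->nwaiters == 0 || !lock->releaseOK)
		return 0;
	woken = lock->nwaiters;
	lock->nwaiters = 0;
	return woken;
}

/*
 * Scenario from the commit message: an earlier release has already woken a
 * backend (releaseOK is false), then a "wait for var" waiter queues itself,
 * then the lock is released again.
 */
static int
toy_scenario(bool with_fix)
{
	ToyLock		lock = {.releaseOK = false, .nwaiters = 0};

	toy_queue_var_waiter(&lock, with_fix);
	return toy_release(&lock);
}
```

In this model, `toy_scenario(false)` leaves the waiter asleep after the release, which corresponds to the reported hang, while `toy_scenario(true)` wakes it.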