granicus.if.org Git - postgresql/commitdiff
Create a function to reliably identify which sessions block which others.
author     Tom Lane <tgl@sss.pgh.pa.us>  Mon, 22 Feb 2016 19:31:43 +0000 (14:31 -0500)
committer  Tom Lane <tgl@sss.pgh.pa.us>  Mon, 22 Feb 2016 19:31:43 +0000 (14:31 -0500)

This patch introduces "pg_blocking_pids(int) returns int[]", which returns
the PIDs of any sessions that are blocking the session with the given PID.
Historically people have obtained such information using a self-join on
the pg_locks view, but it's unreasonably tedious to do it that way with any
modicum of correctness, and the addition of parallel queries has pretty
much broken that approach altogether.  (Given some more columns in the view
than there are today, you could imagine handling parallel-query cases with
a 4-way join; but ugh.)

The new function has the following behaviors that are painful or impossible
to get right via pg_locks:

1. Correctly understands which lock modes block which other ones.

2. In soft-block situations (two processes both waiting for conflicting lock
modes), only the one that's in front in the wait queue is reported to
block the other.

3. In parallel-query cases, reports all sessions blocking any member of
the given PID's lock group, and reports a session by naming its leader
process's PID, which will be the pg_backend_pid() value visible to
clients.
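
The three behaviors above can be illustrated with a small standalone C
sketch.  It is not the code from this patch: the two-mode conflict table, the
Entry struct, the queue_pos field, and blockers_of() are hypothetical
simplifications invented for illustration (the real function works from
PROCLOCK data and a list of preceding waiters rather than queue positions).

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, simplified lock modes; not PostgreSQL's real LOCKMODEs */
enum { NO_LOCK = 0, SHARE = 1, EXCLUSIVE = 2 };
#define MODE_BIT(m) (1 << (m))

/* conflict_tab[m] = bitmask of modes that conflict with a request for m */
static const int conflict_tab[] = {
    0,                                      /* NO_LOCK */
    MODE_BIT(EXCLUSIVE),                    /* SHARE */
    MODE_BIT(SHARE) | MODE_BIT(EXCLUSIVE)   /* EXCLUSIVE */
};

/* One process's state for a single lockable object (illustrative only) */
typedef struct
{
    int pid;        /* this process's PID */
    int leader_pid; /* lock group leader's PID; = pid if no group */
    int hold_mask;  /* bitmask of modes currently held */
    int wait_mode;  /* mode being awaited, or NO_LOCK */
    int queue_pos;  /* position in the wait queue; -1 if not waiting */
} Entry;

/*
 * Store in out[] the leader PIDs of all entries blocking e[blocked_idx],
 * and return how many were stored.  Duplicates are not removed.
 */
static int
blockers_of(const Entry *e, int n, int blocked_idx, int *out)
{
    const Entry *b = &e[blocked_idx];
    int conflict;
    int count = 0;

    assert(b->wait_mode != NO_LOCK);    /* caller must pass a waiter */
    conflict = conflict_tab[b->wait_mode];

    for (int i = 0; i < n; i++)
    {
        bool hard, soft;

        /* a process never blocks itself, nor its own lock group */
        if (i == blocked_idx || e[i].leader_pid == b->leader_pid)
            continue;

        /* hard block: entry already holds a conflicting mode */
        hard = (conflict & e[i].hold_mask) != 0;

        /* soft block: entry awaits a conflicting mode and is ahead in queue */
        soft = e[i].wait_mode != NO_LOCK &&
            (conflict & MODE_BIT(e[i].wait_mode)) != 0 &&
            e[i].queue_pos < b->queue_pos;

        if (hard || soft)
            out[count++] = e[i].leader_pid; /* report client-visible PID */
    }
    return count;
}
```

Suppose PIDs 100 and 101 (a worker whose leader is 100) both hold SHARE, PID
200 is first in the queue awaiting EXCLUSIVE, and PID 300 is second awaiting
SHARE.  Under these assumptions the sketch reports 100 twice for PID 200 (two
hard blocks, both named by the leader's PID) and only 200 for PID 300 (a soft
block; the SHARE holders do not conflict with its request).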

The motivation for doing this right now is mostly to fix the isolation
tests.  Commit 38f8bdcac4982215beb9f65a19debecaf22fd470 lobotomized
isolationtester's is-it-waiting query by removing its ability to recognize
nonconflicting lock modes, as a crude workaround for the inability to
handle soft-block situations properly.  But even without the lock mode
tests, the old query was excessively slow, particularly in
CLOBBER_CACHE_ALWAYS builds; some of our buildfarm animals fail the new
deadlock-hard test because the deadlock timeout elapses before they can
probe the waiting status of all eight sessions.  Replacing the pg_locks
self-join with use of pg_blocking_pids() is not only much more correct, but
a lot faster: I measure it at about 9X faster in a typical dev build with
Asserts, and 3X faster in CLOBBER_CACHE_ALWAYS builds.  That should provide
enough headroom for the slower CLOBBER_CACHE_ALWAYS animals to pass the
test, without having to lengthen deadlock_timeout yet more and thus slow
down the test for everyone else.

doc/src/sgml/catalogs.sgml
doc/src/sgml/func.sgml
src/backend/storage/ipc/procarray.c
src/backend/storage/lmgr/lock.c
src/backend/utils/adt/lockfuncs.c
src/include/catalog/catversion.h
src/include/catalog/pg_proc.h
src/include/storage/lock.h
src/include/storage/procarray.h
src/include/utils/builtins.h
src/test/isolation/isolationtester.c

index d77e99988ff161fd7daabc9d7c539aaea5513989..951f59b76c1f196424d437d2f57b8b2a6ecbde2b 100644
 
      <row>
       <entry><link linkend="view-pg-locks"><structname>pg_locks</structname></link></entry>
-      <entry>currently held locks</entry>
+      <entry>locks currently held or awaited</entry>
      </row>
 
      <row>
 
   <para>
    The view <structname>pg_locks</structname> provides access to
-   information about the locks held by open transactions within the
+   information about the locks held by active processes within the
    database server.  See <xref linkend="mvcc"> for more discussion
    of locking.
   </para>
 
   <para>
    <structname>pg_locks</structname> contains one row per active lockable
-   object, requested lock mode, and relevant transaction.  Thus, the same
+   object, requested lock mode, and relevant process.  Thus, the same
    lockable object might
-   appear many times, if multiple transactions are holding or waiting
+   appear many times, if multiple processes are holding or waiting
    for locks on it.  However, an object that currently has no locks on it
    will not appear at all.
   </para>
 
   <para>
    <structfield>granted</structfield> is true in a row representing a lock
-   held by the indicated transaction.  False indicates that this transaction is
-   currently waiting to acquire this lock, which implies that some other
-   transaction is holding a conflicting lock mode on the same lockable object.
-   The waiting transaction will sleep until the other lock is released (or a
-   deadlock situation is detected). A single transaction can be waiting to
-   acquire at most one lock at a time.
+   held by the indicated process.  False indicates that this process is
+   currently waiting to acquire this lock, which implies that at least one
+   other process is holding or waiting for a conflicting lock mode on the same
+   lockable object.  The waiting process will sleep until the other lock is
+   released (or a deadlock situation is detected).  A single process can be
+   waiting to acquire at most one lock at a time.
   </para>
 
   <para>
-   Every transaction holds an exclusive lock on its virtual transaction ID for
-   its entire duration.  If a permanent ID is assigned to the transaction
-   (which normally happens only if the transaction changes the state of the
-   database), it also holds an exclusive lock on its permanent transaction ID
-   until it ends.  When one transaction finds it necessary to wait specifically
-   for another transaction, it does so by attempting to acquire share lock on
-   the other transaction ID (either virtual or permanent ID depending on the
-   situation). That will succeed only when the other transaction
-   terminates and releases its locks.
+   Throughout running a transaction, a server process holds an exclusive lock
+   on the transaction's virtual transaction ID.  If a permanent ID is assigned
+   to the transaction (which normally happens only if the transaction changes
+   the state of the database), it also holds an exclusive lock on the
+   transaction's permanent transaction ID until it ends.  When a process finds
+   it necessary to wait specifically for another transaction to end, it does
+   so by attempting to acquire share lock on the other transaction's ID
+   (either virtual or permanent ID depending on the situation). That will
+   succeed only when the other transaction terminates and releases its locks.
   </para>
 
   <para>
    Although tuples are a lockable type of object,
    information about row-level locks is stored on disk, not in memory,
    and therefore row-level locks normally do not appear in this view.
-   If a transaction is waiting for a
+   If a process is waiting for a
    row-level lock, it will usually appear in the view as waiting for the
    permanent transaction ID of the current holder of that row lock.
   </para>
    <structfield>pid</structfield> column of the <link
    linkend="pg-stat-activity-view"><structname>pg_stat_activity</structname></link>
    view to get more
-   information on the session holding or waiting to hold each lock,
+   information on the session holding or awaiting each lock,
    for example
 <programlisting>
 SELECT * FROM pg_locks pl LEFT JOIN pg_stat_activity psa
@@ -8280,6 +8280,20 @@ SELECT * FROM pg_locks pl LEFT JOIN pg_prepared_xacts ppx
 </programlisting>
   </para>
 
+  <para>
+   While it is possible to obtain information about which processes block
+   which other processes by joining <structname>pg_locks</structname> against
+   itself, this is very difficult to get right in detail.  Such a query would
+   have to encode knowledge about which lock modes conflict with which
+   others.  Worse, the <structname>pg_locks</structname> view does not expose
+   information about which processes are ahead of which others in lock wait
+   queues, nor information about which processes are parallel workers running
+   on behalf of which other client sessions.  It is better to use
+   the <function>pg_blocking_pids()</> function
+   (see <xref linkend="functions-info-session-table">) to identify which
+   process(es) a waiting process is blocked behind.
+  </para>
+
   <para>
    The <structname>pg_locks</structname> view displays data from both the
    regular lock manager and the predicate lock manager, which are
index b001ce548d88fb1a63eb8e9a26d489f8cb56cfeb..c0b94bc072867349a18cd106aaecabf69915ad0b 100644
@@ -14996,6 +14996,12 @@ SELECT * FROM pg_ls_dir('.') WITH ORDINALITY AS t(ls,n);
        </entry>
       </row>
 
+      <row>
+       <entry><literal><function>pg_blocking_pids(<type>int</type>)</function></literal></entry>
+       <entry><type>int[]</type></entry>
+       <entry>Process ID(s) that are blocking specified server process ID</entry>
+      </row>
+
       <row>
        <entry><literal><function>pg_conf_load_time()</function></literal></entry>
        <entry><type>timestamp with time zone</type></entry>
@@ -15183,6 +15189,29 @@ SET search_path TO <replaceable>schema</> <optional>, <replaceable>schema</>, ..
      Unix-domain socket.
    </para>
 
+   <indexterm>
+    <primary>pg_blocking_pids</primary>
+   </indexterm>
+
+   <para>
+    <function>pg_blocking_pids</function> returns an array of the process IDs
+    of the sessions that are blocking the server process with the specified
+    process ID, or an empty array if there is no such server process or it is
+    not blocked.  One server process blocks another if it either holds a lock
+    that conflicts with the blocked process's lock request (hard block), or is
+    waiting for a lock that would conflict with the blocked process's lock
+    request and is ahead of it in the wait queue (soft block).  When using
+    parallel queries the result always lists client-visible process IDs (that
+    is, <function>pg_backend_pid</> results) even if the actual lock is held
+    or awaited by a child worker process.  As a result of that, there may be
+    duplicated PIDs in the result.  Also note that when a prepared transaction
+    holds a conflicting lock, it will be represented by a zero process ID in
+    the result of this function.
+    Frequent calls to this function could have some impact on database
+    performance, because it needs exclusive access to the lock manager's
+    shared state for a short time.
+   </para>
+
    <indexterm>
     <primary>pg_conf_load_time</primary>
    </indexterm>
index 91218d0e56b11f7f11b7ad5409773210f355d9c1..97e8962ae81026a5742c27b4306ec3e058206d22 100644
@@ -2312,6 +2312,29 @@ HaveVirtualXIDsDelayingChkpt(VirtualTransactionId *vxids, int nvxids)
  */
 PGPROC *
 BackendPidGetProc(int pid)
+{
+       PGPROC     *result;
+
+       if (pid == 0)                           /* never match dummy PGPROCs */
+               return NULL;
+
+       LWLockAcquire(ProcArrayLock, LW_SHARED);
+
+       result = BackendPidGetProcWithLock(pid);
+
+       LWLockRelease(ProcArrayLock);
+
+       return result;
+}
+
+/*
+ * BackendPidGetProcWithLock -- get a backend's PGPROC given its PID
+ *
+ * Same as above, except caller must be holding ProcArrayLock.  The found
+ * entry, if any, can be assumed to be valid as long as the lock remains held.
+ */
+PGPROC *
+BackendPidGetProcWithLock(int pid)
 {
        PGPROC     *result = NULL;
        ProcArrayStruct *arrayP = procArray;
@@ -2320,8 +2343,6 @@ BackendPidGetProc(int pid)
        if (pid == 0)                           /* never match dummy PGPROCs */
                return NULL;
 
-       LWLockAcquire(ProcArrayLock, LW_SHARED);
-
        for (index = 0; index < arrayP->numProcs; index++)
        {
                PGPROC     *proc = &allProcs[arrayP->pgprocnos[index]];
@@ -2333,8 +2354,6 @@ BackendPidGetProc(int pid)
                }
        }
 
-       LWLockRelease(ProcArrayLock);
-
        return result;
 }
 
index fef59a280a6c40ae47090eab1aaafb17ad172e2e..a458c68b9e9dec78c962d6291e381d5a13b20b2d 100644
@@ -21,7 +21,7 @@
  *
  *     Interface:
  *
- *     InitLocks(), GetLocksMethodTable(),
+ *     InitLocks(), GetLocksMethodTable(), GetLockTagsMethodTable(),
  *     LockAcquire(), LockRelease(), LockReleaseAll(),
  *     LockCheckConflicts(), GrantLock()
  *
@@ -41,6 +41,7 @@
 #include "pg_trace.h"
 #include "pgstat.h"
 #include "storage/proc.h"
+#include "storage/procarray.h"
 #include "storage/sinvaladt.h"
 #include "storage/spin.h"
 #include "storage/standby.h"
@@ -356,6 +357,8 @@ static void CleanUpLock(LOCK *lock, PROCLOCK *proclock,
 static void LockRefindAndRelease(LockMethod lockMethodTable, PGPROC *proc,
                                         LOCKTAG *locktag, LOCKMODE lockmode,
                                         bool decrement_strong_lock_count);
+static void GetSingleProcBlockerStatusData(PGPROC *blocked_proc,
+                                                          BlockedProcsData *data);
 
 
 /*
@@ -462,6 +465,18 @@ GetLocksMethodTable(const LOCK *lock)
        return LockMethods[lockmethodid];
 }
 
+/*
+ * Fetch the lock method table associated with a given locktag
+ */
+LockMethod
+GetLockTagsMethodTable(const LOCKTAG *locktag)
+{
+       LOCKMETHODID lockmethodid = (LOCKMETHODID) locktag->locktag_lockmethodid;
+
+       Assert(0 < lockmethodid && lockmethodid < lengthof(LockMethods));
+       return LockMethods[lockmethodid];
+}
+
 
 /*
  * Compute the hash code associated with a LOCKTAG.
@@ -3406,7 +3421,10 @@ GetLockStatusData(void)
         * impractical (in particular, note MAX_SIMUL_LWLOCKS).  It shouldn't
         * matter too much, because none of these locks can be involved in lock
         * conflicts anyway - anything that might must be present in the main lock
-        * table.
+        * table.  (For the same reason, we don't sweat about making leaderPid
+        * completely valid.  We cannot safely dereference another backend's
+        * lockGroupLeader field without holding all lock partition locks, and
+        * it's not worth that.)
         */
        for (i = 0; i < ProcGlobal->allProcCount; ++i)
        {
@@ -3439,6 +3457,7 @@ GetLockStatusData(void)
                        instance->backend = proc->backendId;
                        instance->lxid = proc->lxid;
                        instance->pid = proc->pid;
+                       instance->leaderPid = proc->pid;
                        instance->fastpath = true;
 
                        el++;
@@ -3466,6 +3485,7 @@ GetLockStatusData(void)
                        instance->backend = proc->backendId;
                        instance->lxid = proc->lxid;
                        instance->pid = proc->pid;
+                       instance->leaderPid = proc->pid;
                        instance->fastpath = true;
 
                        el++;
@@ -3517,6 +3537,7 @@ GetLockStatusData(void)
                instance->backend = proc->backendId;
                instance->lxid = proc->lxid;
                instance->pid = proc->pid;
+               instance->leaderPid = proclock->groupLeader->pid;
                instance->fastpath = false;
 
                el++;
@@ -3537,6 +3558,197 @@ GetLockStatusData(void)
        return data;
 }
 
+/*
+ * GetBlockerStatusData - Return a summary of the lock manager's state
+ * concerning locks that are blocking the specified PID or any member of
+ * the PID's lock group, for use in a user-level reporting function.
+ *
+ * For each PID within the lock group that is awaiting some heavyweight lock,
+ * the return data includes an array of LockInstanceData objects, which are
+ * the same data structure used by GetLockStatusData; but unlike that function,
+ * this one reports only the PROCLOCKs associated with the lock that that PID
+ * is blocked on.  (Hence, all the locktags should be the same for any one
+ * blocked PID.)  In addition, we return an array of the PIDs of those backends
+ * that are ahead of the blocked PID in the lock's wait queue.  These can be
+ * compared with the PIDs in the LockInstanceData objects to determine which
+ * waiters are ahead of or behind the blocked PID in the queue.
+ *
+ * If blocked_pid isn't a valid backend PID or nothing in its lock group is
+ * waiting on any heavyweight lock, return empty arrays.
+ *
+ * The design goal is to hold the LWLocks for as short a time as possible;
+ * thus, this function simply makes a copy of the necessary data and releases
+ * the locks, allowing the caller to contemplate and format the data for as
+ * long as it pleases.
+ */
+BlockedProcsData *
+GetBlockerStatusData(int blocked_pid)
+{
+       BlockedProcsData *data;
+       PGPROC     *proc;
+       int                     i;
+
+       data = (BlockedProcsData *) palloc(sizeof(BlockedProcsData));
+
+       /*
+        * Guess how much space we'll need, and preallocate.  Most of the time
+        * this will avoid needing to do repalloc while holding the LWLocks.  (We
+        * assume, but check with an Assert, that MaxBackends is enough entries
+        * for the procs[] array; the other two could need enlargement, though.)
+        */
+       data->nprocs = data->nlocks = data->npids = 0;
+       data->maxprocs = data->maxlocks = data->maxpids = MaxBackends;
+       data->procs = (BlockedProcData *) palloc(sizeof(BlockedProcData) * data->maxprocs);
+       data->locks = (LockInstanceData *) palloc(sizeof(LockInstanceData) * data->maxlocks);
+       data->waiter_pids = (int *) palloc(sizeof(int) * data->maxpids);
+
+       /*
+        * In order to search the ProcArray for blocked_pid and assume that that
+        * entry won't immediately disappear under us, we must hold ProcArrayLock.
+        * In addition, to examine the lock grouping fields of any other backend,
+        * we must hold all the hash partition locks.  (Only one of those locks is
+        * actually relevant for any one lock group, but we can't know which one
+        * ahead of time.)      It's fairly annoying to hold all those locks
+        * throughout this, but it's no worse than GetLockStatusData(), and it
+        * does have the advantage that we're guaranteed to return a
+        * self-consistent instantaneous state.
+        */
+       LWLockAcquire(ProcArrayLock, LW_SHARED);
+
+       proc = BackendPidGetProcWithLock(blocked_pid);
+
+       /* Nothing to do if it's gone */
+       if (proc != NULL)
+       {
+               /*
+                * Acquire lock on the entire shared lock data structure.  See notes
+                * in GetLockStatusData().
+                */
+               for (i = 0; i < NUM_LOCK_PARTITIONS; i++)
+                       LWLockAcquire(LockHashPartitionLockByIndex(i), LW_SHARED);
+
+               if (proc->lockGroupLeader == NULL)
+               {
+                       /* Easy case, proc is not a lock group member */
+                       GetSingleProcBlockerStatusData(proc, data);
+               }
+               else
+               {
+                       /* Examine all procs in proc's lock group */
+                       dlist_iter      iter;
+
+                       dlist_foreach(iter, &proc->lockGroupLeader->lockGroupMembers)
+                       {
+                               PGPROC     *memberProc;
+
+                               memberProc = dlist_container(PGPROC, lockGroupLink, iter.cur);
+                               GetSingleProcBlockerStatusData(memberProc, data);
+                       }
+               }
+
+               /*
+                * And release locks.  See notes in GetLockStatusData().
+                */
+               for (i = NUM_LOCK_PARTITIONS; --i >= 0;)
+                       LWLockRelease(LockHashPartitionLockByIndex(i));
+
+               Assert(data->nprocs <= data->maxprocs);
+       }
+
+       LWLockRelease(ProcArrayLock);
+
+       return data;
+}
+
+/* Accumulate data about one possibly-blocked proc for GetBlockerStatusData */
+static void
+GetSingleProcBlockerStatusData(PGPROC *blocked_proc, BlockedProcsData *data)
+{
+       LOCK       *theLock = blocked_proc->waitLock;
+       BlockedProcData *bproc;
+       SHM_QUEUE  *procLocks;
+       PROCLOCK   *proclock;
+       PROC_QUEUE *waitQueue;
+       PGPROC     *proc;
+       int                     queue_size;
+       int                     i;
+
+       /* Nothing to do if this proc is not blocked */
+       if (theLock == NULL)
+               return;
+
+       /* Set up a procs[] element */
+       bproc = &data->procs[data->nprocs++];
+       bproc->pid = blocked_proc->pid;
+       bproc->first_lock = data->nlocks;
+       bproc->first_waiter = data->npids;
+
+       /*
+        * We may ignore the proc's fast-path arrays, since nothing in those could
+        * be related to a contended lock.
+        */
+
+       /* Collect all PROCLOCKs associated with theLock */
+       procLocks = &(theLock->procLocks);
+       proclock = (PROCLOCK *) SHMQueueNext(procLocks, procLocks,
+                                                                                offsetof(PROCLOCK, lockLink));
+       while (proclock)
+       {
+               PGPROC     *proc = proclock->tag.myProc;
+               LOCK       *lock = proclock->tag.myLock;
+               LockInstanceData *instance;
+
+               if (data->nlocks >= data->maxlocks)
+               {
+                       data->maxlocks += MaxBackends;
+                       data->locks = (LockInstanceData *)
+                               repalloc(data->locks, sizeof(LockInstanceData) * data->maxlocks);
+               }
+
+               instance = &data->locks[data->nlocks];
+               memcpy(&instance->locktag, &lock->tag, sizeof(LOCKTAG));
+               instance->holdMask = proclock->holdMask;
+               if (proc->waitLock == lock)
+                       instance->waitLockMode = proc->waitLockMode;
+               else
+                       instance->waitLockMode = NoLock;
+               instance->backend = proc->backendId;
+               instance->lxid = proc->lxid;
+               instance->pid = proc->pid;
+               instance->leaderPid = proclock->groupLeader->pid;
+               instance->fastpath = false;
+               data->nlocks++;
+
+               proclock = (PROCLOCK *) SHMQueueNext(procLocks, &proclock->lockLink,
+                                                                                        offsetof(PROCLOCK, lockLink));
+       }
+
+       /* Enlarge waiter_pids[] if it's too small to hold all wait queue PIDs */
+       waitQueue = &(theLock->waitProcs);
+       queue_size = waitQueue->size;
+
+       if (queue_size > data->maxpids - data->npids)
+       {
+               data->maxpids = Max(data->maxpids + MaxBackends,
+                                                       data->npids + queue_size);
+               data->waiter_pids = (int *) repalloc(data->waiter_pids,
+                                                                                        sizeof(int) * data->maxpids);
+       }
+
+       /* Collect PIDs from the lock's wait queue, stopping at blocked_proc */
+       proc = (PGPROC *) waitQueue->links.next;
+       for (i = 0; i < queue_size; i++)
+       {
+               if (proc == blocked_proc)
+                       break;
+               data->waiter_pids[data->npids++] = proc->pid;
+               proc = (PGPROC *) proc->links.next;
+       }
+
+       bproc->num_locks = data->nlocks - bproc->first_lock;
+       bproc->num_waiters = data->npids - bproc->first_waiter;
+}
+
 /*
  * Returns a list of currently held AccessExclusiveLocks, for use by
  * LogStandbySnapshot().  The result is a palloc'd array,
index 73c78e9b2637ae3dad6cda56868540a2c83878ab..6bcab811f5e92f6c39ab439d3b3b25f7b0d4114a 100644
@@ -18,6 +18,7 @@
 #include "funcapi.h"
 #include "miscadmin.h"
 #include "storage/predicate_internals.h"
+#include "utils/array.h"
 #include "utils/builtins.h"
 
 
@@ -99,7 +100,7 @@ pg_lock_status(PG_FUNCTION_ARGS)
                oldcontext = MemoryContextSwitchTo(funcctx->multi_call_memory_ctx);
 
                /* build tupdesc for result tuples */
-               /* this had better match pg_locks view in system_views.sql */
+               /* this had better match function's declaration in pg_proc.h */
                tupdesc = CreateTemplateTupleDesc(NUM_LOCK_STATUS_COLUMNS, false);
                TupleDescInitEntry(tupdesc, (AttrNumber) 1, "locktype",
                                                   TEXTOID, -1, 0);
@@ -394,6 +395,128 @@ pg_lock_status(PG_FUNCTION_ARGS)
 }
 
 
+/*
+ * pg_blocking_pids - produce an array of the PIDs blocking given PID
+ *
+ * The reported PIDs are those that hold a lock conflicting with blocked_pid's
+ * current request (hard block), or are requesting such a lock and are ahead
+ * of blocked_pid in the lock's wait queue (soft block).
+ *
+ * In parallel-query cases, we report all PIDs blocking any member of the
+ * given PID's lock group, and the reported PIDs are those of the blocking
+ * PIDs' lock group leaders.  This allows callers to compare the result to
+ * lists of clients' pg_backend_pid() results even during a parallel query.
+ *
+ * Parallel query makes it possible for there to be duplicate PIDs in the
+ * result (either because multiple waiters are blocked by same PID, or
+ * because multiple blockers have same group leader PID).  We do not bother
+ * to eliminate such duplicates from the result.
+ *
+ * We need not consider predicate locks here, since those don't block anything.
+ */
+Datum
+pg_blocking_pids(PG_FUNCTION_ARGS)
+{
+       int                     blocked_pid = PG_GETARG_INT32(0);
+       Datum      *arrayelems;
+       int                     narrayelems;
+       BlockedProcsData *lockData; /* state data from lmgr */
+       int                     i,
+                               j;
+
+       /* Collect a snapshot of lock manager state */
+       lockData = GetBlockerStatusData(blocked_pid);
+
+       /* We can't need more output entries than there are reported PROCLOCKs */
+       arrayelems = (Datum *) palloc(lockData->nlocks * sizeof(Datum));
+       narrayelems = 0;
+
+       /* For each blocked proc in the lock group ... */
+       for (i = 0; i < lockData->nprocs; i++)
+       {
+               BlockedProcData *bproc = &lockData->procs[i];
+               LockInstanceData *instances = &lockData->locks[bproc->first_lock];
+               int                *preceding_waiters = &lockData->waiter_pids[bproc->first_waiter];
+               LockInstanceData *blocked_instance;
+               LockMethod      lockMethodTable;
+               int                     conflictMask;
+
+               /*
+                * Locate the blocked proc's own entry in the LockInstanceData array.
+                * There should be exactly one matching entry.
+                */
+               blocked_instance = NULL;
+               for (j = 0; j < bproc->num_locks; j++)
+               {
+                       LockInstanceData *instance = &(instances[j]);
+
+                       if (instance->pid == bproc->pid)
+                       {
+                               Assert(blocked_instance == NULL);
+                               blocked_instance = instance;
+                       }
+               }
+               Assert(blocked_instance != NULL);
+
+               lockMethodTable = GetLockTagsMethodTable(&(blocked_instance->locktag));
+               conflictMask = lockMethodTable->conflictTab[blocked_instance->waitLockMode];
+
+               /* Now scan the PROCLOCK data for conflicting procs */
+               for (j = 0; j < bproc->num_locks; j++)
+               {
+                       LockInstanceData *instance = &(instances[j]);
+
+                       /* A proc never blocks itself, so ignore that entry */
+                       if (instance == blocked_instance)
+                               continue;
+                       /* Members of same lock group never block each other, either */
+                       if (instance->leaderPid == blocked_instance->leaderPid)
+                               continue;
+
+                       if (conflictMask & instance->holdMask)
+                       {
+                               /* hard block: blocked by lock already held by this entry */
+                       }
+                       else if (instance->waitLockMode != NoLock &&
+                                        (conflictMask & LOCKBIT_ON(instance->waitLockMode)))
+                       {
+                               /* conflict in lock requests; who's in front in wait queue? */
+                               bool            ahead = false;
+                               int                     k;
+
+                               for (k = 0; k < bproc->num_waiters; k++)
+                               {
+                                       if (preceding_waiters[k] == instance->pid)
+                                       {
+                                               /* soft block: this entry is ahead of blocked proc */
+                                               ahead = true;
+                                               break;
+                                       }
+                               }
+                               if (!ahead)
+                                       continue;       /* not blocked by this entry */
+                       }
+                       else
+                       {
+                               /* not blocked by this entry */
+                               continue;
+                       }
+
+                       /* blocked by this entry, so emit a record */
+                       arrayelems[narrayelems++] = Int32GetDatum(instance->leaderPid);
+               }
+       }
+
+       /* Assert we didn't overrun arrayelems[] */
+       Assert(narrayelems <= lockData->nlocks);
+
+       /* Construct array, using hardwired knowledge about int4 type */
+       PG_RETURN_ARRAYTYPE_P(construct_array(arrayelems, narrayelems,
+                                                                                 INT4OID,
+                                                                                 sizeof(int32), true, 'i'));
+}
+
+
 /*
  * Functions for manipulating advisory locks
  *
index 8687abb97e70412c6150b5e8049f7fa8ddffdb9a..aff12d353c382716ffb5eb48e25e1e16ec39f68d 100644
@@ -53,6 +53,6 @@
  */
 
 /*                                                     yyyymmddN */
-#define CATALOG_VERSION_NO     201602201
+#define CATALOG_VERSION_NO     201602221
 
 #endif
index 59c50d93427f75aab2d538d315afd1b57d2216c9..62b91252dc5c30963206c478297be7589ea9f786 100644
@@ -3012,6 +3012,8 @@ DATA(insert OID = 3329 (  pg_show_all_file_settings PGNSP PGUID 12 1 1000 0 0 f
 DESCR("show config file settings");
 DATA(insert OID = 1371 (  pg_lock_status   PGNSP PGUID 12 1 1000 0 0 f f f f t t v s 0 0 2249 "" "{25,26,26,23,21,25,28,26,26,21,25,23,25,16,16}" "{o,o,o,o,o,o,o,o,o,o,o,o,o,o,o}" "{locktype,database,relation,page,tuple,virtualxid,transactionid,classid,objid,objsubid,virtualtransaction,pid,mode,granted,fastpath}" _null_ _null_ pg_lock_status _null_ _null_ _null_ ));
 DESCR("view system lock information");
+DATA(insert OID = 2561 (  pg_blocking_pids PGNSP PGUID 12 1 0 0 0 f f f f t f v s 1 0 1007 "23" _null_ _null_ _null_ _null_ _null_ pg_blocking_pids _null_ _null_ _null_ ));
+DESCR("get array of PIDs of sessions blocking specified backend PID");
 DATA(insert OID = 1065 (  pg_prepared_xact PGNSP PGUID 12 1 1000 0 0 f f f f t t v s 0 0 2249 "" "{28,25,1184,26,26}" "{o,o,o,o,o}" "{transaction,gid,prepared,ownerid,dbid}" _null_ _null_ pg_prepared_xact _null_ _null_ _null_ ));
 DESCR("view two-phase transactions");
 DATA(insert OID = 3819 (  pg_get_multixact_members PGNSP PGUID 12 1 1000 0 0 f f f f t t v s 1 0 2249 "28" "{28,28,25}" "{i,o,o}" "{multixid,xid,mode}" _null_ _null_ pg_get_multixact_members _null_ _null_ _null_ ));
index 703eaf2de192fed578212f54ddfb6195c5a871da..788d50a35f3c6ad965b3edcd3e9cdb44203060a9 100644
@@ -346,7 +346,7 @@ typedef struct PROCLOCK
        PROCLOCKTAG tag;                        /* unique identifier of proclock object */
 
        /* data */
-       PGPROC     *groupLeader;        /* group leader, or NULL if no lock group */
+       PGPROC     *groupLeader;        /* proc's lock group leader, or proc itself */
        LOCKMASK        holdMask;               /* bitmask for lock types currently held */
        LOCKMASK        releaseMask;    /* bitmask for lock types to be released */
        SHM_QUEUE       lockLink;               /* list link in LOCK's list of proclocks */
@@ -423,21 +423,48 @@ typedef struct LOCALLOCK
 
 typedef struct LockInstanceData
 {
-       LOCKTAG         locktag;                /* locked object */
+       LOCKTAG         locktag;                /* tag for locked object */
        LOCKMASK        holdMask;               /* locks held by this PGPROC */
        LOCKMODE        waitLockMode;   /* lock awaited by this PGPROC, if any */
        BackendId       backend;                /* backend ID of this PGPROC */
        LocalTransactionId lxid;        /* local transaction ID of this PGPROC */
        int                     pid;                    /* pid of this PGPROC */
+       int                     leaderPid;              /* pid of group leader; = pid if no group */
        bool            fastpath;               /* taken via fastpath? */
 } LockInstanceData;
 
 typedef struct LockData
 {
        int                     nelements;              /* The length of the array */
-       LockInstanceData *locks;
+       LockInstanceData *locks;        /* Array of per-PROCLOCK information */
 } LockData;
 
+typedef struct BlockedProcData
+{
+       int                     pid;                    /* pid of a blocked PGPROC */
+       /* Per-PROCLOCK information about PROCLOCKs of the lock the pid awaits */
+       /* (these fields refer to indexes in BlockedProcsData.locks[]) */
+       int                     first_lock;             /* index of first relevant LockInstanceData */
+       int                     num_locks;              /* number of relevant LockInstanceDatas */
+       /* PIDs of PGPROCs that are ahead of "pid" in the lock's wait queue */
+       /* (these fields refer to indexes in BlockedProcsData.waiter_pids[]) */
+       int                     first_waiter;   /* index of first preceding waiter */
+       int                     num_waiters;    /* number of preceding waiters */
+} BlockedProcData;
+
+typedef struct BlockedProcsData
+{
+       BlockedProcData *procs;         /* Array of per-blocked-proc information */
+       LockInstanceData *locks;        /* Array of per-PROCLOCK information */
+       int                *waiter_pids;        /* Array of PIDs of other blocked PGPROCs */
+       int                     nprocs;                 /* # of valid entries in procs[] array */
+       int                     maxprocs;               /* Allocated length of procs[] array */
+       int                     nlocks;                 /* # of valid entries in locks[] array */
+       int                     maxlocks;               /* Allocated length of locks[] array */
+       int                     npids;                  /* # of valid entries in waiter_pids[] array */
+       int                     maxpids;                /* Allocated length of waiter_pids[] array */
+} BlockedProcsData;
+
 
 /* Result codes for LockAcquire() */
 typedef enum
@@ -489,6 +516,7 @@ typedef enum
  */
 extern void InitLocks(void);
 extern LockMethod GetLocksMethodTable(const LOCK *lock);
+extern LockMethod GetLockTagsMethodTable(const LOCKTAG *locktag);
 extern uint32 LockTagHashCode(const LOCKTAG *locktag);
 extern bool DoLockModesConflict(LOCKMODE mode1, LOCKMODE mode2);
 extern LockAcquireResult LockAcquire(const LOCKTAG *locktag,
@@ -521,6 +549,7 @@ extern void GrantAwaitedLock(void);
 extern void RemoveFromWaitQueue(PGPROC *proc, uint32 hashcode);
 extern Size LockShmemSize(void);
 extern LockData *GetLockStatusData(void);
+extern BlockedProcsData *GetBlockerStatusData(int blocked_pid);
 
 extern xl_standby_lock *GetRunningTransactionLocks(int *nlocks);
 extern const char *GetLockmodeName(LOCKMETHODID lockmethodid, LOCKMODE mode);
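
Note on the BlockedProcsData layout added above: each BlockedProcData entry does not own its arrays. It stores a start index and a count into the shared locks[] and waiter_pids[] arrays. The following is a minimal sketch of how a consumer might walk the waiter slice for one blocked PID. It is not the code from this patch: the Demo* types are reduced, hypothetical stand-ins for the real structs, and demo_preceding_waiters() is an illustrative helper, not a PostgreSQL function.

```c
#include <stddef.h>

/* Hypothetical, reduced stand-in for LockInstanceData (most fields omitted). */
typedef struct DemoLockInstance
{
	int			pid;			/* pid of the process this entry describes */
	int			leaderPid;		/* pid of group leader; = pid if no group */
} DemoLockInstance;

/* Hypothetical stand-in for BlockedProcData: index ranges into shared arrays. */
typedef struct DemoBlockedProc
{
	int			pid;			/* pid of a blocked process */
	int			first_lock;		/* index of first relevant locks[] entry */
	int			num_locks;		/* number of relevant locks[] entries */
	int			first_waiter;	/* index of first preceding waiter */
	int			num_waiters;	/* number of preceding waiters */
} DemoBlockedProc;

/* Hypothetical stand-in for BlockedProcsData (allocation counters omitted). */
typedef struct DemoBlockedProcs
{
	DemoBlockedProc *procs;		/* per-blocked-proc entries */
	DemoLockInstance *locks;	/* shared array of lock entries */
	int		   *waiter_pids;	/* shared array of waiter PIDs */
	int			nprocs;			/* number of valid entries in procs[] */
} DemoBlockedProcs;

/*
 * Copy into out[] the PIDs queued ahead of blocked_pid, up to out_len
 * entries.  Returns the number copied, or -1 if blocked_pid has no entry.
 */
int
demo_preceding_waiters(const DemoBlockedProcs *data, int blocked_pid,
					   int *out, int out_len)
{
	for (int i = 0; i < data->nprocs; i++)
	{
		const DemoBlockedProc *bproc = &data->procs[i];
		int			n = 0;

		if (bproc->pid != blocked_pid)
			continue;

		/* The slice is waiter_pids[first_waiter .. first_waiter + num_waiters) */
		for (int j = 0; j < bproc->num_waiters && n < out_len; j++)
			out[n++] = data->waiter_pids[bproc->first_waiter + j];
		return n;
	}
	return -1;
}
```

The same start-plus-count pattern applies to locks[] via first_lock and num_locks. Keeping one flat array per kind of data means the whole result can be grown and freed as three allocations, whatever the number of blocked processes.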
index 1fbf4f3593b079d0152a0d854b7c1d3e3bbf81f6..dd37c0cb07086fc916fb731ca6ecc150b7d290de 100644 (file)
@@ -61,6 +61,7 @@ extern VirtualTransactionId *GetVirtualXIDsDelayingChkpt(int *nvxids);
 extern bool HaveVirtualXIDsDelayingChkpt(VirtualTransactionId *vxids, int nvxids);
 
 extern PGPROC *BackendPidGetProc(int pid);
+extern PGPROC *BackendPidGetProcWithLock(int pid);
 extern int     BackendXidGetPid(TransactionId xid);
 extern bool IsBackendPid(int pid);
 
index 94c188163a7435e156b59053063ce936840eaa05..7ec93c95c7c116064beeebbba381c7414edc8acd 100644 (file)
@@ -1157,6 +1157,7 @@ extern Datum row_security_active_name(PG_FUNCTION_ARGS);
 
 /* lockfuncs.c */
 extern Datum pg_lock_status(PG_FUNCTION_ARGS);
+extern Datum pg_blocking_pids(PG_FUNCTION_ARGS);
 extern Datum pg_advisory_lock_int8(PG_FUNCTION_ARGS);
 extern Datum pg_advisory_xact_lock_int8(PG_FUNCTION_ARGS);
 extern Datum pg_advisory_lock_shared_int8(PG_FUNCTION_ARGS);
index 0a9d25ce9ca12617683cdd4d1e2161eb7263948e..6461ae8f81534fb00ef2d93593367af13c614bf0 100644 (file)
@@ -227,27 +227,12 @@ main(int argc, char **argv)
         */
        initPQExpBuffer(&wait_query);
        appendPQExpBufferStr(&wait_query,
-                                                "SELECT 1 FROM pg_locks holder, pg_locks waiter "
-                                                "WHERE NOT waiter.granted AND waiter.pid = $1 "
-                                                "AND holder.granted "
-                                                "AND holder.pid <> $1 AND holder.pid IN (");
+                                                "SELECT pg_catalog.pg_blocking_pids($1) && '{");
        /* The spec syntax requires at least one session; assume that here. */
        appendPQExpBufferStr(&wait_query, backend_pids[1]);
        for (i = 2; i < nconns; i++)
-               appendPQExpBuffer(&wait_query, ", %s", backend_pids[i]);
-       appendPQExpBufferStr(&wait_query,
-                                                ") "
-
-                                 "AND holder.locktype IS NOT DISTINCT FROM waiter.locktype "
-                                 "AND holder.database IS NOT DISTINCT FROM waiter.database "
-                                 "AND holder.relation IS NOT DISTINCT FROM waiter.relation "
-                                                "AND holder.page IS NOT DISTINCT FROM waiter.page "
-                                                "AND holder.tuple IS NOT DISTINCT FROM waiter.tuple "
-                         "AND holder.virtualxid IS NOT DISTINCT FROM waiter.virtualxid "
-               "AND holder.transactionid IS NOT DISTINCT FROM waiter.transactionid "
-                                       "AND holder.classid IS NOT DISTINCT FROM waiter.classid "
-                                                "AND holder.objid IS NOT DISTINCT FROM waiter.objid "
-                               "AND holder.objsubid IS NOT DISTINCT FROM waiter.objsubid ");
+               appendPQExpBuffer(&wait_query, ",%s", backend_pids[i]);
+       appendPQExpBufferStr(&wait_query, "}'::integer[]");
 
        res = PQprepare(conns[0], PREP_WAITING, wait_query.data, 0, NULL);
        if (PQresultStatus(res) != PGRES_COMMAND_OK)
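
Note on the rewritten wait query: the hunk above assembles a single boolean expression that tests whether the array returned by pg_blocking_pids($1) overlaps (the && array operator) an array literal of the other sessions' PIDs. A minimal sketch of the same string assembly follows. It uses plain snprintf instead of PQExpBuffer so that it stands alone; demo_build_wait_query() and demo_is_waiting() are hypothetical helpers for illustration, not part of isolationtester.

```c
#include <stdio.h>

/*
 * Build "SELECT pg_catalog.pg_blocking_pids($1) && '{p1,p2,...}'::integer[]"
 * into buf.  pids[] holds the PIDs as strings; at least one is required.
 * Returns 0 on success, -1 if npids < 1 or buf is too small.
 */
int
demo_build_wait_query(char *buf, size_t buflen,
					  const char *const *pids, int npids)
{
	size_t		used;
	int			n;

	if (npids < 1)
		return -1;

	/* Query prefix plus the first PID, which takes no leading comma. */
	n = snprintf(buf, buflen,
				 "SELECT pg_catalog.pg_blocking_pids($1) && '{%s", pids[0]);
	if (n < 0 || (size_t) n >= buflen)
		return -1;
	used = (size_t) n;

	/* Remaining PIDs, comma-separated with no spaces. */
	for (int i = 1; i < npids; i++)
	{
		n = snprintf(buf + used, buflen - used, ",%s", pids[i]);
		if (n < 0 || (size_t) n >= buflen - used)
			return -1;
		used += (size_t) n;
	}

	/* Close the array literal and cast it. */
	n = snprintf(buf + used, buflen - used, "}'::integer[]");
	if (n < 0 || (size_t) n >= buflen - used)
		return -1;
	return 0;
}

/* Interpret a boolean column returned in text format ("t" or "f"). */
int
demo_is_waiting(const char *value)
{
	return value[0] == 't';
}
```

Because the query yields exactly one row with one boolean column, the caller can check for a single tuple and then inspect the first character of the value, which is the approach the second hunk of this file takes.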
@@ -745,21 +730,22 @@ try_complete_step(Step *step, int flags)
                        /* If it's OK for the step to block, check whether it has. */
                        if (flags & STEP_NONBLOCK)
                        {
-                               int                     ntuples;
+                               bool            waiting;
 
                                res = PQexecPrepared(conns[0], PREP_WAITING, 1,
                                                                         &backend_pids[step->session + 1],
                                                                         NULL, NULL, 0);
-                               if (PQresultStatus(res) != PGRES_TUPLES_OK)
+                               if (PQresultStatus(res) != PGRES_TUPLES_OK ||
+                                       PQntuples(res) != 1)
                                {
                                        fprintf(stderr, "lock wait query failed: %s",
                                                        PQerrorMessage(conn));
                                        exit_nicely();
                                }
-                               ntuples = PQntuples(res);
+                               waiting = ((PQgetvalue(res, 0, 0))[0] == 't');
                                PQclear(res);
 
-                               if (ntuples >= 1)               /* waiting to acquire a lock */
+                               if (waiting)    /* waiting to acquire a lock */
                                {
                                        if (!(flags & STEP_RETRY))
                                                printf("step %s: %s <waiting ...>\n",