granicus.if.org Git - postgresql/commit

author	Tom Lane <tgl@sss.pgh.pa.us>
	Tue, 16 Jul 2019 16:04:06 +0000 (12:04 -0400)
committer	Tom Lane <tgl@sss.pgh.pa.us>
	Tue, 16 Jul 2019 16:04:06 +0000 (12:04 -0400)
commit	2f5b8eb5a28b4e6de9d20cc7d2c6028c6c7a8aa8
tree	5e6d2f0a67adc23d27ce7445aaad08c28d9fe22e
parent	569ed7f48312c70ed4a79daec1d7688fda4e74ac

Clean up some ad-hoc code for sorting and de-duplicating Lists.

heap.c and relcache.c contained nearly identical copies of logic
to insert OIDs into an OID list while preserving the list's OID
ordering (and rejecting duplicates, in one case but not the other).

The comments argue that this is faster than qsort for small numbers
of OIDs, which is at best unproven, and seems even less likely to be
true now that lappend_cell_oid has to move data around.  In any case
it's ugly and hard-to-follow code, and if we do have a lot of OIDs
to consider, it's O(N^2).
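
For illustration only, the kind of ordered insertion described above can be sketched on a plain C array. This is not the code that was removed from heap.c or relcache.c; the `Oid` typedef is a stand-in and `insert_oid_sorted` is a hypothetical helper. Each insertion scans for its position and shifts the tail, so building a list of N OIDs this way costs O(N^2) in the worst case.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef unsigned int Oid;		/* stand-in for PostgreSQL's Oid (assumption) */

/*
 * Hypothetical helper: insert "oid" into the ascending array oids[0..n-1],
 * skipping it if already present.  The caller must guarantee room for one
 * more element.  Returns the new element count.
 */
static size_t
insert_oid_sorted(Oid *oids, size_t n, Oid oid)
{
	size_t		pos = 0;

	assert(oids != NULL);

	/* Linear scan for the first element that is not smaller than oid */
	while (pos < n && oids[pos] < oid)
		pos++;

	/* Reject duplicates */
	if (pos < n && oids[pos] == oid)
		return n;

	/* Shift the tail up by one slot: this data movement is the O(N) part */
	memmove(&oids[pos + 1], &oids[pos], (n - pos) * sizeof(Oid));
	oids[pos] = oid;
	return n + 1;
}
```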

Hence, replace it with simply lappend'ing OIDs to a List, then list_sort'ing
the completed List, then removing adjacent duplicates if necessary.
This is demonstrably O(N log N) and it's much simpler for the
callers.  It's possible that this would be somewhat inefficient
if there were a very large number of duplicates, but that seems
unlikely in the existing usage.
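
The append, sort, then drop-adjacent-duplicates pattern can be sketched on a plain C array as follows. This is a minimal standalone illustration, not the List-based code from this commit: the `Oid` typedef is a stand-in, `oid_cmp` and `sort_and_dedup_oids` are hypothetical names, and the C library's qsort stands in for list_sort.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

typedef unsigned int Oid;		/* stand-in for PostgreSQL's Oid (assumption) */

/* qsort comparator: ascending order, avoiding overflow-prone subtraction */
static int
oid_cmp(const void *a, const void *b)
{
	Oid			x = *(const Oid *) a;
	Oid			y = *(const Oid *) b;

	return (x > y) - (x < y);
}

/*
 * Hypothetical helper: sort oids[0..n-1] in place, then squeeze out adjacent
 * duplicates.  Returns the number of distinct OIDs left at the front.
 */
static size_t
sort_and_dedup_oids(Oid *oids, size_t n)
{
	size_t		i;
	size_t		last = 0;		/* index of the last element kept */

	assert(oids != NULL || n == 0);

	if (n < 2)
		return n;

	qsort(oids, n, sizeof(Oid), oid_cmp);

	/* After sorting, equal values are adjacent, so one pass removes them */
	for (i = 1; i < n; i++)
	{
		if (oids[i] != oids[last])
			oids[++last] = oids[i];
	}
	return last + 1;
}
```

A caller would append OIDs in whatever order they are found and call the helper once at the end, so the sort dominates the cost and the de-duplication pass is linear.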

This adds list_deduplicate_oid and list_oid_cmp infrastructure
to list.c.  I didn't bother with equivalent functionality for
integer or pointer Lists, but such could always be added later
if we find a use for it.

Discussion: https://postgr.es/m/26193.1563228600@sss.pgh.pa.us

src/backend/catalog/heap.c
src/backend/nodes/list.c
src/backend/utils/cache/relcache.c
src/include/nodes/pg_list.h