Use perfect hashing, instead of binary search, for keyword lookup.
author Tom Lane <tgl@sss.pgh.pa.us>
Thu, 10 Jan 2019 00:47:38 +0000 (19:47 -0500)
committer Tom Lane <tgl@sss.pgh.pa.us>
Thu, 10 Jan 2019 00:47:46 +0000 (19:47 -0500)
commit c64d0cd5ce24a344798534f1bc5827a9199b7a6e
tree 3968456d54c3f18d07976e5a139ca60589a8fbf0
parent 5d59a6c5eaff4a58322683e450e76a11d943d322

We've been speculating for a long time that hash-based keyword lookup
ought to be faster than binary search, but up to now we hadn't found
a suitable tool for generating the hash function.  Joerg Sonnenberger
provided the inspiration, and sample code, to show us that rolling our
own generator wasn't a ridiculous idea.  Hence, do that.

The method used here requires a lookup table of approximately 4 bytes
per keyword, but that's less than what we saved in the predecessor commit
afb0d0712, so it's not a big problem.  The time savings is indeed
significant: preliminary testing suggests that the total time for raw
parsing (flex + bison phases) drops by ~20%.
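
As a rough illustration of why a perfect hash beats binary search here
(one hash computation plus a single string comparison, instead of about
log2(N) comparisons), consider the minimal sketch below.  It is NOT the
function emitted by PerfectHash.pm: the eight-word keyword list, the
names kw_hash, kw_build and kw_lookup, the 17-slot table, and the
brute-force seed search over an FNV-1a-style hash are all assumptions
made for this example, and the lookup is case-sensitive.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define NKEYWORDS 8
#define NSLOTS    17            /* hypothetical table size: a prime > NKEYWORDS */

/* Hypothetical keyword list; the real one comes from kwlist.h. */
static const char *const keywords[NKEYWORDS] = {
    "select", "from", "where", "insert",
    "update", "delete", "table", "index"
};

static uint32_t kw_seed;             /* seed that makes the hash collision-free */
static int8_t   slot_to_kw[NSLOTS];  /* slot -> keyword number, or -1 if empty */

/* FNV-1a-style hash with a variable starting value, reduced to a slot. */
static uint32_t
kw_hash(const char *s, uint32_t seed)
{
    uint32_t h = seed;

    while (*s)
    {
        h ^= (unsigned char) *s++;
        h *= 16777619u;
    }
    return h % NSLOTS;
}

/*
 * Try seeds until every keyword lands in its own slot.
 * Returns 0 on success, -1 if no seed in the search range works.
 */
static int
kw_build(void)
{
    for (uint32_t seed = 1; seed < 100000; seed++)
    {
        int ok = 1;

        memset(slot_to_kw, -1, sizeof(slot_to_kw));
        for (int i = 0; i < NKEYWORDS; i++)
        {
            uint32_t slot = kw_hash(keywords[i], seed);

            if (slot_to_kw[slot] != -1)
            {
                ok = 0;         /* collision: this seed is not perfect */
                break;
            }
            slot_to_kw[slot] = (int8_t) i;
        }
        if (ok)
        {
            kw_seed = seed;
            return 0;
        }
    }
    return -1;
}

/*
 * Return the keyword number, or -1 if word is not a keyword.
 * kw_build() must have succeeded first.
 */
static int
kw_lookup(const char *word)
{
    int kw = slot_to_kw[kw_hash(word, kw_seed)];

    /* Non-keywords can hash to an occupied slot, so one strcmp is still needed. */
    if (kw >= 0 && strcmp(word, keywords[kw]) == 0)
        return kw;
    return -1;
}
```

After a successful kw_build(), kw_lookup("where") returns that word's
position in the list and kw_lookup("wher") returns -1.  The sketch finds
its seed at run time; a generator script would instead do the search
once at build time and emit the finished table as constant data.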

Patch by me, but it owes its existence to Joerg Sonnenberger;
thanks also to John Naylor for review.

Discussion: https://postgr.es/m/20190103163340.GA15803@britannica.bec.de
14 files changed:
src/common/Makefile
src/common/kwlookup.c
src/include/common/kwlookup.h
src/include/parser/kwlist.h
src/interfaces/ecpg/preproc/Makefile
src/interfaces/ecpg/preproc/c_keywords.c
src/interfaces/ecpg/preproc/c_kwlist.h
src/interfaces/ecpg/preproc/ecpg_kwlist.h
src/pl/plpgsql/src/Makefile
src/pl/plpgsql/src/pl_reserved_kwlist.h
src/pl/plpgsql/src/pl_unreserved_kwlist.h
src/tools/PerfectHash.pm [new file with mode: 0644]
src/tools/gen_keywordlist.pl
src/tools/msvc/Solution.pm