granicus.if.org Git - postgresql/commit

author	Tomas Vondra <tomas.vondra@postgresql.org>
	Thu, 4 Jul 2019 21:02:02 +0000 (23:02 +0200)
committer	Tomas Vondra <tomas.vondra@postgresql.org>
	Thu, 4 Jul 2019 23:32:33 +0000 (01:32 +0200)
commit	e365a581c246a8e18f38cc530013391329dcdb02
tree	e7e4375ab9fdcae188a756318a6d72e9075e3517	tree \| snapshot
parent	d5ab9df7774b4570ff50e64b7fa3ba8295596d06	commit \| diff

Speed-up build of MCV lists with many distinct values

When building multi-column MCV lists, we compute base frequency for each
item, i.e. a product of per-column frequencies for values from the item.
As a value may be in multiple groups, the code was scanning the whole
array of groups while adding items to the MCV list. This works fine as
long as the number of distinct groups is small, but it's easy to trigger
trigger O(N^2) behavior, especially after increasing statistics target.

This commit precomputes frequencies for values in all columns, so that
when computing the base frequency it's enough to make a simple bsearch
lookup in the array.

Backpatch to 12, where multi-column MCV lists were introduced.

Discussion: https://postgr.es/m/20190618205920.qtlzcu73whfpfqne@development