granicus.if.org Git - postgresql/commit

author	Neil Conway <neilc@samurai.com>
	Wed, 15 Jun 2005 07:27:44 +0000 (07:27 +0000)
committer	Neil Conway <neilc@samurai.com>
	Wed, 15 Jun 2005 07:27:44 +0000 (07:27 +0000)
commit	c119c5bd49baa424480bd9e8f9dda69a09f5a572
tree	c9466886d6e4b74506ce7f6fe190efb34d63dd2e
parent	4aaff553597222467769dd3b26e0d56c9c4a9b09

Change the implementation of hash join to attempt to avoid unnecessary
work if either of the join relations is empty. The logic is:

(1) if the inner relation's startup cost is less than the outer
    relation's startup cost and this is not an outer join, read
    a single tuple from the inner relation via ExecHash()
      - if NULL, we're done

(2) read a single tuple from the outer relation
      - if NULL, we're done

(3) build the hash table on the inner relation
      - if hash table is empty and this is not an outer join,
        we're done

(4) otherwise, do hash join as usual
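
The four steps above can be sketched as a small decision function. This
is only a toy model and not the code from this commit: ToyRel,
ToyOutcome and toy_hashjoin_prepare() are hypothetical names, and
"reading a single tuple" is approximated by checking a tuple count.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a join input; not a PostgreSQL type. */
typedef struct ToyRel
{
    size_t ntuples;       /* number of tuples the relation produces */
    double startup_cost;  /* planner's estimated startup cost */
} ToyRel;

/* Which step of the logic decided the result. */
typedef enum ToyOutcome
{
    DONE_INNER_EMPTY_EARLY,  /* step 1: cheap inner probe returned NULL */
    DONE_OUTER_EMPTY,        /* step 2: outer relation returned NULL */
    DONE_HASH_TABLE_EMPTY,   /* step 3: hash table built, but empty */
    DO_FULL_JOIN             /* step 4: run the hash join as usual */
} ToyOutcome;

ToyOutcome
toy_hashjoin_prepare(const ToyRel *outer, const ToyRel *inner,
                     bool is_outer_join)
{
    assert(outer != NULL && inner != NULL);

    /* (1) Probe the inner side first only if that is cheaper and safe. */
    if (inner->startup_cost < outer->startup_cost && !is_outer_join)
    {
        if (inner->ntuples == 0)
            return DONE_INNER_EMPTY_EARLY;
    }

    /* (2) Read a single tuple from the outer side. */
    if (outer->ntuples == 0)
        return DONE_OUTER_EMPTY;

    /* (3) Build the hash table; an empty table ends a non-outer join. */
    if (inner->ntuples == 0 && !is_outer_join)
        return DONE_HASH_TABLE_EMPTY;

    /* (4) Otherwise, do the hash join as usual. */
    return DO_FULL_JOIN;
}
```

Note that for an outer join an empty inner relation never ends the join
early in this sketch, since outer tuples must still be emitted.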

The implementation uses the new MultiExecProcNode API, per a
suggestion from Tom: invoking ExecHash() now produces the first
tuple from the Hash node's child node, whereas MultiExecHash()
builds the hash table.

I had to put in a bit of a kludge to get the row count reported
by EXPLAIN ANALYZE to be correct: since ExecHash() is invoked to
return a tuple, and then MultiExecHash() is invoked, we would
report one tuple too many to EXPLAIN ANALYZE. I hacked around
this by manually detecting this situation and subtracting 1
from the EXPLAIN ANALYZE row count.
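
To illustrate why the count is off by one, here is a toy model of the
two entry points and the correction. It is a sketch under assumptions,
not the code from this commit: ToyHashNode, toy_exec_hash() and
toy_multi_exec_hash() are hypothetical names, and the instrumentation
is reduced to a single counter.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a Hash node plus its row counter. */
typedef struct ToyHashNode
{
    size_t child_ntuples;  /* tuples the child node will produce */
    bool   first_fetched;  /* set once toy_exec_hash() returned a tuple */
    size_t table_size;     /* entries in the built hash table */
    double instr_ntuples;  /* row count reported for this node */
} ToyHashNode;

/* Models ExecHash() as described: produce only the first child tuple. */
bool
toy_exec_hash(ToyHashNode *node)
{
    if (node->child_ntuples == 0)
        return false;           /* NULL tuple: the child is empty */

    node->first_fetched = true;
    node->instr_ntuples += 1;   /* the returned tuple is counted */
    return true;
}

/* Models MultiExecHash() as described: build the whole hash table. */
void
toy_multi_exec_hash(ToyHashNode *node)
{
    /* The table holds every child tuple, including the first one. */
    node->table_size = node->child_ntuples;
    node->instr_ntuples += (double) node->table_size;

    /* The kludge: the first tuple was already counted, so subtract 1. */
    if (node->first_fetched)
        node->instr_ntuples -= 1;
}
```

With five child tuples, calling toy_exec_hash() and then
toy_multi_exec_hash() leaves the counter at 5; without the subtraction
it would be 6.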

src/backend/executor/nodeHash.c
src/backend/executor/nodeHashjoin.c
src/include/nodes/execnodes.h