postgresql/src/include/executor
Tom Lane 1811f1af98 Improve hash join's handling of tuples with null join keys.
In a plain join, we can just summarily discard an input tuple
with null join key(s), since it cannot match anything from
the other side of the join (assuming a strict join operator).
However, if the tuple comes from the outer side of an outer join
then we have to emit it with null-extension of the other side.

Up to now, hash joins did that by inserting the tuple into the hash
table as though it were a normal tuple.  This is unnecessarily
inefficient though, since the required processing is far simpler than
for a potentially-matchable tuple.  Worse, if there are a lot of such
tuples they will bloat the hash bucket they go into, possibly causing
useless repeated attempts to split that bucket or increase the number
of batches.  We have a report of a large join vainly creating many
thousands of batches when faced with such input.

This patch improves the situation by keeping such tuples out of the
hash table altogether, instead pushing them into a separate tuplestore
from which we return them later.  (One might consider trying to return
them immediately; but that would require substantial refactoring, and
it doesn't work anyway for cases where we rescan an unmodified hash
table.)  This works even in parallel hash joins, because whichever
worker reads a null-keyed tuple can just return it; there's no need
for consultation with other workers.  Thus the tuplestores are local
storage even in a parallel join.
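
As a rough illustration of the approach described above (not the actual
PostgreSQL code), the build-side logic can be sketched as follows. All
names here (Tuple, BuildState, build_add, probe_count, null_store) are
hypothetical, a fixed-size array stands in for the tuplestore, and the
hash function is a plain modulo.

```c
#include <assert.h>
#include <stdbool.h>

#define NBUCKETS  8
#define MAXTUPLES 64

/* Hypothetical tuple with a single nullable integer join key. */
typedef struct Tuple
{
    bool key_is_null;
    int  key;
    int  payload;
} Tuple;

typedef struct HashEntry
{
    Tuple tuple;
    int   next;                     /* next entry in bucket chain, or -1 */
} HashEntry;

typedef struct BuildState
{
    int       buckets[NBUCKETS];    /* head index of each chain, or -1 */
    HashEntry entries[MAXTUPLES];
    int       nentries;
    Tuple     null_store[MAXTUPLES];    /* stand-in for the tuplestore */
    int       nnull;
} BuildState;

static void
build_init(BuildState *st)
{
    for (int i = 0; i < NBUCKETS; i++)
        st->buckets[i] = -1;
    st->nentries = 0;
    st->nnull = 0;
}

/*
 * Add one input tuple.  A null-keyed tuple never enters the hash table:
 * it is set aside if this side of the join must be null-extended later,
 * and simply discarded otherwise.
 */
static void
build_add(BuildState *st, Tuple t, bool preserve_unmatched)
{
    if (t.key_is_null)
    {
        if (preserve_unmatched)
        {
            assert(st->nnull < MAXTUPLES);
            st->null_store[st->nnull++] = t;
        }
        return;
    }

    assert(st->nentries < MAXTUPLES);
    unsigned b = ((unsigned) t.key) % NBUCKETS;

    st->entries[st->nentries].tuple = t;
    st->entries[st->nentries].next = st->buckets[b];
    st->buckets[b] = st->nentries++;
}

/* Count hash-table entries whose key equals the probe key. */
static int
probe_count(const BuildState *st, int key)
{
    int      n = 0;
    unsigned b = ((unsigned) key) % NBUCKETS;

    for (int i = st->buckets[b]; i != -1; i = st->entries[i].next)
        if (st->entries[i].tuple.key == key)
            n++;
    return n;
}
```

In this sketch, once probing is finished the caller would walk
null_store and emit each saved tuple with null-extension of the other
side; the hash table itself only ever holds potentially-matchable
tuples.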

A pre-existing buglet that I noticed while analyzing the code's
behavior is that ExecHashRemoveNextSkewBucket fails to decrement
hashtable->skewTuples for tuples moved into the main hash table
from the skew hash table.  This invalidates ExecHashTableInsert's
calculation of the number of main-hash-table tuples, though probably
not by a lot since we expect the skew table to be small relative
to the main one.  Nonetheless, let's fix that too while we're here.
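
The accounting problem can be shown with a toy model. This is only a
sketch: the Counters struct and both functions are hypothetical, and it
assumes that the main-table tuple count is derived as the total tuple
count minus the skew-table tuple count.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical counters mirroring the bookkeeping described above. */
typedef struct Counters
{
    double total_tuples;    /* tuples in skew table plus main table */
    double skew_tuples;     /* tuples believed to be in the skew table */
} Counters;

/* Derived estimate of how many tuples the main hash table holds. */
static double
main_table_tuples(const Counters *c)
{
    return c->total_tuples - c->skew_tuples;
}

/*
 * Model moving nmoved tuples from the skew table into the main table.
 * With decrement == false this reproduces the old behavior, where the
 * skew counter is left untouched.
 */
static void
move_skew_to_main(Counters *c, double nmoved, bool decrement)
{
    assert(nmoved <= c->skew_tuples);
    if (decrement)
        c->skew_tuples -= nmoved;
}
```

With 100 tuples in total and 10 of them in the skew table, moving all 10
into the main table without the decrement leaves the derived main-table
count at 90 although 100 tuples now live there; with the decrement the
derived count becomes 100.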

Bug: #18909
Reported-by: Sergey Koposov <Sergey.Koposov@ed.ac.uk>
Author: Tom Lane <tgl@sss.pgh.pa.us>
Reviewed-by: Chao Li <li.evan.chao@gmail.com>
Discussion: https://postgr.es/m/3061845.1746486714@sss.pgh.pa.us
2026-03-19 15:21:36 -04:00
execAsync.h Update copyright for 2026 2026-01-01 13:24:10 -05:00
execdebug.h Adjust style of some debugging macros. 2026-02-06 16:24:21 -06:00
execdesc.h Update copyright for 2026 2026-01-01 13:24:10 -05:00
execExpr.h Update copyright for 2026 2026-01-01 13:24:10 -05:00
execParallel.h Update copyright for 2026 2026-01-01 13:24:10 -05:00
execPartition.h Update copyright for 2026 2026-01-01 13:24:10 -05:00
execScan.h Reduce header inclusions via execnodes.h 2026-03-16 14:34:57 +01:00
executor.h Improve hash join's handling of tuples with null join keys. 2026-03-19 15:21:36 -04:00
functions.h Update copyright for 2026 2026-01-01 13:24:10 -05:00
hashjoin.h Improve hash join's handling of tuples with null join keys. 2026-03-19 15:21:36 -04:00
instrument.h instrumentation: Keep time fields as instrtime, convert in callers 2026-01-09 13:38:00 -05:00
instrument_node.h Move instrumentation-related structs to instrument_node.h 2026-01-12 16:59:28 +01:00
nodeAgg.h Update copyright for 2026 2026-01-01 13:24:10 -05:00
nodeAppend.h Update copyright for 2026 2026-01-01 13:24:10 -05:00
nodeBitmapAnd.h Update copyright for 2026 2026-01-01 13:24:10 -05:00
nodeBitmapHeapscan.h Update copyright for 2026 2026-01-01 13:24:10 -05:00
nodeBitmapIndexscan.h Update copyright for 2026 2026-01-01 13:24:10 -05:00
nodeBitmapOr.h Update copyright for 2026 2026-01-01 13:24:10 -05:00
nodeCtescan.h Update copyright for 2026 2026-01-01 13:24:10 -05:00
nodeCustom.h Update copyright for 2026 2026-01-01 13:24:10 -05:00
nodeForeignscan.h Update copyright for 2026 2026-01-01 13:24:10 -05:00
nodeFunctionscan.h Update copyright for 2026 2026-01-01 13:24:10 -05:00
nodeGather.h Update copyright for 2026 2026-01-01 13:24:10 -05:00
nodeGatherMerge.h Update copyright for 2026 2026-01-01 13:24:10 -05:00
nodeGroup.h Update copyright for 2026 2026-01-01 13:24:10 -05:00
nodeHash.h Improve hash join's handling of tuples with null join keys. 2026-03-19 15:21:36 -04:00
nodeHashjoin.h Update copyright for 2026 2026-01-01 13:24:10 -05:00
nodeIncrementalSort.h Update copyright for 2026 2026-01-01 13:24:10 -05:00
nodeIndexonlyscan.h Update copyright for 2026 2026-01-01 13:24:10 -05:00
nodeIndexscan.h Update copyright for 2026 2026-01-01 13:24:10 -05:00
nodeLimit.h Update copyright for 2026 2026-01-01 13:24:10 -05:00
nodeLockRows.h Update copyright for 2026 2026-01-01 13:24:10 -05:00
nodeMaterial.h Update copyright for 2026 2026-01-01 13:24:10 -05:00
nodeMemoize.h Update copyright for 2026 2026-01-01 13:24:10 -05:00
nodeMergeAppend.h Update copyright for 2026 2026-01-01 13:24:10 -05:00
nodeMergejoin.h Update copyright for 2026 2026-01-01 13:24:10 -05:00
nodeModifyTable.h Update copyright for 2026 2026-01-01 13:24:10 -05:00
nodeNamedtuplestorescan.h Update copyright for 2026 2026-01-01 13:24:10 -05:00
nodeNestloop.h Update copyright for 2026 2026-01-01 13:24:10 -05:00
nodeProjectSet.h Update copyright for 2026 2026-01-01 13:24:10 -05:00
nodeRecursiveunion.h Update copyright for 2026 2026-01-01 13:24:10 -05:00
nodeResult.h Update copyright for 2026 2026-01-01 13:24:10 -05:00
nodeSamplescan.h Update copyright for 2026 2026-01-01 13:24:10 -05:00
nodeSeqscan.h Update copyright for 2026 2026-01-01 13:24:10 -05:00
nodeSetOp.h Update copyright for 2026 2026-01-01 13:24:10 -05:00
nodeSort.h Update copyright for 2026 2026-01-01 13:24:10 -05:00
nodeSubplan.h Update copyright for 2026 2026-01-01 13:24:10 -05:00
nodeSubqueryscan.h Update copyright for 2026 2026-01-01 13:24:10 -05:00
nodeTableFuncscan.h Update copyright for 2026 2026-01-01 13:24:10 -05:00
nodeTidrangescan.h Update copyright for 2026 2026-01-01 13:24:10 -05:00
nodeTidscan.h Update copyright for 2026 2026-01-01 13:24:10 -05:00
nodeUnique.h Update copyright for 2026 2026-01-01 13:24:10 -05:00
nodeValuesscan.h Update copyright for 2026 2026-01-01 13:24:10 -05:00
nodeWindowAgg.h Update copyright for 2026 2026-01-01 13:24:10 -05:00
nodeWorktablescan.h Update copyright for 2026 2026-01-01 13:24:10 -05:00
spi.h Update copyright for 2026 2026-01-01 13:24:10 -05:00
spi_priv.h Update copyright for 2026 2026-01-01 13:24:10 -05:00
tablefunc.h Update copyright for 2026 2026-01-01 13:24:10 -05:00
tqueue.h Update copyright for 2026 2026-01-01 13:24:10 -05:00
tstoreReceiver.h Update copyright for 2026 2026-01-01 13:24:10 -05:00
tuptable.h Optimize tuple deformation 2026-03-16 11:46:00 +13:00