postgresql/src/backend
Tomas Vondra b85c4700fc Fix hashjoin memory balancing logic
Commit a1b4f289be improved the hashjoin sizing to also consider the
memory used by BufFiles for batches. The code however had multiple
issues, making it ineffective or not working as expected in some cases.

* The amount of memory needed by buffers was calculated using uint32,
  so it would overflow for nbatch >= 262144. If this happened the loop
  would exit prematurely and the memory usage would not be reduced.

  The nbatch overflow is fixed by reworking the condition to not use a
  multiplication at all, so there's no risk of overflow. An explicit
  cast was added to a similar calculation in ExecHashIncreaseBatchSize.

* The loop adjusting the nbatch value used hash_table_bytes to calculate
  the old/new size, but then updated only space_allowed. As a result,
  the total memory usage was not reduced: all the memory saved by
  reducing the number of batches went to the in-memory hash table.

  This was fixed by using only space_allowed. This is also more correct,
  because hash_table_bytes does not account for skew buckets.

* The code was also doubling multiple parameters (e.g. the number of
  buckets for hash table), but was missing overflow protections.

  The loop now checks for overflow and terminates if needed. It would be
  possible to cap the value and continue the loop, but that is not worth
  the complexity, and an overflow implies the in-memory hash table is
  already very large anyway.
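
The uint32 overflow from the first item can be reproduced with a few lines of C. This is only an illustrative sketch, not the code from the commit: the helper names are hypothetical, and it assumes the default 8192-byte block size and two buffer files per batch, which is consistent with the 262144 threshold quoted above (262144 * 2 * 8192 = 2^32).

```c
#include <stdint.h>

/* Assumption: default PostgreSQL block size; the real constant is BLCKSZ. */
#define SKETCH_BLCKSZ 8192u

/*
 * Hypothetical helper: buffer memory for nbatch batches, computed entirely
 * in 32-bit unsigned arithmetic. The product wraps modulo 2^32.
 */
static uint32_t
batch_buffer_bytes_u32(uint32_t nbatch)
{
    return nbatch * 2u * SKETCH_BLCKSZ;
}

/*
 * Hypothetical helper: the same product, with the first operand widened to
 * 64 bits before multiplying, so the result is not truncated.
 */
static uint64_t
batch_buffer_bytes_u64(uint32_t nbatch)
{
    return (uint64_t) nbatch * 2u * SKETCH_BLCKSZ;
}
```

With nbatch = 262144 the 32-bit version returns 0 while the widened version returns 4294967296, so a comparison based on the 32-bit result sees no buffer memory at all.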

While at it, rework the comment explaining how the memory balancing
works, to make it more concise and easier to understand.
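
As a rough illustration of the balancing described above, the following sketch halves nbatch and doubles the hash table allowance while that lowers the total, and stops before any doubling could overflow. It is a simplified model under stated assumptions, not the committed code: the struct, the function names, and the max_buckets parameter are hypothetical, the block size is assumed to be 8192 bytes, and each batch is assumed to need two buffers. The multiplication-free test follows from comparing S + 2 * nbatch * B with 2 * S + nbatch * B, which shows that halving helps only when nbatch * B > S, that is, when nbatch > S / B.

```c
#include <stdint.h>

/* Assumption: default PostgreSQL block size; the real constant is BLCKSZ. */
#define SKETCH_BLCKSZ ((uint64_t) 8192)

/* Hypothetical container for the values being balanced. */
typedef struct SketchSizing
{
    uint64_t    space_allowed;  /* bytes allowed for the in-memory hash table */
    uint64_t    nbuckets;       /* number of hash buckets */
    uint32_t    nbatch;         /* number of batches */
} SketchSizing;

/* Total memory: hash table allowance plus two buffers per batch. */
static uint64_t
sketch_total_bytes(const SketchSizing *s)
{
    return s->space_allowed + 2 * (uint64_t) s->nbatch * SKETCH_BLCKSZ;
}

/* Trade batches for hash table space while that reduces total memory. */
static void
sketch_balance(SketchSizing *s, uint64_t max_buckets)
{
    while (s->nbatch > 1)
    {
        /* Stop rather than overflow when doubling the parameters. */
        if (s->nbuckets > max_buckets / 2)
            break;
        if (s->space_allowed > UINT64_MAX / 2)
            break;

        /*
         * Halving nbatch helps only if nbatch * BLCKSZ > space_allowed.
         * Dividing instead of multiplying avoids any intermediate overflow.
         */
        if (s->nbatch <= s->space_allowed / SKETCH_BLCKSZ)
            break;

        s->nbuckets *= 2;
        s->space_allowed *= 2;
        s->nbatch /= 2;
    }
}
```

Starting from a 4 MB allowance, 1024 buckets and 262144 batches, this sketch settles at 8192 batches and a 128 MB allowance, reducing the modeled total from about 4 GB to 256 MB.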

The initial nbatch overflow issue was reported by Vaibhav Jain. The
other issues were noticed by me and Melanie Plageman. Fix by me, with a
lot of review and feedback by Melanie.

Backpatch to 18, where the hashjoin memory balancing was introduced.

Reported-by: Vaibhav Jain <jainva@google.com>
Reviewed-by: Melanie Plageman <melanieplageman@gmail.com>
Backpatch-through: 18
Discussion: https://postgr.es/m/CABa-Az174YvfFq7rLS+VNKaQyg7inA2exvPWmPWqnEn6Ditr_Q@mail.gmail.com
2025-10-17 22:21:50 +02:00
access Add log_autoanalyze_min_duration 2025-10-15 14:31:12 +02:00
archive Update copyright for 2025 2025-01-01 11:21:55 -05:00
backup Don't include access/htup_details.h in executor/tuptable.h 2025-10-05 18:00:38 +02:00
bootstrap Add new OID alias type regdatabase. 2025-06-30 15:38:54 -05:00
catalog Refactor logical worker synchronization code into a separate file. 2025-10-16 05:10:50 +00:00
commands Fix lookup code for REINDEX INDEX. 2025-10-15 16:32:40 -05:00
executor Fix hashjoin memory balancing logic 2025-10-17 22:21:50 +02:00
foreign Track the number of presorted outer pathkeys in MergePath 2025-05-08 18:21:32 +09:00
jit jit: Fix type used for Datum values in LLVM IR. 2025-09-17 13:38:35 +12:00
lib Replace defunct URL with stable archive.org URL in rbtree.c 2025-10-17 09:38:49 +02:00
libpq Remove hbaPort type 2025-09-15 11:04:10 +02:00
main Force LC_COLLATE to C in postmaster. 2025-07-16 14:13:18 -07:00
nodes Fix internal error from CollateExpr in SQL/JSON DEFAULT expressions 2025-10-09 01:07:59 -04:00
optimizer Remove partColsUpdated. 2025-10-16 11:31:38 -05:00
parser Standardize use of REFRESH PUBLICATION in code and messages. 2025-10-15 03:42:27 +00:00
partitioning Avoid leakage of zero-length arrays in partition_bounds_copy(). 2025-08-02 21:59:46 -04:00
po Translation updates 2025-05-05 12:04:49 +02:00
port Remove traces of support for Sun Studio compiler 2025-09-12 07:39:05 +02:00
postmaster Add log_autoanalyze_min_duration 2025-10-15 14:31:12 +02:00
regex Add pg_iswalpha() and related functions. 2025-10-15 12:54:01 -07:00
replication Refactor logical worker synchronization code into a separate file. 2025-10-16 05:10:50 +00:00
rewrite Don't generate fake "*TLOCRN*" or "*TROCRN*" aliases, either. 2025-09-08 12:58:07 -04:00
snowball Use PG_MODULE_MAGIC_EXT in our installable shared libraries. 2025-03-26 11:11:02 -04:00
statistics Fix lookups in pg_{clear,restore}_{attribute,relation}_stats(). 2025-10-15 12:47:33 -05:00
storage bufmgr: Fix valgrind checking for buffers pinned in StrategyGetBuffer() 2025-10-09 19:17:13 -04:00
tcop Add ExplainState argument to pg_plan_query() and planner(). 2025-10-08 08:33:29 -04:00
tsearch Track the maximum possible frequency of non-MCE array elements. 2025-09-20 14:48:16 -04:00
utils Change config_generic.vartype to be initialized at compile time 2025-10-17 10:33:54 +02:00
.gitignore
common.mk Blind attempt to fix LLVM dependency in the backend 2022-09-15 10:53:48 +07:00
Makefile aio: Add liburing dependency 2025-03-26 19:45:32 -04:00
meson.build meson: add and use stamp files for generated headers 2025-08-11 15:18:23 -04:00
nls.mk Create a separate file listing backend types 2025-09-26 15:21:49 +02:00