postgresql/src
David Rowley 75d8fe8181 Fix incorrect hash table resizing code in simplehash.h
This fixes a bug in simplehash.h which caused an incorrect size mask to be
used when the hash table grew to SH_MAX_SIZE (2^32).  The code was
incorrectly setting the size mask to 0 when the hash tables reached the
maximum possible number of buckets.  This would result always trying to
use the 0th bucket causing an  infinite loop of trying to grow the hash
table due to there being too many collisions.

Seemingly it's not that common for simplehash tables to ever grow this big
as this bug dates back to v10 and nobody seems to have noticed it before.
However, probably the most likely place that people would notice it would
be doing a large in-memory Hash Aggregate with something close to at least
2^31 groups.

After this fix, the code now works correctly with up to within 98% of 2^32
groups and will fail with the following error when trying to insert any
more items into the hash table:

ERROR:  hash table size exceeded

However, the work_mem (or hash_mem_multiplier in newer versions) settings
will generally cause Hash Aggregates to spill to disk long before reaching
that many groups.  The minimal test case I did took a work_mem setting of
over 192GB to hit the bug.

simplehash hash tables are used in a few other places such as Bitmap Index
Scans, however, again the size that the hash table can become there is
also limited to work_mem and it would take a relation of around 16TB
(2^31) pages and a very large work_mem setting to hit this.  With smaller
work_mem values the table would become lossy and never grow large enough
to hit the problem.

Author: Yura Sokolov
Reviewed-by: David Rowley, Ranier Vilela
Discussion: https://postgr.es/m/b1f7f32737c3438136f64b26f4852b96@postgrespro.ru
Backpatch-through: 10, where simplehash.h was added
2021-08-13 16:43:13 +12:00
..
backend Make EXEC_BACKEND more convenient on macOS. 2021-08-13 11:11:06 +12:00
bin Translation updates 2021-08-09 12:36:42 +02:00
common Fix command-line colorization on Windows with VT100-compatible environments 2020-03-02 15:46:24 +09:00
fe_utils Fix incautious handling of possibly-miscoded strings in client code. 2021-06-07 14:15:25 -04:00
include Fix incorrect hash table resizing code in simplehash.h 2021-08-13 16:43:13 +12:00
interfaces Stamp 12.8. 2021-08-09 16:50:41 -04:00
makefiles Select CFLAGS_SL at configure time, not in platform-specific Makefiles. 2019-10-21 12:32:35 -04:00
pl Fix corner-case uninitialized-variable issues in plpgsql. 2021-07-20 13:01:48 -04:00
port Stamp 12.8. 2021-08-09 16:50:41 -04:00
template Further tweaking of PG_SYSROOT heuristics for macOS. 2021-01-20 12:07:35 -05:00
test Really fix the ambiguity in REFRESH MATERIALIZED VIEW CONCURRENTLY. 2021-08-07 13:29:32 -04:00
timezone Update time zone data files to tzdata release 2021a. 2021-01-24 16:29:47 -05:00
tools Add fallback implementation for setenv() 2021-06-01 09:27:31 +09:00
tutorial tutorial: land height is "elevation", not "altitude" 2021-03-10 20:25:18 -05:00
.gitignore
DEVELOPERS
Makefile Fix partial-build problems introduced by having more generated headers. 2018-04-09 16:42:10 -04:00
Makefile.global.in Fix prove_installcheck to use correct paths when used with PGXS 2021-07-01 08:47:04 -04:00
Makefile.shlib Ensure static libraries have correct mod time even if ranlib messes it up. 2018-11-29 15:53:44 -05:00
nls-global.mk NLS: Fix backend gettext triggers 2019-09-23 09:05:50 +02:00