postgresql/contrib/bloom
Tom Lane 733f20df53 Discount the metapage when estimating number of index pages visited.
genericcostestimate() estimates the number of index leaf pages to
be visited as a pro-rata fraction of the total number of leaf pages.
Or at least that was the intention.  What it actually used in the
calculation was the total number of index pages, so that non-leaf
pages were also counted.  In a decent-sized index the error is
probably small, since we expect upper page fanout to be high.
But in a small index that's not true; in the worst case with one
data-bearing page plus a metapage, we had 100% relative error.
This led to surprising planning choices such as not using a small
partial index.

To fix, ask genericcostestimate's caller to supply an estimate of
the number of non-leaf pages, and subtract that.  For the built-in
index AMs, it seems sufficient to count the index metapage (if the
AM uses one) as non-leaf.  Per the above argument, counting upper
index pages shouldn't change the estimate much, and in most cases
we don't have any easy way of estimating the number of upper pages.
This might be an area for further research in future.
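
The arithmetic described above can be sketched in a few lines of C. This is only an illustration of the pro-rata idea, not the actual genericcostestimate() code: the function name and parameters are hypothetical, and the rounding and clamping that the real estimator may apply are left out.

```c
#include <assert.h>

/*
 * Hypothetical helper (not a PostgreSQL function): estimate the number of
 * leaf pages visited as a pro-rata share of the pages that can actually
 * hold index tuples.
 *
 *   tuples_fetched  - index tuples the scan is expected to visit
 *   index_tuples    - total tuples in the index
 *   total_pages     - total pages in the index, including non-leaf pages
 *   non_leaf_pages  - pages to discount (e.g. 1 for a metapage)
 */
double
sketch_leaf_pages_visited(double tuples_fetched, double index_tuples,
                          double total_pages, double non_leaf_pages)
{
    assert(index_tuples > 0);
    assert(non_leaf_pages >= 0 && non_leaf_pages <= total_pages);

    /* Pro-rata fraction of the leaf pages only. */
    return tuples_fetched * (total_pages - non_leaf_pages) / index_tuples;
}
```

With one data-bearing page plus a metapage (total_pages = 2) and a scan that visits every tuple, passing non_leaf_pages = 0 reproduces the old estimate of 2 pages, while passing non_leaf_pages = 1 gives 1 page. That is the 100% relative error mentioned above.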

Any external genericcostestimate callers that do not set the new field
GenericCosts.numNonLeafPages will see the same behavior as before,
assuming they followed the advice to zero out that whole struct.
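
As a rough sketch of why zeroed structs keep the old behavior, the following C uses a stand-in struct rather than the real GenericCosts. Only the field name numNonLeafPages comes from the text above; the struct SketchCosts, its reduced field list, and all function names are assumptions made for illustration.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical stand-in; the real GenericCosts has more fields. */
typedef struct SketchCosts
{
    double      numIndexTuples;
    double      numNonLeafPages;    /* the newly added field */
} SketchCosts;

/* Pages treated as leaf pages once the caller's discount is applied. */
static double
sketch_leaf_pages(const SketchCosts *costs, double total_pages)
{
    assert(costs->numNonLeafPages <= total_pages);
    return total_pages - costs->numNonLeafPages;
}

/* An older caller: zeroes the struct and never sets the new field. */
double
sketch_legacy_caller(double total_pages)
{
    SketchCosts costs;

    memset(&costs, 0, sizeof(costs));
    costs.numIndexTuples = 100;
    return sketch_leaf_pages(&costs, total_pages);
}

/* An updated caller: counts only the metapage as non-leaf. */
double
sketch_updated_caller(double total_pages)
{
    SketchCosts costs = {0};

    costs.numIndexTuples = 100;
    costs.numNonLeafPages = 1;
    return sketch_leaf_pages(&costs, total_pages);
}
```

Because the legacy caller leaves numNonLeafPages at zero, nothing is subtracted and every index page is still treated as a leaf page, as before. A caller that skipped the zeroing step would read an uninitialized value instead, which is why the advice matters.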

Unsurprisingly, this change affects a number of plans seen in the
core regression tests.  I hacked up the existing tests to keep the
tests' plans the same, since in each case it appeared that the
test's intent was to test exactly that plan.  Also add one new
test case demonstrating that a better index choice is now made.

Author: Tom Lane <tgl@sss.pgh.pa.us>
Reviewed-by: Henson Choi <assam258@gmail.com>
Discussion: https://postgr.es/m/870521.1745860752@sss.pgh.pa.us
2026-03-20 14:50:53 -04:00
Name | Last commit | Date
expected | Remove incidental md5() function uses from several tests | 2023-07-04 14:31:57 +02:00
sql | Remove incidental md5() function uses from several tests | 2023-07-04 14:31:57 +02:00
t | Update copyright for 2026 | 2026-01-01 13:24:10 -05:00
.gitignore | Bloom index contrib module | 2016-04-01 16:42:24 +03:00
blcost.c | Discount the metapage when estimating number of index pages visited. | 2026-03-20 14:50:53 -04:00
blinsert.c | Update copyright for 2026 | 2026-01-01 13:24:10 -05:00
bloom--1.0.sql | Minor fixes in contrib installation scripts. | 2016-06-14 10:47:06 -04:00
bloom.control | Bloom index contrib module | 2016-04-01 16:42:24 +03:00
bloom.h | Update copyright for 2026 | 2026-01-01 13:24:10 -05:00
blscan.c | bloom: Optimize bitmap scan path with streaming read | 2026-03-11 07:36:10 +09:00
blutils.c | Update copyright for 2026 | 2026-01-01 13:24:10 -05:00
blvacuum.c | bloom: Optimize VACUUM and bulk-deletion with streaming read | 2026-03-12 12:00:22 +09:00
blvalidate.c | Update copyright for 2026 | 2026-01-01 13:24:10 -05:00
Makefile | Re-enable contrib/bloom's TAP tests. | 2021-09-27 18:48:01 -04:00
meson.build | Update copyright for 2026 | 2026-01-01 13:24:10 -05:00