Base de données relationnelle
Find a file
Tom Lane e6a1d8f5ac Fix estimate_hash_bucket_stats's correction for skewed data.
The previous idea was "scale up the bucketsize estimate by the ratio
of the MCV's frequency to the average value's frequency".  But we
should have been suspicious of that plan, since it frequently led to
impossible (> 1) values which we had to apply an ad-hoc clamp to.
Joel Jacobson demonstrated that it sometimes leads to making the
wrong choice about which side of the hash join should be inner.

Instead, drop the whole business of estimating average frequency, and
just clamp the bucketsize estimate to be at least the MCV's frequency.
This corresponds to the bucket size we'd get if only the MCV appears
in a bucket, and the MCV's frequency is not affected by the
WHERE-clause filters.  (We were already making the latter assumption.)
This also matches the coding used since 4867d7f62 in the case where
only a default ndistinct estimate is available.

Interestingly, this change affects no existing regression test cases.
Add one to demonstrate that it helps pick the smaller table to be
hashed when the MCV is common enough to affect the results.

This leaves estimate_hash_bucket_stats not considering the effects of
null join keys at all, which we should probably improve.  However,
I have a different patch in the queue that will change the executor's
handling of null join keys, so it seems appropriate to wait till
that's in before doing anything more here.

Reported-by: Joel Jacobson <joel@compiler.org>
Author: Tom Lane <tgl@sss.pgh.pa.us>
Reviewed-by: Joel Jacobson <joel@compiler.org>
Discussion: https://postgr.es/m/341b723c-da45-4058-9446-1514dedb17c1@app.fastmail.com
2026-03-04 15:33:15 -05:00
.github Add CODE_OF_CONDUCT.md, CONTRIBUTING.md, and SECURITY.md. 2024-07-02 13:03:58 -05:00
config Support using copyObject in standard C++ 2026-03-02 11:48:13 +01:00
contrib Fix local-variable shadowing in pg_trgm's printSourceNFA(). 2026-03-02 14:40:29 -05:00
doc Allow table exclusions in publications via EXCEPT TABLE. 2026-03-04 15:56:48 +05:30
src Fix estimate_hash_bucket_stats's correction for skewed data. 2026-03-04 15:33:15 -05:00
.cirrus.star ci: Simplify ci-os-only handling 2025-08-14 12:09:34 -04:00
.cirrus.tasks.yml Change default value of default_toast_compression to "lz4", when available 2026-03-04 13:05:31 +09:00
.cirrus.yml ci: Per-repo configuration for manually trigger tasks 2025-08-14 11:54:03 -04:00
.dir-locals.el Make Emacs perl-mode indent more like perltidy. 2019-01-13 11:32:31 -08:00
.editorconfig Update .editorconfig and .gitattributes for postgresql.conf.sample. 2025-11-18 10:28:36 -06:00
.git-blame-ignore-revs Add commit 7b24959434 to .git-blame-ignore-revs. 2026-03-02 13:23:28 -06:00
.gitattributes Update .editorconfig and .gitattributes for postgresql.conf.sample. 2025-11-18 10:28:36 -06:00
.gitignore Update top-level .gitignore. 2022-12-04 15:23:00 -05:00
.mailmap Add a Git .mailmap file 2024-11-05 13:56:02 +01:00
aclocal.m4 autoconf: Move export_dynamic determination to configure 2022-12-06 18:55:28 -08:00
configure Change default value of default_toast_compression to "lz4", when available 2026-03-04 13:05:31 +09:00
configure.ac Change default value of default_toast_compression to "lz4", when available 2026-03-04 13:05:31 +09:00
COPYRIGHT Update copyright for 2026 2026-01-01 13:24:10 -05:00
GNUmakefile.in Allow selecting the git revision to be packaged by "make dist". 2024-05-03 11:08:50 -04:00
HISTORY Canonicalize some URLs 2020-02-10 20:47:50 +01:00
Makefile Restore AIX support. 2026-02-23 13:34:22 -05:00
meson.build Support using copyObject in standard C++ 2026-03-02 11:48:13 +01:00
meson_options.txt Update copyright for 2026 2026-01-01 13:24:10 -05:00
README.md Revise the style of a paragraph in README.md. 2024-03-21 10:16:41 -05:00

PostgreSQL Database Management System

This directory contains the source code distribution of the PostgreSQL database management system.

PostgreSQL is an advanced object-relational database management system that supports an extended subset of the SQL standard, including transactions, foreign keys, subqueries, triggers, user-defined types and functions. This distribution also contains C language bindings.

Copyright and license information can be found in the file COPYRIGHT.

General documentation about this version of PostgreSQL can be found at https://www.postgresql.org/docs/devel/. In particular, information about building PostgreSQL from the source code can be found at https://www.postgresql.org/docs/devel/installation.html.

The latest version of this software, and related software, may be obtained at https://www.postgresql.org/download/. For more information look at our web site located at https://www.postgresql.org/.