Commit graph

63324 commits

Author SHA1 Message Date
Noah Misch
8cef93d8a5 Suppress new "may be used uninitialized" warning.
Various buildfarm members, having compilers like gcc 8.5 and 6.3, fail
to deduce that text_substring() variable "E" is initialized if
slice_size!=-1.  This suppression approach quiets gcc 8.5; I did not
reproduce the warning elsewhere.  Back-patch to v14, like commit
9f4fd119b2.

Reported-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/1157953.1771266105@sss.pgh.pa.us
Backpatch-through: 14
2026-02-16 18:04:58 -08:00
Michael Paquier
a6f823e778 hstore: Fix NULL pointer dereference with receive function
The receive function of hstore was not able to handle correctly
duplicate key values when a new duplicate links to a NULL value, where a
pfree() could be attempted on a NULL pointer, crashing due to a pointer
dereference.

This problem would happen for a COPY BINARY, when stacking values like
that:
aa => 5
aa => null

The second key/value pair is discarded and pfree() calls are attempted
on its key and its value, leading to a pointer dereference for the value
part as the value is NULL.  The first key/value pair takes priority when
a duplicate is found.

Per offline report.

Reported-by: "Anemone" <vergissmeinnichtzh@gmail.com>
Reported-by: "A1ex" <alex000young@gmail.com>
Backpatch-through: 14
2026-02-17 08:41:26 +09:00
Nathan Bossart
b33f753612 pg_upgrade: Use COPY for LO metadata for upgrades from < v12.
Before v12, pg_largeobject_metadata was defined WITH OIDS, so
unlike newer versions, the "oid" column was a hidden system column
that pg_dump's getTableAttrs() will not pick up.  Thus, for commit
161a3e8b68, we did not bother trying to use COPY for
pg_largeobject_metadata for upgrades from older versions.  This
commit removes that restriction by adjusting the query in
getTableAttrs() to pick up the "oid" system column and by teaching
dumpTableData_copy() to use COPY (SELECT ...) for this catalog,
since system columns cannot be used in COPY's column list.

Reviewed-by: Andres Freund <andres@anarazel.de>
Discussion: https://postgr.es/m/aYzuAz_ITUpd9ZvH%40nathan
2026-02-16 15:13:06 -06:00
Tom Lane
6be5b76d66 Ensure that all three build methods install the same set of files.
syscache_info.h was installed into $installdir/include/server/catalog
if you use a non-VPATH autoconf build, but not if you use a VPATH
build or meson.  That happened because the makefiles blindly install
src/include/catalog/*.h, and in a non-VPATH build the generated
header files would be swept up in that.  While it's hard to conjure
a reason to need syscache_info.h outside of backend build, it's
also hard to get the makefiles to skip syscache_info.h, so let's
go the other way and install it in the other two cases too.

Another problem, new in v19, was that meson builds install a copy of
src/include/catalog/README, while autoconf builds do not.  The issue
here is that that file is new and wasn't added to meson.build's
exclusion list.

While it's clearly a bug if different build methods don't install
the same set of files, I doubt anyone would thank us for changing
the behavior in released branches.  Hence, fix in master only.

Author: Tom Lane <tgl@sss.pgh.pa.us>
Reviewed-by: Andres Freund <andres@anarazel.de>
Discussion: https://postgr.es/m/946828.1771185367@sss.pgh.pa.us
2026-02-16 15:20:15 -05:00
Daniel Gustafsson
db93988ab0 doc: Add note to ssl_group config on X25519 and FIPS
The X25519 curve is not allowed when OpenSSL is configured for
FIPS mode, so add a note to the documentation that the default
setting must be altered for such setups.

Author: Daniel Gustafsson <daniel@yesql.se>
Reported-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/3521653.1770666093@sss.pgh.pa.us
2026-02-16 15:11:29 +01:00
Daniel Gustafsson
07e90c6913 Avoid using the X25519 curve in ssl tests
The X25519 curve is disallowed when OpenSSL is configured for
FIPS mode which makes the testsuite fail.  Since X25519 isn't
required for the tests we can remove it to allow FIPS enabled
configurations to run the tests.

Author: Daniel Gustafsson <daniel@yesql.se>
Reported-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/3521653.1770666093@sss.pgh.pa.us
2026-02-16 15:10:16 +01:00
Peter Eisentraut
d50c86e743 Change remaining StaticAssertStmt() to StaticAssertDecl()
This completes the work started by commit 75f49221c2.

In basebackup.c, changing the StaticAssertStmt to StaticAssertDecl
results in having the same StaticAssertDecl() in 2 functions.  So, it
makes more sense to move it to file scope instead.

Also, as it depends on some computations based on 2 tar blocks, define
TAR_NUM_TERMINATION_BLOCKS.

In deadlock.c, change the StaticAssertStmt to StaticAssertDecl and
keep it in the function scope.  Add new braces to avoid warning from
-Wdeclaration-after-statement.

In aset.c, change the StaticAssertStmt to StaticAssertDecl and move it
to file scope.

Finally, update the comments in c.h a bit.

Author: Bertrand Drouvot <bertranddrouvot.pg@gmail.com>
Co-authored-by: Peter Eisentraut <peter@eisentraut.org>
Discussion: https://www.postgresql.org/message-id/aYH6ii46AvGVCB84%40ip-10-97-1-34.eu-west-3.compute.internal
2026-02-16 09:22:43 +01:00
Fujii Masao
351265a6c7 Remove recovery.signal at recovery end when both signal files are present.
When both standby.signal and recovery.signal are present, standby.signal
takes precedence and the server runs in standby mode. Previously,
in this case, recovery.signal was not removed at the end of standby mode
(i.e., on promotion) or at the end of archive recovery, while standby.signal
was removed. As a result, a leftover recovery.signal could cause
a subsequent restart to enter archive recovery unexpectedly, potentially
preventing the server from starting. This behavior was surprising and
confusing to users.

This commit fixes the issue by updating the recovery code to remove
recovery.signal alongside standby.signal when both files are present and
recovery completes.

Because this code path is particularly sensitive and changes in recovery
behavior can be risky for stable branches, this change is applied only to
the master branch.

Reported-by: Nikolay Samokhvalov <nik@postgres.ai>
Author: Fujii Masao <masao.fujii@gmail.com>
Reviewed-by: Michael Paquier <michael@paquier.xyz>
Reviewed-by: David Steele <david@pgbackrest.org>
Discussion: https://postgr.es/m/CAM527d8PVAQFLt_ndTXE19F-XpDZui861882L0rLY3YihQB8qA@mail.gmail.com
2026-02-16 13:57:38 +09:00
Michael Paquier
459576303d pgcrypto: Tweak error message for incorrect session key length
The error message added in 379695d3cc referred to the public key being
too long.  This is confusing as it is in fact the session key included
in a PGP message which is too long.  This is harmless, but let's be
precise about what is wrong.

Per offline report.

Reported-by: Zsolt Parragi <zsolt.parragi@percona.com>
Backpatch-through: 14
2026-02-16 12:18:18 +09:00
Noah Misch
9f4fd119b2 Fix SUBSTRING() for toasted multibyte characters.
Commit 1e7fe06c10 changed
pg_mbstrlen_with_len() to ereport(ERROR) if the input ends in an
incomplete character.  Most callers want that.  text_substring() does
not.  It detoasts the most bytes it could possibly need to get the
requested number of characters.  For example, to extract up to 2 chars
from UTF8, it needs to detoast 8 bytes.  In a string of 3-byte UTF8
chars, 8 bytes spans 2 complete chars and 1 partial char.

Fix this by replacing this pg_mbstrlen_with_len() call with a string
traversal that differs by stopping upon finding as many chars as the
substring could need.  This also makes SUBSTRING() stop raising an
encoding error if the incomplete char is past the end of the substring.
This is consistent with the general philosophy of the above commit,
which was to raise errors on a just-in-time basis.  Before the above
commit, SUBSTRING() never raised an encoding error.

SUBSTRING() has long been detoasting enough for one more char than
needed, because it did not distinguish exclusive and inclusive end
position.  For avoidance of doubt, stop detoasting extra.

Back-patch to v14, like the above commit.  For applications using
SUBSTRING() on non-ASCII column values, consider applying this to your
copy of any of the February 12, 2026 releases.

Reported-by: SATŌ Kentarō <ranvis@gmail.com>
Reviewed-by: Thomas Munro <thomas.munro@gmail.com>
Bug: #19406
Discussion: https://postgr.es/m/19406-9867fddddd724fca@postgresql.org
Backpatch-through: 14
2026-02-14 12:16:16 -08:00
Noah Misch
4644f8b23b pg_mblen_range, pg_mblen_with_len: Valgrind after encoding ereport.
The prior order caused spurious Valgrind errors.  They're spurious
because the ereport(ERROR) non-local exit discards the pointer in
question.  pg_mblen_cstr() ordered the checks correctly, but these other
two did not.  Back-patch to v14, like commit
1e7fe06c10.

Reviewed-by: Thomas Munro <thomas.munro@gmail.com>
Discussion: https://postgr.es/m/20260214053821.fa.noahmisch@microsoft.com
Backpatch-through: 14
2026-02-14 12:16:16 -08:00
John Naylor
ef3c3cf6d0 Perform radix sort on SortTuples with pass-by-value Datums
Radix sort can be much faster than quicksort, but for our purposes it
is limited to sequences of unsigned bytes. To make tuples with other
types amenable to this technique, several features of tuple comparison
must be accounted for, i.e. the sort key must be "normalized":

1. Signedness -- It's possible to modify a signed integer such that
it can be compared as unsigned. For example, a signed char has range
-128 to 127. If we cast that to unsigned char and add 128, the range
of values becomes 0 to 255 while preserving order.

2. Direction -- SQL allows specification of ASC or DESC. The
descending case is easily handled by taking the complement of the
unsigned representation.

3. NULL values -- NULLS FIRST and NULLS LAST must work correctly.

This commmit only handles the case where datum1 is pass-by-value
Datum (possibly abbreviated) that compares like an ordinary
integer. (Abbreviations of values of type "numeric" are a convenient
counterexample.) First, tuples are partitioned by nullness in the
correct NULL ordering. Then the NOT NULL tuples are sorted with radix
sort on datum1. For tiebreaks on subsequent sortkeys (including the
first sort key if abbreviated), we divert to the usual qsort.

ORDER BY queries on pre-warmed buffers are up to 2x faster on high
cardinality inputs with radix sort than the sort specializations added
by commit 697492434, so get rid of them. It's sufficient to fall back
to qsort_tuple() for small arrays. Moderately low cardinality inputs
show more modest improvents. Our qsort is strongly optimized for very
low cardinality inputs, but radix sort is usually equal or very close
in those cases.

The changes to the regression tests are caused by under-specified sort
orders, e.g. "SELECT a, b from mytable order by a;". For unstable
sorts, such as our qsort and this in-place radix sort, there is no
guarantee of the order of "b" within each group of "a".

The implementation is taken from ska_byte_sort() (Boost licensed),
which is similar to American flag sort (an in-place radix sort) with
modifications to make it better suited for modern pipelined CPUs.

The technique of normalization described above can also be extended
to the case of multiple keys. That is left for future work (Thanks
to Peter Geoghegan for the suggestion to look into this area).

Reviewed-by: Chengpeng Yan <chengpeng_yan@outlook.com>
Reviewed-by: zengman <zengman@halodbtech.com>
Reviewed-by: ChangAo Chen <cca5507@qq.com>
Reviewed-by: Álvaro Herrera <alvherre@kurilemu.de>
Reviewed-by: Chao Li <li.evan.chao@gmail.com> (earlier version)
Discussion: https://postgr.es/m/CANWCAZYzx7a7E9AY16Jt_U3+GVKDADfgApZ-42SYNiig8dTnFA@mail.gmail.com
2026-02-14 13:50:06 +07:00
Daniel Gustafsson
aa082bed0b doc: Mention PASSING support for jsonpath variables
Commit dfd79e2d added a TODO comment to update this paragraph
when support for PASSING was added.  Commit 6185c9737c added
PASSING but missed resolving this TODO.  Fix by expanding the
paragraph with a reference to PASSING.

Author: Aditya Gollamudi <adigollamudi@gmail.com>
Reviewed-by: Daniel Gustafsson <daniel@yesql.se>
Discussion: https://postgr.es/m/20260117051406.sx6pss4ryirn2x4v@pgs
2026-02-13 12:12:11 +01:00
Daniel Gustafsson
4469fe1761 doc: Update docs images README with required ditaa version
The URL for Ditaa linked to the old Sourceforge version which is
too old for what we need, the fork over on Github is the correct
version to use for re-generating the SVG files for the docs. The
required Ditaa version is 0.11.0 as it when SVG support as added.
Running the version found on Sourceforge produce the error below:

$ ditaa -E -S --svg in.txt out.txt
Unrecognized option: --svg
usage: ditaa <INPFILE> [OUTFILE] [-A] [-b <BACKGROUND>] [-d] [-E] [-e
      <ENCODING>] [-h] [--help] [-o] [-r] [-S] [-s <SCALE>] [-T] [-t
      <TABS>] [-v] [-W]

While there, also mention that meson rules exists for building
images.

Author: Nazir Bilal Yavuz <byavuz81@gmail.com>
Reviewed-by: Daniel Gustafsson <daniel@yesql.se>
Reviewed-by: Paul A Jungwirth <pj@illuminatedcomputing.com>
Discussion: https://postgr.es/m/CAN55FZ2O-23xERF2NYcvv9DM_1c9T16y6mi3vyP=O1iuXS0ASA@mail.gmail.com
2026-02-13 11:50:17 +01:00
Daniel Gustafsson
4ec0e75afd meson: Add target for generating docs images
This adds an 'images' target to the meson build system in order to
be able to regenerate the images used in the docs.

Author: Nazir Bilal Yavuz <byavuz81@gmail.com>
Reviewed-by: Daniel Gustafsson <daniel@yesql.se>
Reported-by: Daniel Gustafsson <daniel@yesql.se>
Discussion: https://postgr.es/m/CAN55FZ0c0Tcjx9=e-YibWGHa1-xmdV63p=THH4YYznz+pYcfig@mail.gmail.com
2026-02-13 11:50:14 +01:00
Michael Paquier
6736dea14a pg_dump: Use pg_malloc_object() and pg_malloc_array()
The idea is to encourage more the use of these allocation routines
across the tree, as these offer stronger type safety guarantees than
pg_malloc() & co (type cast in the result, sizeof() embedded).  This set
of changes is dedicated to the pg_dump code.

Similar work has been done as of 31d3847a37, as one example.

Author: Peter Smith <smithpb2250@gmail.com>
Reviewed-by: Aleksander Alekseev <aleksander@tigerdata.com>
Discussion: https://postgr.es/m/CAHut+PvpGPDLhkHAoxw_g3jdrYxA1m16a8uagbgH3TGWSKtXNQ@mail.gmail.com
2026-02-13 19:48:35 +09:00
Daniel Gustafsson
53c6bd0aa3 Restart BackgroundPsql's timer more nicely.
Use BackgroundPsql's published API for automatically restarting
its timer for each query, rather than manually reaching into it
to achieve the same thing.

010_tab_completion.pl's logic for this predates the invention
of BackgroundPsql (and 664d75753 missed the opportunity to
make it cleaner).  030_pager.pl copied-and-pasted the code.

Author: Daniel Gustafsson <daniel@yesql.se>
Reviewed-by: Heikki Linnakangas <hlinnaka@iki.fi>
Reviewed-by: Andrew Dunstan <andrew@dunslane.net>
Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/1100715.1712265845@sss.pgh.pa.us
2026-02-13 11:36:31 +01:00
Michael Paquier
775fc01415 Improve error message for checksum failures in pgstat_database.c
This log message was referring to conflicts, but it is about checksum
failures.  The log message improved in this commit should never show up,
due to the fact that pgstat_prepare_report_checksum_failure() should
always be called before pgstat_report_checksum_failures_in_db(), with a
stats entry already created in the pgstats shared hash table.  The three
code paths able to report database-level checksum failures follow
already this requirement.

Oversight in b96d3c3897.

Author: Wang Peng <215722532@qq.com>
Discussion: https://postgr.es/m/tencent_9B6CD6D9D34AE28CDEADEC6188DB3BA1FE07@qq.com
Backpatch-through: 18
2026-02-13 12:17:08 +09:00
Heikki Linnakangas
d7edcec35c Make pg_numa_query_pages() work in frontend programs
It's currently only used in the server, but it was placed in src/port
with the idea that it might be useful in client programs too. However,
it will currently fail to link if used in a client program, because
CHECK_FOR_INTERRUPTS() is not usable in client programs. Fix that by
wrapping it in "#ifndef FRONTEND".

Reviewed-by: Álvaro Herrera <alvherre@kurilemu.de>
Discussion: https://www.postgresql.org/message-id/21cc7a48-99d9-4f69-9a3f-2c2de61ac8e5%40iki.fi
Backpatch-through: 18
2026-02-12 19:41:06 +02:00
Heikki Linnakangas
d7a4291bb7 Fix comment neglected in commit ddc3250208
I renamed the field in commit ddc3250208, but missed this one
reference.
2026-02-12 19:41:02 +02:00
Nathan Bossart
a468898883 Remove specialized word-length popcount implementations.
The uses of these functions do not justify the level of
micro-optimization we've done and may even hurt performance in some
cases (e.g., due to using function pointers).  This commit removes
all architecture-specific implementations of pg_popcount{32,64} and
converts the portable ones to inlined functions in pg_bitutils.h.
These inlined versions should produce the same code as before (but
inlined), so in theory this is a net gain for many machines.  A
follow-up commit will replace the remaining loops over these
word-length popcount functions with calls to pg_popcount(), further
reducing the need for architecture-specific implementations.

Suggested-by: John Naylor <johncnaylorls@gmail.com>
Reviewed-by: John Naylor <johncnaylorls@gmail.com>
Reviewed-by: Greg Burd <greg@burd.me>
Discussion: https://postgr.es/m/CANWCAZY7R%2Biy%2Br9YM_sySNydHzNqUirx1xk0tB3ej5HO62GdgQ%40mail.gmail.com
2026-02-12 11:32:49 -06:00
Nathan Bossart
cb7b2e5e8e Remove some unnecessary optimizations in popcount code.
Over the past few releases, we've added a huge amount of complexity
to our popcount implementations.  Commits fbe327e5b4, 79e232ca01,
8c6653516c, and 25dc485074 did some preliminary refactoring, but
many opportunities remain.  In particular, if we disclaim interest
in micro-optimizing this code for 32-bit builds and in unnecessary
alignment checks on x86-64, we can remove a decent chunk of code.
I cannot find public discussion or benchmarks for the code this
commit removes,  but it seems unlikely that this change will
noticeably impact performance on affected systems.

Suggested-by: John Naylor <johncnaylorls@gmail.com>
Reviewed-by: John Naylor <johncnaylorls@gmail.com>
Discussion: https://postgr.es/m/CANWCAZY7R%2Biy%2Br9YM_sySNydHzNqUirx1xk0tB3ej5HO62GdgQ%40mail.gmail.com
2026-02-12 11:32:49 -06:00
Dean Rasheed
88327092ff Add support for INSERT ... ON CONFLICT DO SELECT.
This adds a new ON CONFLICT action DO SELECT [FOR UPDATE/SHARE], which
returns the pre-existing rows when conflicts are detected. The INSERT
statement must have a RETURNING clause, when DO SELECT is specified.

The optional FOR UPDATE/SHARE clause allows the rows to be locked
before they are are returned. As with a DO UPDATE conflict action, an
optional WHERE clause may be used to prevent rows from being selected
for return (but as with a DO UPDATE action, rows filtered out by the
WHERE clause are still locked).

Bumps catversion as stored rules change.

Author: Andreas Karlsson <andreas@proxel.se>
Author: Marko Tiikkaja <marko@joh.to>
Author: Viktor Holmberg <v@viktorh.net>
Reviewed-by: Joel Jacobson <joel@compiler.org>
Reviewed-by: Kirill Reshke <reshkekirill@gmail.com>
Reviewed-by: Dean Rasheed <dean.a.rasheed@gmail.com>
Reviewed-by: Jian He <jian.universality@gmail.com>
Discussion: https://postgr.es/m/d631b406-13b7-433e-8c0b-c6040c4b4663@Spark
Discussion: https://postgr.es/m/5fca222d-62ae-4a2f-9fcb-0eca56277094@Spark
Discussion: https://postgr.es/m/2b5db2e6-8ece-44d0-9890-f256fdca9f7e@proxel.se
Discussion: https://postgr.es/m/CAL9smLCdV-v3KgOJX3mU19FYK82N7yzqJj2HAwWX70E=P98kgQ@mail.gmail.com
2026-02-12 09:57:04 +00:00
Amit Kapila
788ec96d59 Refactor slot synchronization logic in slotsync.c.
Following e68b6adad9, the reason for skipping slot synchronization is
stored as a slot property. This commit removes redundant function
parameters that previously tracked this state, instead relying directly on
the slot property.

Additionally, this change centralizes the logic for skipping
synchronization when required WAL has not yet been received or flushed. By
consolidating this check, we reduce code duplication and the risk of
inconsistent state updates across different code paths.

In passing, add an assertion to ensure a slot is marked as temporary if a
consistent point has not been reached during synchronization.

Author: Zhijie Hou <houzj.fnst@fujitsu.com>
Reviewed-by: Shveta Malik <shveta.malik@gmail.com>
Reviewed-by: Amit Kapila <amit.kapila16@gmail.com>
Discussion: https://postgr.es/m/TY4PR01MB16907DD16098BE3B20486D4569463A@TY4PR01MB16907.jpnprd01.prod.outlook.com
Discussion: https://postgr.es/m/CAFPTHDZAA+gWDntpa5ucqKKba41=tXmoXqN3q4rpjO9cdxgQrw@mail.gmail.com
2026-02-12 14:38:31 +05:30
Dean Rasheed
706cadde32 Remove p_is_insert from struct ParseState.
The only place that used p_is_insert was transformAssignedExpr(),
which used it to distinguish INSERT from UPDATE when handling
indirection on assignment target columns -- see commit c1ca3a19df.
However, this information is already available to
transformAssignedExpr() via its exprKind parameter, which is always
either EXPR_KIND_INSERT_TARGET or EXPR_KIND_UPDATE_TARGET.

As noted in the commit message for c1ca3a19df, this use of
p_is_insert isn't particularly pretty, so have transformAssignedExpr()
use the exprKind parameter instead. This then allows p_is_insert to be
removed entirely, which simplifies state management in a few other
places across the parser.

Author: Viktor Holmberg <v@viktorh.net>
Reviewed-by: Dean Rasheed <dean.a.rasheed@gmail.com>
Discussion: https://postgr.es/m/badc3b4c-da73-4000-b8d3-638a6f53a769@Spark
2026-02-12 09:01:42 +00:00
Richard Guo
cf74558feb Reduce LEFT JOIN to ANTI JOIN using NOT NULL constraints
For a LEFT JOIN, if any var from the right-hand side (RHS) is forced
to null by upper-level quals but is known to be non-null for any
matching row, the only way the upper quals can be satisfied is if the
join fails to match, producing a null-extended row.  Thus, we can
treat this left join as an anti-join.

Previously, this transformation was limited to cases where the join's
own quals were strict for the var forced to null by upper qual levels.
This patch extends the logic to check table constraints, leveraging
the NOT NULL attribute information already available thanks to the
infrastructure introduced by e2debb643.  If a forced-null var belongs
to the RHS and is defined as NOT NULL in the schema (and is not
nullable due to lower-level outer joins), we know that the left join
can be reduced to an anti-join.

Note that to ensure the var is not nullable by any lower-level outer
joins within the current subtree, we collect the relids of base rels
that are nullable within each subtree during the first pass of the
reduce-outer-joins process.  This allows us to verify in the second
pass that a NOT NULL var is indeed safe to treat as non-nullable.

Based on a proposal by Nicolas Adenis-Lamarre, but this is not the
original patch.

Suggested-by: Nicolas Adenis-Lamarre <nicolas.adenis.lamarre@gmail.com>
Author: Tender Wang <tndrwang@gmail.com>
Co-authored-by: Richard Guo <guofenglinux@gmail.com>
Discussion: https://postgr.es/m/CACPGbctKMDP50PpRH09in+oWbHtZdahWSroRstLPOoSDKwoFsw@mail.gmail.com
2026-02-12 15:30:13 +09:00
Tom Lane
9863c90759 Fix plpgsql's handling of "return simple_record_variable".
If the variable's value is null, exec_stmt_return() missed filling
in estate->rettype.  This is a pretty old bug, but we'd managed not
to notice because that value isn't consulted for a null result ...
unless we have to cast it to a domain.  That case led to a failure
with "cache lookup failed for type 0".

The correct way to assign the data type is known by exec_eval_datum.
While we could copy-and-paste that logic, it seems like a better
idea to just invoke exec_eval_datum, as the ROW case already does.

Reported-by: Pavel Stehule <pavel.stehule@gmail.com>
Author: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/CAFj8pRBT_ahexDf-zT-cyH8bMR_qcySKM8D5nv5MvTWPiatYGA@mail.gmail.com
Backpatch-through: 14
2026-02-11 16:53:14 -05:00
Heikki Linnakangas
78a5e3074b Fix pg_stat_get_backend_wait_event() for aux processes
The pg_stat_activity view shows information for aux processes, but the
pg_stat_get_backend_wait_event() and
pg_stat_get_backend_wait_event_type() functions did not. To fix, call
AuxiliaryPidGetProc(pid) if BackendPidGetProc(pid) returns NULL, like
we do in pg_stat_get_activity().

In version 17 and above, it's a little silly to use those functions
when we already have the ProcNumber at hand, but it was necessary
before v17 because the backend ID was different from ProcNumber. I
have other plans for wait_event_info on master, so it doesn't seem
worth applying a different fix on different versions now.

Reviewed-by: Sami Imseih <samimseih@gmail.com>
Reviewed-by: Chao Li <li.evan.chao@gmail.com>
Reviewed-by: Kyotaro Horiguchi <horikyota.ntt@gmail.com>
Discussion: https://www.postgresql.org/message-id/c0320e04-6e85-4c49-80c5-27cfb3a58108@iki.fi
Backpatch-through: 14
2026-02-11 18:50:57 +02:00
Nathan Bossart
1d92e0c2cc Add password expiration warnings.
This commit adds a new parameter called
password_expiration_warning_threshold that controls when the server
begins emitting imminent-password-expiration warnings upon
successful password authentication.  By default, this parameter is
set to 7 days, but this functionality can be disabled by setting it
to 0.  This patch also introduces a new "connection warning"
infrastructure that can be reused elsewhere.  For example, we may
want to warn about the use of MD5 passwords for a couple of
releases before removing MD5 password support.

Author: Gilles Darold <gilles@darold.net>
Co-authored-by: Nathan Bossart <nathandbossart@gmail.com>
Reviewed-by: Japin Li <japinli@hotmail.com>
Reviewed-by: songjinzhou <tsinghualucky912@foxmail.com>
Reviewed-by: liu xiaohui <liuxh.zj.cn@gmail.com>
Reviewed-by: Yuefei Shi <shiyuefei1004@gmail.com>
Reviewed-by: Steven Niu <niushiji@gmail.com>
Reviewed-by: Soumya S Murali <soumyamurali.work@gmail.com>
Reviewed-by: Euler Taveira <euler@eulerto.com>
Reviewed-by: Zsolt Parragi <zsolt.parragi@percona.com>
Reviewed-by: Chao Li <li.evan.chao@gmail.com>
Reviewed-by: Greg Sabino Mullane <htamfids@gmail.com>
Reviewed-by: Peter Eisentraut <peter@eisentraut.org>
Discussion: https://postgr.es/m/129bcfbf-47a6-e58a-190a-62fc21a17d03%40migops.com
2026-02-11 10:36:15 -06:00
Tom Lane
a3fd53babb Further stabilize a postgres_fdw test case.
The buildfarm occasionally shows a variant row order in the output
of this UPDATE ... RETURNING, implying that the preceding INSERT
dropped one of the rows into some free space within the table rather
than appending them all at the end.  It's not entirely clear why that
happens some times and not other times, but we have established that
it's affected by concurrent activity in other databases of the
cluster.  In any case, the behavior is not wrong; the test is at fault
for presuming that a seqscan will give deterministic row ordering.
Add an ORDER BY atop the update to stop the buildfarm noise.

The buildfarm seems to have shown this only in v18 and master
branches, but just in case the cause is older, back-patch to
all supported branches.

Discussion: https://postgr.es/m/3866274.1770743162@sss.pgh.pa.us
Backpatch-through: 14
2026-02-11 11:03:17 -05:00
Álvaro Herrera
1efdd7cc63
Cleanup for log_min_messages changes in 38e0190ced
* Remove an unused variable
* Use "default log level" consistently (instead of "generic")
* Keep the process types in alphabetical order (missed one place in the
  SGML docs)
* Since log_min_messages type was changed from enum to string, it
  is a good idea to add single quotes when printing it out.  Otherwise
  it fails if the user copies and pastes from the SHOW output to SET,
  except in the simplest case.  Using single quotes reduces confusion.
* Use lowercase string for the burned-in default value, to keep the same
  output as previous versions.

Author: Euler Taveira <euler@eulerto.com>
Author: Man Zeng <zengman@halodbtech.com>
Author: Noriyoshi Shinoda <noriyoshi.shinoda@hpe.com>
Reviewed-by: Chao Li <li.evan.chao@gmail.com>
Discussion: https://postgr.es/m/202602091250.genyflm2d5dw@alvherre.pgsql
2026-02-11 16:38:18 +01:00
Heikki Linnakangas
7984ce7a1d Move ProcStructLock to the ProcGlobal struct
It protects the freeProcs and some other fields in ProcGlobal, so
let's move it there. It's good for cache locality to have it next to
the thing it protects, and just makes more sense anyway. I believe it
was allocated as a separate shared memory area just for historical
reasons.

Reviewed-by: Chao Li <li.evan.chao@gmail.com>
Reviewed-by: Ashutosh Bapat <ashutosh.bapat.oss@gmail.com>
Discussion: https://www.postgresql.org/message-id/b78719db-0c54-409f-b185-b0d59261143f@iki.fi
2026-02-11 16:48:45 +02:00
Dean Rasheed
bc953bf523 doc: Mention all SELECT privileges required by INSERT ... ON CONFLICT.
On the INSERT page, mention that SELECT privileges are also required
for any columns mentioned in the arbiter clause, including those
referred to by the constraint, and clarify that this applies to all
forms of ON CONFLICT, not just ON CONFLICT DO UPDATE.

Author: Dean Rasheed <dean.a.rasheed@gmail.com>
Reviewed-by: Viktor Holmberg <v@viktorh.net>
Discussion: https://postgr.es/m/CAEZATCXGwMQ+x00YY9XYG46T0kCajH=21QaYL9Xatz0dLKii+g@mail.gmail.com
Backpatch-through: 14
2026-02-11 10:52:58 +00:00
Dean Rasheed
227a6ea657 doc: Clarify RLS policies applied for ON CONFLICT DO NOTHING.
On the CREATE POLICY page, the description of per-command policies
stated that SELECT policies are applied when an INSERT has an ON
CONFLICT DO NOTHING clause. However, that is only the case if it
includes an arbiter clause, so clarify that.

While at it, also clarify the comment in the regression tests that
cover this.

Author: Dean Rasheed <dean.a.rasheed@gmail.com>
Reviewed-by: Viktor Holmberg <v@viktorh.net>
Discussion: https://postgr.es/m/CAEZATCXGwMQ+x00YY9XYG46T0kCajH=21QaYL9Xatz0dLKii+g@mail.gmail.com
Backpatch-through: 14
2026-02-11 10:25:05 +00:00
Heikki Linnakangas
ab32a9e21d Remove useless store to local variable
It was a leftover from commit 5764f611e1, which converted the loop to
use dclist_foreach.

Reviewed-by: Bertrand Drouvot <bertranddrouvot.pg@gmail.com>
Discussion: https://www.postgresql.org/message-id/3dd6f70c-b94d-4428-8e75-74a7136396be@iki.fi
2026-02-11 11:49:18 +02:00
Robert Haas
7358abcc60 Store information about Append node consolidation in the final plan.
An extension (or core code) might want to reconstruct the planner's
decisions about whether and where to perform partitionwise joins from
the final plan. To do so, it must be possible to find all of the RTIs
of partitioned tables appearing in the plan. But when an AppendPath
or MergeAppendPath pulls up child paths from a subordinate AppendPath
or MergeAppendPath, the RTIs of the subordinate path do not appear
in the final plan, making this kind of reconstruction impossible.

To avoid this, propagate the RTI sets that would have been present
in the 'apprelids' field of the subordinate Append or MergeAppend
nodes that would have been created into the surviving Append or
MergeAppend node, using a new 'child_append_relid_sets' field for
that purpose. The value of this field is a list of Bitmapsets,
because each relation whose append-list was pulled up had its own
set of RTIs: just one, if it was a partitionwise scan, or more than
one, if it was a partitionwise join. Since our goal is to see where
partitionwise joins were done, it is essential to avoid losing the
information about how the RTIs were grouped in the pulled-up
relations.

This commit also updates pg_overexplain so that EXPLAIN (RANGE_TABLE)
will display the saved RTI sets.

Co-authored-by: Robert Haas <rhaas@postgresql.org>
Co-authored-by: Lukas Fittl <lukas@fittl.com>
Reviewed-by: Lukas Fittl <lukas@fittl.com>
Reviewed-by: Jakub Wartak <jakub.wartak@enterprisedb.com>
Reviewed-by: Greg Burd <greg@burd.me>
Reviewed-by: Jacob Champion <jacob.champion@enterprisedb.com>
Reviewed-by: Amit Langote <amitlangote09@gmail.com>
Reviewed-by: Haibo Yan <tristan.yim@gmail.com>
Reviewed-by: Alexandra Wang <alexandra.wang.oss@gmail.com>
Discussion: http://postgr.es/m/CA+TgmoZ-Jh1T6QyWoCODMVQdhTUPYkaZjWztzP1En4=ZHoKPzw@mail.gmail.com
2026-02-10 17:55:59 -05:00
Michael Paquier
9181c870ba Improve type handling of varlena structures
This commit changes the definition of varlena to a typedef, so as it
becomes possible to remove "struct" markers from various declarations in
the code base.  Historically, "struct" markers are not the project style
for variable declarations, so this update simplifies the code and makes
it more consistent across the board.

This change has an impact on the following structures, simplifying
declarations using them:
- varlena
- varatt_indirect
- varatt_external

This cleanup has come up in a different path set that played with
TOAST and varatt.h, independently worth doing on its own.

Reviewed-by: Álvaro Herrera <alvherre@kurilemu.de>
Reviewed-by: Andreas Karlsson <andreas@proxel.se>
Reviewed-by: Shinya Kato <shinya11.kato@gmail.com>
Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Reviewed-by: Chao Li <li.evan.chao@gmail.com>
Discussion: https://postgr.es/m/aW8xvVbovdhyI4yo@paquier.xyz
2026-02-11 07:33:24 +09:00
Robert Haas
0d4391b265 Store information about elided nodes in the final plan.
An extension (or core code) might want to reconstruct the planner's
choice of join order from the final plan. To do so, it must be possible
to find all of the RTIs that were part of the join problem in that plan.
Commit adbad833f3, together with the
earlier work in 8c49a484e8, is enough to
let us match up RTIs we see in the final plan with RTIs that we see
during the planning cycle, but we still have a problem if the planner
decides to drop some RTIs out of the final plan altogether.

To fix that, when setrefs.c removes a SubqueryScan, single-child Append,
or single-child MergeAppend from the final Plan tree, record the type of
the removed node and the RTIs that the removed node would have scanned
in the final plan tree. It would be natural to record this information
on the child of the removed plan node, but that would require adding an
additional pointer field to type Plan, which seems undesirable.  So,
instead, store the information in a separate list that the executor need
never consult, and use the plan_node_id to identify the plan node with
which the removed node is logically associated.

Also, update pg_overexplain to display these details.

Reviewed-by: Lukas Fittl <lukas@fittl.com>
Reviewed-by: Jakub Wartak <jakub.wartak@enterprisedb.com>
Reviewed-by: Greg Burd <greg@burd.me>
Reviewed-by: Jacob Champion <jacob.champion@enterprisedb.com>
Reviewed-by: Amit Langote <amitlangote09@gmail.com>
Reviewed-by: Haibo Yan <tristan.yim@gmail.com>
Reviewed-by: Alexandra Wang <alexandra.wang.oss@gmail.com>
Discussion: http://postgr.es/m/CA+TgmoZ-Jh1T6QyWoCODMVQdhTUPYkaZjWztzP1En4=ZHoKPzw@mail.gmail.com
2026-02-10 16:46:05 -05:00
Robert Haas
adbad833f3 Store information about range-table flattening in the final plan.
Suppose that we're currently planning a query and, when that same
query was previously planned and executed, we learned something about
how a certain table within that query should be planned. We want to
take note when that same table is being planned during the current
planning cycle, but this is difficult to do, because the RTI of the
table from the previous plan won't necessarily be equal to the RTI
that we see during the current planning cycle. This is because each
subquery has a separate range table during planning, but these are
flattened into one range table when constructing the final plan,
changing RTIs.

Commit 8c49a484e8 allows us to match up
subqueries seen in the previous planning cycles with the subqueries
currently being planned just by comparing textual names, but that's
not quite enough to let us deduce anything about individual tables,
because we don't know where each subquery's range table appears in
the final, flattened range table.

To fix that, store a list of SubPlanRTInfo objects in the final
planned statement, each including the name of the subplan, the offset
at which it begins in the flattened range table, and whether or not
it was a dummy subplan -- if it was, some RTIs may have been dropped
from the final range table, but also there's no need to control how
a dummy subquery gets planned. The toplevel subquery has no name and
always begins at rtoffset 0, so we make no entry for it.

This commit teaches pg_overexplain's RANGE_TABLE option to make use
of this new data to display the subquery name for each range table
entry.

Reviewed-by: Lukas Fittl <lukas@fittl.com>
Reviewed-by: Jakub Wartak <jakub.wartak@enterprisedb.com>
Reviewed-by: Greg Burd <greg@burd.me>
Reviewed-by: Jacob Champion <jacob.champion@enterprisedb.com>
Reviewed-by: Amit Langote <amitlangote09@gmail.com>
Reviewed-by: Haibo Yan <tristan.yim@gmail.com>
Reviewed-by: Alexandra Wang <alexandra.wang.oss@gmail.com>
Discussion: http://postgr.es/m/CA+TgmoZ-Jh1T6QyWoCODMVQdhTUPYkaZjWztzP1En4=ZHoKPzw@mail.gmail.com
2026-02-10 15:33:39 -05:00
Robert Haas
0f4c8d33d4 Pass cursorOptions to planner_setup_hook.
Commit 94f3ad3961 failed to do this
because I couldn't think of a use for the information, but this has
proven to be short-sighted. Best to fix it before this code is
officially released.

Now, the only argument to standard_planenr that isn't passed to
planner_setup_hook is boundParams, but that is accessible via
glob->boundParams, and so doesn't need to be passed separately.

Discussion: https://www.postgresql.org/message-id/CA+TgmoYS4ZCVAF2jTce=bMP0Oq_db_srocR4cZyO0OBp9oUoGg@mail.gmail.com
2026-02-10 11:50:28 -05:00
Robert Haas
cbdf93d471 Fix PGS_CONSIDER_NONPARTIAL interaction with Materialize nodes.
Commit 4020b370f2 had the idea that it
would be a good idea to handle testing PGS_CONSIDER_NONPARTIAL within
cost_material to save callers the trouble, but that turns out not to be
a very good idea. One concern is that it makes cost_material() dependent
on the caller having initialized certain fields in the MaterialPath,
which is a bit awkward for materialize_finished_plan, which wants to use
a dummy path.

Another problem is that it can result in generated materialized nested
loops where the Materialize node is disabled, contrary to the intention
of joinpath.c's logic in match_unsorted_outer() and
consider_parallel_nestloop(), which aims to consider such paths only
when they would not need to be disabled. In the previous coding, it was
possible for the pgs_mask on the joinrel to have PGS_CONSIDER_NONPARTIAL
set, while the inner rel had the same bit clear. In that case, we'd
generate and then disable a Materialize path.

That seems wrong, so instead, pull up the logic to test the
PGS_CONSIDER_NONPARTIAL bit into joinpath.c, restoring the historical
behavior that either we don't generate a given materialized nested loop
in the first place, or we don't disable it.

Discussion: http://postgr.es/m/CA+TgmoawzvCoZAwFS85tE5+c8vBkqgcS8ZstQ_ohjXQ9wGT9sw@mail.gmail.com
Discussion: http://postgr.es/m/CA+TgmoYS4ZCVAF2jTce=bMP0Oq_db_srocR4cZyO0OBp9oUoGg@mail.gmail.com
2026-02-10 11:49:07 -05:00
Heikki Linnakangas
be5257725d Refactor ProcessRecoveryConflictInterrupt for readability
Two changes here:

1. Introduce a separate RECOVERY_CONFLICT_BUFFERPIN_DEADLOCK flag to
indicate a suspected deadlock that involves a buffer pin. Previously
the startup process used the same flag for a deadlock involving just
regular locks, and to check for deadlocks involving the buffer
pin. The cases are handled separately in the startup process, but the
receiving backend had to deduce which one it was based on
HoldingBufferPinThatDelaysRecovery(). With a separate flag, the
receiver doesn't need to guess.

2. Rewrite the ProcessRecoveryConflictInterrupt() function to not rely
on fallthrough through the switch-statement. That was difficult to
read.

Reviewed-by: Chao Li <li.evan.chao@gmail.com>
Discussion: https://www.postgresql.org/message-id/4cc13ba1-4248-4884-b6ba-4805349e7f39@iki.fi
2026-02-10 16:23:10 +02:00
Heikki Linnakangas
17f51ea818 Separate RecoveryConflictReasons from procsignals
Share the same PROCSIG_RECOVERY_CONFLICT flag for all recovery
conflict reasons. To distinguish, have a bitmask in PGPROC to indicate
the reason(s).

Reviewed-by: Chao Li <li.evan.chao@gmail.com>
Discussion: https://www.postgresql.org/message-id/4cc13ba1-4248-4884-b6ba-4805349e7f39@iki.fi
2026-02-10 16:23:08 +02:00
Heikki Linnakangas
ddc3250208 Use ProcNumber rather than pid in ReplicationSlot
This helps the next commit.

Reviewed-by: Chao Li <li.evan.chao@gmail.com>
Discussion: https://www.postgresql.org/message-id/4cc13ba1-4248-4884-b6ba-4805349e7f39@iki.fi
2026-02-10 16:23:05 +02:00
Michael Paquier
f33c585774 Simplify some log messages in extended_stats_funcs.c
The log messages used in this file applied too much quoting logic:
- No need for quote_identifier(), which is fine to not use in the
context of a log entry.
- The usual project style is to group the namespace and object together
in a quoted string, when mentioned in an log message.  This code quoted
the namespace name and the extended statistics object name separately,
which was confusing.

Reported-by: Kyotaro Horiguchi <horikyota.ntt@gmail.com>
Discussion: https://postgr.es/m/20260210.143752.1113524465620875233.horikyota.ntt@gmail.com
2026-02-10 16:59:19 +09:00
Michael Paquier
307447e6db Add information about range type stats to pg_stats_ext_exprs
This commit adds three attributes to the system view pg_stats_ext_exprs,
whose data can exist when involving a range type in an expression:
range_length_histogram
range_empty_frac
range_bounds_histogram

These statistics fields exist since 918eee0c49, and have become
viewable in pg_stats later in bc3c8db8ae.  This puts the definition of
pg_stats_ext_exprs on par with pg_stats.

This issue has showed up during the discussion about the restore of
extended statistics for expressions, so as it becomes possible to query
the stats data to restore from the catalogs.  Having access to this data
is useful on its own, without the restore part.

Some documentation and some tests are added, written by me.  Corey has
authored the part in system_views.sql.

Bump catalog version.

Author: Corey Huinker <corey.huinker@gmail.com>
Co-authored-by: Michael Paquier <michael@paquier.xyz>
Discussion: https://postgr.es/m/aYmCUx9VvrKiZQLL@paquier.xyz
2026-02-10 12:36:57 +09:00
Richard Guo
f41ab51573 Teach planner to transform "x IS [NOT] DISTINCT FROM NULL" to a NullTest
In the spirit of 8d19d0e13, this patch teaches the planner about the
principle that NullTest with !argisrow is fully equivalent to SQL's IS
[NOT] DISTINCT FROM NULL.

The parser already performs this transformation for literal NULLs.
However, a DistinctExpr expression with one input evaluating to NULL
during planning (e.g., via const-folding of "1 + NULL" or parameter
substitution in custom plans) currently remains as a DistinctExpr
node.

This patch closes the gap for const-folded NULLs.  It specifically
targets the case where one input is a constant NULL and the other is a
nullable non-constant expression.  (If the other input were otherwise,
the DistinctExpr node would have already been simplified to a constant
TRUE or FALSE.)

This transformation can be beneficial because NullTest is much more
amenable to optimization than DistinctExpr, since the planner knows a
good deal about the former and next to nothing about the latter.

Author: Richard Guo <guofenglinux@gmail.com>
Reviewed-by: Tender Wang <tndrwang@gmail.com>
Discussion: https://postgr.es/m/CAMbWs49BMAOWvkdSHxpUDnniqJcEcGq3_8dd_5wTR4xrQY8urA@mail.gmail.com
2026-02-10 10:19:25 +09:00
Richard Guo
0aaf0de7fe Optimize BooleanTest with non-nullable input
The BooleanTest construct (IS [NOT] TRUE/FALSE/UNKNOWN) treats a NULL
input as the logical value "unknown".  However, when the input is
proven to be non-nullable, this special handling becomes redundant.
In such cases, the construct can be simplified directly to a boolean
expression or a constant.

Author: Richard Guo <guofenglinux@gmail.com>
Reviewed-by: Tender Wang <tndrwang@gmail.com>
Discussion: https://postgr.es/m/CAMbWs49BMAOWvkdSHxpUDnniqJcEcGq3_8dd_5wTR4xrQY8urA@mail.gmail.com
2026-02-10 10:18:47 +09:00
Richard Guo
0a37961254 Optimize IS DISTINCT FROM with non-nullable inputs
The IS DISTINCT FROM construct compares values acting as though NULL
were a normal data value, rather than "unknown".  Semantically, "x IS
DISTINCT FROM y" yields true if the values differ or if exactly one is
NULL, and false if they are equal or both NULL.  Unlike ordinary
comparison operators, it never returns NULL.

Previously, the planner only simplified this construct if all inputs
were constants, folding it to a constant boolean result.  This patch
extends the optimization to cases where inputs are non-constant but
proven to be non-nullable.  Specifically, "x IS DISTINCT FROM NULL"
folds to constant TRUE if "x" is known to be non-nullable.  For cases
where both inputs are guaranteed not to be NULL, the expression
becomes semantically equivalent to "x <> y", and the DistinctExpr is
converted into an inequality OpExpr.

This transformation provides several benefits.  It converts the
comparison into a standard operator, allowing the use of partial
indexes and constraint exclusion.  Furthermore, if the clause is
negated (i.e., "IS NOT DISTINCT FROM"), it simplifies to an equality
operator.  This enables the planner to generate better plans using
index scans, merge joins, hash joins, and EC-based qual deduction.

Author: Richard Guo <guofenglinux@gmail.com>
Reviewed-by: Tender Wang <tndrwang@gmail.com>
Discussion: https://postgr.es/m/CAMbWs49BMAOWvkdSHxpUDnniqJcEcGq3_8dd_5wTR4xrQY8urA@mail.gmail.com
2026-02-10 10:17:45 +09:00
Nathan Bossart
158408fef8 pg_upgrade: Fix handling of pg_largeobject_metadata.
For binary upgrades from v16 or newer, pg_upgrade transfers the
files for pg_largeobject_metadata from the old cluster, as opposed
to using COPY or ordinary SQL commands to reconstruct its contents.
While this approach adds complexity, it can greatly reduce
pg_upgrade's runtime when there are many large objects.

Large objects with comments or security labels are one source of
complexity for this approach.  During pg_upgrade, schema
restoration happens before files are transferred.  Comments and
security labels are transferred in the former step, but the COMMENT
and SECURITY LABEL commands will fail if their corresponding large
objects do not exist.  To deal with this, pg_upgrade first copies
only the rows of pg_largeobject_metadata that are needed to avoid
failures.  Later, pg_upgrade overwrites those rows by replacing
pg_largeobject_metadata's files with its files in the old cluster.

Unfortunately, there's a subtle problem here.  Simply put, there's
no guarantee that pg_upgrade will overwrite all of
pg_largeobject_metadata's files on the new cluster.  For example,
the new cluster's version might more aggressively extend relations
or create visibility maps, and pg_upgrade's file transfer code is
not sophisticated enough to remove files that lack counterparts in
the old cluster.  These extra files could cause problems
post-upgrade.

More fortunately, we can simultaneously fix the aforementioned
problem and further optimize binary upgrades for clusters with many
large objects.  If we teach the COMMENT and SECURITY LABEL commands
to allow nonexistent large objects during binary upgrades,
pg_upgrade no longer needs to transfer pg_largeobject_metadata's
contents beforehand.  This approach also allows us to remove the
associated dependency tracking from pg_dump, even for upgrades from
v12-v15 that use COPY to transfer pg_largeobject_metadata's
contents.

In addition to what is described in the previous paragraph, this
commit modifies the query in getLOs() to only retrieve LOs with
comments or security labels for upgrades from v12 or newer.  We
have long assumed that such usage is rare, so this should reduce
pg_upgrade's memory usage and runtime in many cases.  We might also
be able to remove the "upgrades from v12 or newer" restriction on
the recent batch of optimizations by adding special handling for
pg_largeobject_metadata's hidden OID column on older versions
(since this catalog previously used the now-removed WITH OIDS
feature), but that is left as a future exercise.

Reported-by: Andres Freund <andres@anarazel.de>
Reviewed-by: Andres Freund <andres@anarazel.de>
Discussion: https://postgr.es/m/3yd2ss6n7xywo6pmhd7jjh3bqwgvx35bflzgv3ag4cnzfkik7m%40hiyadppqxx6w
2026-02-09 14:58:02 -06:00