Commit graph

62430 commits

Author SHA1 Message Date
Michael Paquier
477efef089 pg_restore: Use dependency-based matching for STATISTICS DATA
The previous approach introduced by 0dd93de69e was weak in terms of
name matching, as an --index=foo could match with a table with the same
name but from a different schema, pulling in more data than necessary.

For example, imagine the following case:
CREATE SCHEMA s1;
CREATE SCHEMA s2;
CREATE TABLE s1.foo (id int);
INSERT INTO s1.foo SELECT generate_series(1,100);
ANALYZE s1.foo;
CREATE TABLE s2.bar (id int);
CREATE INDEX foo ON s2.bar(id);
INSERT INTO s2.bar SELECT generate_series(1,100);
ANALYZE s2.bar;

A targeted pg_restore --index=foo would grab the relation and attribute
stats of s1.foo on top of the index s2.foo, which is incorrect.  This
commit fixes this scenario by relying on a lookup of the dependencies of
a STATISTICS DATA TOC entry, checking if a TOC entry depends on an index
or another relkind before matching with the names of the objects wanted
for the restore.

Discussion: https://postgr.es/m/ajDBwpxs-otl585H@paquier.xyz
Backpatch-through: 18
2026-06-16 15:58:17 +09:00
Heikki Linnakangas
c3e36a9a5f Fix int32 overflow in ltree_compare()
The expression (len_diff * 10 * (an + 1)) used as the return value of
ltree_compare() is computed at int32 width.  With LTREE_MAX_LEVELS =
65535, the product can exceed INT32_MAX once an ltree has more than
~14,653 levels, which causes the result to wrap and invert its sign.
That corrupts btree ordering as well as the "magnitude" consumed by
ltree_penalty() for GiST page splits.

To fix, split ltree_compare() into two functions.  The new
ltree_compare_distance() function returns a float, which won't
overflow.  It's used by the ltree_penalty() caller.  All the other
callers only care about the sign of the return value, i.e. which of
the arguments is greater, so change ltree_compare() to not multiply
the result with (10 * (an + 1)), which avoids the overflow for those
callers.
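
As a rough illustration of the arithmetic above (a sketch, not the code from
the patch), the following checks the old scaled expression at 64-bit width
and shows the two alternatives described: a sign-only result and a
floating-point magnitude.  The function names are hypothetical, and pairing
a difference of N with a remaining level count of N is an assumption made
only to reproduce the ~14,653 threshold.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * True if diff * 10 * (an + 1) does not fit in int32.  The product is
 * evaluated at 64-bit width, which is ample for level counts up to 65535.
 */
static bool
scaled_result_overflows_int32(int32_t diff, int32_t an)
{
    int64_t wide = (int64_t) diff * 10 * ((int64_t) an + 1);

    return wide > INT32_MAX || wide < INT32_MIN;
}

/* Sign-only result (-1, 0 or 1): no multiplication, so nothing to overflow. */
static int
compare_sign(int32_t diff)
{
    return (diff > 0) - (diff < 0);
}

/* Magnitude computed in floating point (hypothetical stand-in). */
static float
compare_distance(int32_t diff, int32_t an)
{
    return (float) diff * 10.0f * ((float) an + 1.0f);
}
```

Under that assumption, 14653 * 10 * 14654 = 2,147,250,620 still fits in
int32, while 14654 * 10 * 14655 = 2,147,543,700 does not.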

Existing btree or GiST indexes on ltree columns containing values with
more than ~14,653 levels may be corrupt and should be REINDEXed.

Add a regression test based on the reporter's PoC.

Author: Ayush Tiwari <ayushtiwari.slg01@gmail.com>
Reported-by: 王跃林 <violin0613@tju.edu.cn>
Discussion: https://www.postgresql.org/message-id/AI6AnABgKW93Qbx1jVzi84r9.8.1781322625756.Hmail.3020001251%40tju.edu.cn
Backpatch-through: 14
2026-06-16 09:31:08 +03:00
Michael Paquier
54ffa74c99 pg_dump: Remove dead code in TAP tests
The schema_only_with_statistics test scenario was referenced in
002_pg_dump.pl, but was associated to no command sequence since
0ed92cf50c.

Issue discovered while investigating a different bug.  Perhaps this
cleanup is not worth backpatching, but there is also an argument in
favor of reducing noise when touching this area of the code in stable
branches.

Reviewed-by: Ewan Young <kdbase.hack@gmail.com>
Reviewed-by: Ayush Tiwari <ayushtiwari.slg01@gmail.com>
Discussion: https://postgr.es/m/ai-y0S7Z25NlrG_n@paquier.xyz
Backpatch-through: 18
2026-06-16 08:31:43 +09:00
Michael Paquier
42ffdedcf7 Fix inconsistencies with pg_restore --statistics[-only]
Attempting to restore a schema, a table or an index with
--statistics-only skipped all the statistics of the objects wanted.
Like for pg_dump, statistics should be included, so this created an
asymmetry between dump and restore.

A second set of problems existed for --table and --index, where the
presence of --statistics skipped the restore of the stats of the
object(s) targeted.

This issue has been reported originally as related to an inconsistency
with the way extended stats restore is handled in Postgres v19, but the
issue is related to the restore of relation and attribute statistics in
v18.  Some TAP tests are added to cover all these cases.

Reported-by: Chao Li <li.evan.chao@gmail.com>
Author: Chao Li <li.evan.chao@gmail.com>
Author: Michael Paquier <michael@paquier.xyz>
Reviewed-by: Corey Huinker <corey.huinker@gmail.com>
Discussion: https://postgr.es/m/66E80CAB-527C-42B1-BB65-3F82CF4AD998@gmail.com
Backpatch-through: 18
2026-06-16 08:22:41 +09:00
Tom Lane
abb5825550 Clean up quoting of variable strings within replication commands.
Our handling of quoting within replication commands was pretty
sloppy, typically looking like
        appendStringInfo(&cmd, " SLOT \"%s\"", options->slotname);
This is fine as long as options->slotname doesn't contain a double
quote mark, but what if it does?  In principle this'd allow injection
of harmful options into replication commands, in the probably-unlikely
case that a slot name comes from untrustworthy input.  We ought to
clean that up.

Moreover, even the places that were trying to be more careful
generally got it wrong, because they used quoting subroutines
intended for SQL commands rather than something that will work
with the replication-command scanner repl_scanner.l.  For example,
several places naively use PQescapeLiteral() to quote option values
for replication commands.  If the string contains a backslash,
PQescapeLiteral() will produce E'...' literal syntax, which
repl_scanner.l doesn't recognize.  Another near miss was to use
quote_identifier() to quote identifiers.  That function won't quote
valid lowercase identifiers unless they match SQL keywords ... but in
this context, replication keywords are what matter.  Neither of these
errors seems to risk string injection, but they definitely can cause
syntax errors in replication commands that ought to be valid.

We can clean all this up by using simple quoting logic that just
doubles single or double quotes respectively.
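
A minimal sketch of the "double the quote character" rule described above
might look like the following.  It is not the helper added by the patch:
the name quote_by_doubling() and the fixed-buffer interface are assumptions
made for illustration, and real code would more likely append to a
StringInfo or PQExpBuffer.

```c
#include <stddef.h>

/*
 * Hypothetical helper: copy src into dst wrapped in quote character q,
 * doubling every embedded occurrence of q.  Returns the number of bytes
 * written (excluding the terminating NUL), or 0 if dst is too small.
 */
static size_t
quote_by_doubling(const char *src, char q, char *dst, size_t dstlen)
{
    size_t n = 0;

    if (dstlen < 3)
        return 0;               /* need room for two quotes and a NUL */

    dst[n++] = q;
    for (; *src; src++)
    {
        size_t need = (*src == q) ? 2 : 1;

        /* always keep room for the closing quote and the NUL */
        if (n + need + 2 > dstlen)
            return 0;
        if (*src == q)
            dst[n++] = q;
        dst[n++] = *src;
    }
    dst[n++] = q;
    dst[n] = '\0';
    return n;
}
```

Passing '"' as q covers identifiers such as slot names, and passing '\''
covers option values, with no dependence on SQL keyword lists or E'...'
syntax.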

Or at least, we could if repl_scanner.l handled doubled double quotes
in identifiers, but for some reason it doesn't!  So the first step in
this fix has to be to fix that.  (The fact that we'll later reject
slot names containing double quotes is very far short of justifying
this omission.)

Having done that, this patch runs around and applies correct
quoting in all places that generate replication commands containing
strings coming from outside the immediate context.  Probably some
of these places are safe because of restrictions elsewhere, but it
seems best to just quote all the time.

This was originally reported as a security bug, which it could be
if replication slot names or parameters were to originate from
untrustworthy sources.  But the security team concluded that that
was a very improbable situation, so we're just going to fix this
as a regular bug.

Reported-by: Team Dhiutsa
Author: Tom Lane <tgl@sss.pgh.pa.us>
Reviewed-by: Ayush Tiwari <ayushtiwari.slg01@gmail.com>
Discussion: https://postgr.es/m/1648659.1781287310@sss.pgh.pa.us
Backpatch-through: 14
2026-06-15 15:35:37 -04:00
Nathan Bossart
85e6624c06 doc: Fix "Prev" link.
Presently, the "Prev" link on the page for background workers sends
you to the middle of the previous chapter instead of the actual
previous page.  This appears to be caused by a libxml2 bug, but
regardless, a minimal fix is to change the link generation code to
use [position()=last()] instead of [last()] in the predicate on the
union of reverse axes.

Reviewed-by: Ayush Tiwari <ayushtiwari.slg01@gmail.com>
Discussion: https://postgr.es/m/aim4AZorFKaC7Wrf%40nathan
Backpatch-through: 14
2026-06-15 12:16:38 -05:00
Tom Lane
6603e81e69 Modernize pg_bsd_indent's error/warning reporting code.
Late-model clang complains that these functions should be labeled
with "format(printf, 2, 3)", and it's right.  But let's go a bit
further and also make use of varargs, to remove duplication and
allow these functions to be used with non-integer input values.
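
For readers unfamiliar with the attribute, here is a small sketch of a
varargs reporting function labeled so that the compiler can check format
strings against their arguments.  The function report() and the macro
PRINTF_LIKE are hypothetical and are not pg_bsd_indent's actual code; the
attribute syntax is the GCC/clang one.

```c
#include <stdarg.h>
#include <stdio.h>

#if defined(__GNUC__) || defined(__clang__)
#define PRINTF_LIKE(fmt_idx, arg_idx) \
    __attribute__((format(printf, fmt_idx, arg_idx)))
#else
#define PRINTF_LIKE(fmt_idx, arg_idx)
#endif

/* Argument 2 is the format string; the variadic arguments start at 3. */
static int report(int is_error, const char *fmt, ...) PRINTF_LIKE(2, 3);

/*
 * Print a prefixed message to stderr.  Returns the number of characters
 * produced by the formatted part, as reported by vfprintf().
 */
static int
report(int is_error, const char *fmt, ...)
{
    va_list ap;
    int     n;

    fputs(is_error ? "Error: " : "Warning: ", stderr);
    va_start(ap, fmt);
    n = vfprintf(stderr, fmt, ap);
    va_end(ap);
    fputc('\n', stderr);
    return n;
}
```

With the attribute in place, a call such as report(0, "%d items", some_size_t)
draws a format warning, which is why call sites passing size_t values need
%zu or an explicit cast.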

Since no good deed goes unpunished, I had to also adjust a couple
of call sites.  They weren't wrong as-is, since the size_t-sized
arguments were coerced to int on the way into diag3().  But
without that, we have to adjust the format strings.

The point of this is to suppress compiler warnings, so back-patch
into branches containing pg_bsd_indent, even though there's no
functional change.

Author: Tom Lane <tgl@sss.pgh.pa.us>
Reviewed-by: Ayush Tiwari <ayushtiwari.slg01@gmail.com>
Discussion: https://postgr.es/m/1645041.1781283554@sss.pgh.pa.us
Backpatch-through: 16
2026-06-15 12:22:55 -04:00
Heikki Linnakangas
b1ab4bc52a Fix PQdescribePrepared with more than 7498 params
If a query has more than 7498 params, the ParameterDescription message
exceeds the 30000 byte limit on messages that are not specifically
marked as possibly being longer than that (VALID_LONG_MESSAGE_TYPE).
To fix, add ParameterDescription to the list.
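
The 7498 figure can be checked with a little arithmetic.  The sketch below
assumes the wire layout as I read it in the protocol documentation: a
4-byte length word that counts itself, a 2-byte parameter count, and one
4-byte type OID per parameter.  The function names are hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>

/* Limit for messages not marked as possibly long, per the text above. */
#define SHORT_MESSAGE_LIMIT 30000

/* Assumed layout: length word (4) + parameter count (2) + 4 per OID. */
static size_t
parameter_description_length(size_t nparams)
{
    return 4 + 2 + 4 * nparams;
}

static bool
exceeds_short_message_limit(size_t nparams)
{
    return parameter_description_length(nparams) > SHORT_MESSAGE_LIMIT;
}
```

Under that layout, 7498 parameters give 29,998 bytes and 7499 give 30,002,
which matches the threshold named in the subject line.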

Author: Ning Sun <classicning@gmail.com>
Discussion: https://www.postgresql.org/message-id/dbfb4b65-0aa8-470a-8b87-b6496160b28a@gmail.com
Backpatch-through: 14
2026-06-15 11:35:20 +03:00
Michael Paquier
e592535d22 Trim regression test expected output for xml
This commit reduces the number of expected output files for the "xml"
test from three to two (well, mostly one, see below for details).

xml_2.out existed to handle some differences in output due to libxml2
2.9.3, due to some error context missing (085423e3e3).  This file is
removed, by tweaking the XML inputs to trigger the same error patterns
for the problematic 2.9.3 and other libxml2 versions.  This part is
authored by Tom Lane.

xml_1.out (no libxml2 support) is reduced in size by adding an \if query
that exits the test early.  This still checks NO_XML_SUPPORT() through
xmlin().  The rest of the test is skipped if XML input cannot be
handled by the backend.  This part has been written by me.

Author: Tom Lane <tgl@sss.pgh.pa.us>
Author: Michael Paquier <michael@paquier.xyz>
Reviewed-by: Daniel Gustafsson <daniel@yesql.se>
Discussion: https://postgr.es/m/aiu6CXO67q-s70n5@paquier.xyz
Backpatch-through: 14
2026-06-15 11:37:55 +09:00
Tom Lane
b4db796b19 Doc: remove stale entry for removed aclitem[] ~ aclitem operator.
Commit 2f70fdb06 removed the deprecated containment operator
~(aclitem[],aclitem) from the catalogs, but missed removing its entry
from the documentation.  (Arguably the blame should fall on c62dd80cd,
which added this entry in contravention of the longstanding policy
that we don't document deprecated aliases in the first place.)

Author: Shinya Kato <shinya11.kato@gmail.com>
Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/CAOzEurQSyR5psWukyhUz1LtxyO55C2Vfp0Fmt8w2jGKxhszQmQ@mail.gmail.com
Backpatch-through: 14
2026-06-14 11:01:48 -04:00
Alexander Korotkov
897e794862 amcheck: Use correct varlena size accessor in bt_normalize_tuple()
bt_normalize_tuple() uses VARSIZE() to get the size of a varlena, even though
it is not yet known whether it has a 4-byte header.  Fix this by replacing the
accessor with the universal VARSIZE_ANY().

Backpatch to all supported versions.

Reported-by: Andres Freund <andres@anarazel.de>
Discussion: https://postgr.es/m/7ckc7oka4bvafkf5bwlqs6ygrhlsbhz25ppozfch7zbuxcx3rf%40e4pr4oqenalc
Author: Andrey Borodin <x4mmm@yandex-team.ru>
Reviewed-by: Alexander Korotkov <aekorotkov@gmail.com>
Backpatch-through: 14
2026-06-14 04:05:50 +03:00
Andrew Dunstan
10e510423d Adjust cross-version upgrade tests for seg_out() fix
Commit 0e1f1ed157 taught seg_out() to print the certainty indicator
on an interval's upper boundary, but it was back-patched only as far
as v14.  When upgrading from an older release, the old server prints
the one test_seg row exercising that case ('4.6 .. ~7.0') without the
indicator, so the pre- and post-upgrade dumps do not match.  Make
AdjustUpgrade.pm delete just that row; seg's comparison function does
distinguish the certainty indicators, so the otherwise identical row
'4.6 .. 7.0' is unaffected.

Back-patch to all supported branches.

Per buildfarm members crake and fairywren.

Discussion: https://postgr.es/m/5ccbdbde-6467-4a10-bf4d-0be73a05ce8d@dunslane.net
2026-06-12 18:06:38 -04:00
Daniel Gustafsson
27cf3b5aff Fix compilation with OpenSSL 4
OpenSSL 4.0.0 changed some parameters and return values to const, so
we need to update our declarations and subsequently cast away constness
from a few call sites to make libpq build without warnings.  This
is tested with OpenSSL 1.1.1 through 4.0.0 as well as with LibreSSL.
No functional change is introduced; this commit only allows postgres
to be compiled against OpenSSL 4.0.0 without warnings.

There is also an error message change in OpenSSL 4.0.0 which needed
to be covered by our test harness.

This will be backpatched to all supported branches since they are
all equally likely to be built against OpenSSL 4.0.0 as it becomes
available in distributions.  Backpatching will be done once it has
been in master for a few days without issues.

Author: Daniel Gustafsson <daniel@yesql.se>
Reviewed-by: Michael Paquier <michael@paquier.xyz>
Discussion: https://postgr.es/m/066B07BB-85FA-487C-BE8C-40F791CFC3C4@yesql.se
Backpatch-through: 14
2026-06-12 13:57:22 +02:00
Michael Paquier
0d145be2c3 Update expected regression test output for xml_2.out
This one has been forgotten in 8bf257aeba.  Per report from buildfarm
member massasauga.

Backpatch-through: 14
2026-06-12 12:39:11 +09:00
Michael Paquier
4bff3aa51c Fix second race with timeline selection during promotion
read_local_xlog_page_guts has the same race as logical_read_xlog_page:
RecoveryInProgress() can return true during promotion, impacting the
availability of the operations doing WAL page reads with this callback.

This problem is similar to eb4e7224a1 that has addressed the issue for
logical replication, impacting more areas of the code where this WAL
page callback can be used (same narrow window during promotion, same
availability issue):
- pg_walinspect.
- Slot advance (SQL function).
- Slot creation.

Repack workers (v19~) and 2PC files (since forever) can also use this
callback, but they are irrelevant as far as I know.  A test is added
with the SQL lookup functions.  This part relies on injection points,
and is backpatched down to v18, like the test added for eb4e7224a1.

This issue could probably be fixed as well in v14 and v15 for
pg_walinspect.  However, I also feel that there is a conservative
argument about consistency here due to the support of logical decoding
on standbys, so let's limit ourselves to v16 for now.  pg_walinspect is
used less in the field compared to the two other operations, making
addressing this problem less attractive in these two older branches.

Reported-by: Xuneng Zhou <xunengzhou@gmail.com>
Author: Bertrand Drouvot <bertranddrouvot.pg@gmail.com>
Reviewed-by: Xuneng Zhou <xunengzhou@gmail.com>
Reviewed-by: Hayato Kuroda <kuroda.hayato@fujitsu.com>
Discussion: https://postgr.es/m/7daef094-abf3-4672-bc23-3df4763b16a3%40gmail.com
Backpatch-through: 16
2026-06-12 11:44:14 +09:00
Fujii Masao
556324c386 doc: fix reference for finding replication slots to drop
Commit a70bce43fb added instructions on how to recover if PostgreSQL
refuses to issue new transaction IDs because of imminent wraparound,
but when describing how to find replication slots that should be dropped,
it referred to pg_stat_replication where it should have referenced
pg_replication_slots.

In passing, decorate references to views with <structname> tags.

Backpatch to all supported versions.

Reported-By: Sanjaya Waruna <sanjaya.waruna@gmail.com>
Author: Laurenz Albe <laurenz.albe@cybertec.at>
Reviewed-by: Robert Treat <rob@xzilla.net>
Reviewed-by: Fujii Masao <masao.fujii@gmail.com>
Discussion: https://postgr.es/m/176767268098.1084085.10345048667224193115@wrigleys.postgresql.org
Backpatch-through: 14
2026-06-12 11:10:07 +09:00
Michael Paquier
4c777d6dd9 Fix handling of namespace nodes in xpath() (xml)
xpath() attempted to call xmlCopyNode() and xmlNodeDump() on an
XML_NAMESPACE_DECL, finishing with a confusing error:
=# SELECT xpath('//namespace::foo', '<root xmlns:foo="http://127.0.0.1"/>');
ERROR:  53200: could not copy node
CONTEXT:  SQL function "xpath" statement 1

xpath() is changed so that it goes through xmlXPathCastNodeToString()
instead, which is able to handle namespace nodes.  xml2 uses the same
solution.  This issue has been discovered while digging into
9d33a5a804.

Author: Michael Paquier <michael@paquier.xyz>
Discussion: https://postgr.es/m/aioT7ui_ZJ9RMlfM@paquier.xyz
Backpatch-through: 14
2026-06-12 10:25:49 +09:00
Fujii Masao
12c32bbc85 amcheck: Fix missing allequalimage corruption report
When amcheck validates that a B-Tree metapage's allequalimage flag
matches _bt_allequalimage(), it could fail to report corruption
unless one of the index key columns used interval_ops. As a result,
pg_amcheck could silently miss this corruption on other opclasses,
incorrectly reporting the index as valid.

The mistake was that bt_index_check_callback() kept ereport(ERROR)
inside the loop that scans key attributes for INTERVAL_BTREE_FAM_OID,
even though that loop is only needed to decide whether to add
the interval-specific hint. This commit moves ereport() out of the loop
so allequalimage mismatches are always reported, while still emitting
the hint for affected interval indexes.

Back-patch to v18, where d70b17636d introduced this regression
while moving the check into bt_index_check_callback().

Author: Chao Li <lic@highgo.com>
Reviewed-by: Kirill Reshke <reshkekirill@gmail.com>
Reviewed-by: Xuneng Zhou <xunengzhou@gmail.com>
Reviewed-by: Fujii Masao <masao.fujii@gmail.com>
Discussion: https://postgr.es/m/011ACC9C-CB87-4160-ACE7-4ED57AB86E15@gmail.com
Backpatch-through: 18
2026-06-12 09:39:19 +09:00
Álvaro Herrera
35d9a62634 IS JSON/JSON(): Protect against expressions uncoercible to text
transformJsonParseArg() was not careful enough on generation of
transformed expressions when starting from expressions that are not
coercible to text but are in the string type category: it failed to
verify that coerce_to_target_type() succeeds, and returned a NULL
pointer.  This leads to a later NULL dereference and crash at executor
time.

This escaped notice because it cannot happen for built-in types, all of
which have casts to text.  Only user-created types are potentially
problematic.

Fix by raising an error when a cast to text doesn't exist.

This mistake came in with commit 6ee30209a6.

Author: Ayush Tiwari <ayushtiwari.slg01@gmail.com>
Reported-by: Chi Zhang <798604270@qq.com>
Reviewed-by: Srinath Reddy Sadipiralla <srinath2133@gmail.com>
Backpatch-through: 16
Discussion: https://postgr.es/m/19491-7aafc221ec63f288@postgresql.org
2026-06-11 16:17:58 +02:00
Dean Rasheed
9108fed3ed Fix parsing of parenthesised OLD/NEW in RETURNING list.
When parsing expressions like (old).colname and (old).* in a RETURNING
list, the parser would lose track of the intended varreturningtype,
and therefore return incorrect results.

The root cause was code using GetNSItemByRangeTablePosn() to find a
namespace item from its rtindex and levelsup, without taking into
account returningtype, which would return the wrong namespace item.
Fix by adding a new function GetNSItemByVar() that does take
returningtype into account.

Backpatch to v18, where support for RETURNING OLD/NEW was added.

Bug: #19516
Reported-by: Marko Grujic <markoog@gmail.com>
Author: Marko Grujic <markoog@gmail.com>
Suggested-by: Dean Rasheed <dean.a.rasheed@gmail.com>
Reviewed-by: Dean Rasheed <dean.a.rasheed@gmail.com>
Discussion: https://postgr.es/m/CAOvwyF2cO_5mAt=w=y-dFnaG5UkZ+3H8nSDoKF_iuWZHsU2ARg@mail.gmail.com
Backpatch-through: 18
2026-06-11 12:08:48 +01:00
Heikki Linnakangas
0004cab4dc seg: Fix seg_out() to preserve the upper boundary's certainty indicator
When printing the upper boundary of a seg interval, seg_out() decided
whether to emit the certainty indicator ('<', '>' or '~') by testing the
upper indicator (u_ext) for '<' and '>', but mistakenly tested the lower
indicator (l_ext) for '~'.  This is a copy-and-paste slip from the
symmetric code that prints the lower boundary a few lines above.

The consequences for valid input were:

  * A '~' on the upper boundary was dropped on output, e.g.
    '1.5 .. ~2.5'::seg printed as '1.5 .. 2.5'.

  * When the lower boundary carried '~' but the upper boundary had no
    indicator, the wrong test matched and sprintf(p, "%c", seg->u_ext)
    wrote a NUL byte (u_ext == '\0'), which truncated the result string
    and silently lost the entire upper boundary, e.g.
    '~6.5 .. 8.5'::seg printed as '~6.5 .. '.

Certainty indicators are documented to be preserved on output (they are
ignored by the operators, but kept as comments), so this broke the
input/output round-trip for the affected values.
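
The shape of the fix can be sketched with a toy version of the output
routine.  ToySeg and toy_seg_out() are hypothetical and much simpler than
the real seg code (no significant-digit handling, and the indicator is
stored in a small string rather than printed with "%c"); the point is only
that each boundary consults its own indicator field.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Toy stand-in for a seg value: two bounds, each with an optional indicator. */
typedef struct ToySeg
{
    double lower;
    double upper;
    char   l_ext;               /* '<', '>', '~' or '\0' for the lower bound */
    char   u_ext;               /* '<', '>', '~' or '\0' for the upper bound */
} ToySeg;

static bool
is_indicator(char c)
{
    return c == '<' || c == '>' || c == '~';
}

/* Format "lower .. upper", keeping whichever indicators are present. */
static int
toy_seg_out(const ToySeg *s, char *out, size_t outlen)
{
    char lo[2] = {0};
    char hi[2] = {0};

    if (is_indicator(s->l_ext))
        lo[0] = s->l_ext;
    if (is_indicator(s->u_ext)) /* test u_ext here, not l_ext */
        hi[0] = s->u_ext;

    return snprintf(out, outlen, "%s%.1f .. %s%.1f",
                    lo, s->lower, hi, s->upper);
}
```

Because an absent indicator becomes an empty string instead of a NUL
character written into the middle of the buffer, the upper bound cannot be
cut off in this version.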

The bug has existed since seg was added.  It went unnoticed because the
existing regression tests only exercised certainty indicators on
single-point segs, which are printed by a different branch of seg_out().
Add tests that place indicators on both boundaries of an interval.

Author: Ewan Young <kdbase.hack@gmail.com>
Discussion: https://www.postgresql.org/message-id/CAON2xHPYeRRCEVAv8XfE18KsEsEHCiYcJ5fOsoxFuMEfpxF1=g@mail.gmail.com
Backpatch-through: 14
2026-06-11 12:34:26 +03:00
Michael Paquier
b4bd138504 Fix race with timeline selection in logical decoding during promotion
During promotion, there is a window where RecoveryInProgress() returns
true but the WAL segments of the old timeline have already been removed.
A logical decoding could pick up the old timeline in this window when
reading a page, failing with the following error:
ERROR: requested WAL segment ... has already been removed

This issue does not lead to any data correctness issue, as retrying to
decode the data works in follow-up decoding attempts.  It impacts
availability, though.  Other WAL page read callbacks have a similar
issue, this commit takes care of what should be the noisiest code path:
logical decoding with START_REPLICATION in a WAL sender.

A TAP test, based on an injection point waiting in the startup process
after the segments have been removed/recycled, is added.  This part is
backpatched down to v17.

This issue has been causing sporadic failures in the buildfarm, and
was reproducible manually.  This issue happens since logical decoding on
standbys exists, down to v16.

Reported-by: Alexander Lakhin <exclusion@gmail.com>
Author: Bertrand Drouvot <bertranddrouvot.pg@gmail.com>
Reviewed-by: Hayato Kuroda <kuroda.hayato@fujitsu.com>
Reviewed-by: Xuneng Zhou <xunengzhou@gmail.com>
Discussion: https://postgr.es/m/7daef094-abf3-4672-bc23-3df4763b16a3@gmail.com
Backpatch-through: 16
2026-06-11 17:29:38 +09:00
Michael Paquier
91b57eadeb xml2: Fix crash with namespace nodes in xpath_nodeset()
pgxmlNodeSetToText() passed nodeTab[i]->doc to xmlNodeDump() without
checking the node type, which could cause a crash as an
XML_NAMESPACE_DECL maps to an xmlNs struct.  The passed-in code would
then be dereferenced in xmlNodeDump().

This commit switches the code to render XML_NAMESPACE_DECL nodes with
xmlXPathCastNodeToString(), like xpath_table().  Some tests are added,
written by me.

Author: Andrey Chernyy <andrey.cherny@tantorlabs.com>
Co-authored-by: Michael Paquier <michael@paquier.xyz>
Discussion: https://postgr.es/m/20260611031436.5afde3cb@andrnote
Backpatch-through: 14
2026-06-11 14:29:22 +09:00
Fujii Masao
beb09e9117 Use correct type for catalog_xmin
Commit 85c17f6 mistakenly declared a variable storing catalog_xmin as
XLogRecPtr, even though catalog_xmin is a TransactionId.

This caused no functional issue, but the type was clearly incorrect.
Therefore, this commit changes it to use the correct type, TransactionId,
and is backpatched to v17, where the issue was introduced.

Author: Imran Zaheer <imran.zhir@gmail.com>
Reviewed-by: Ashutosh Bapat <ashutosh.bapat.oss@gmail.com>
Reviewed-by: Fujii Masao <masao.fujii@gmail.com>
Discussion: https://postgr.es/m/CA+UBfa=mNeLt-4BFjEP4tqdDsnq+oMqqPr7fd9Wji2_9YXmQdA@mail.gmail.com
2026-06-09 08:19:10 +09:00
Jeff Davis
26bd362655 Guard against uninitialized default locale.
No known problem today, but defend against issues like dbf217c1c7 in
the future.

Discussion: https://postgr.es/m/d080287d8d2d14c246c86be2e9eb611fb6b27b11.camel@j-davis.com
Reviewed-by: Ayush Tiwari <ayushtiwari.slg01@gmail.com>
Backpatch-through: 17
2026-06-08 13:11:05 -07:00
Tom Lane
c090bef07d Remove inappropriate translation marker in getObjectIdentityParts().
Strings built by this function are not supposed to be subject to
NLS translation, but commit 6566133c5 missed that memo, so that
object identities like "membership of role %s in role %s" were
translated.
2026-06-08 15:24:02 -04:00
Jeff Davis
89e6484985 dict_synonym.c: remove incorrect outlen.
Previously, outlen was miscalculated if case_sensitive was false and
str_tolower() changed the byte length of the string. If outlen was too
large, pnstrdup() would stop at the NUL terminator, preventing
overrun. But if outlen was too small, it would cause truncation.

Fix by just removing outlen. It was only used in a single site, which
could just as well use pstrdup().

Discussion: https://postgre.es/m/1101e1a3afbbabb503317069c40374b82e6f4cac.camel@j-davis.com
Reviewed-by: Tristan Partin <tristan@partin.io>
Backpatch-through: 14
2026-06-08 11:47:40 -07:00
Tom Lane
11aed8d19c Fix missed checks for hashability of container-type equality.
The operators for array_eq, record_eq, range_eq, and multirange_eq
are all marked oprcanhash, but there's a pitfall: their hash functions
can fail at runtime if the contained type(s) are not hashable.
Therefore, the planner has to check hashability of the contained types
before deciding it can use hashing in these cases.  Not every place
had gotten this memo, and noplace at all had considered the issue
for ranges or multiranges.  In particular we could attempt to use
hashing for a ScalarArrayOpExpr on a container type when it won't
actually work, leading to "could not identify a hash function ..."
runtime failures.

For the most part we should fix this in the lookup functions provided
by lsyscache.c, to wit get_op_hash_functions and op_hashjoinable.
But there's a problem: get_op_hash_functions is not passed the input
data type it would need to check.  We mustn't change the API of that
exported function in a back-patched fix, and even if we wanted to,
its call sites in the executor mostly don't have easy access to the
required data type OID.  Fortunately, the executor call sites don't
actually need fixing, because it's expected that the planner verified
hashability before building a plan that requires it.  Therefore,
leave get_op_hash_functions as-is and invent a wrapper function
get_op_hash_functions_ext that does the additional checking needed
in the planner's uses.

We also need to fix hash_ok_operator (extending the fix in 647889667).

While at it, neaten up a couple of places in lookup_type_cache where
relevant code for multirange cases was written differently from the
code for other container types.

Note: while this touches pg_operator.dat, it's only to add oid_symbol
macros.  So there's no on-disk data change and no need for a
catversion bump.

Reported-by: Andrei Lepikhov <lepihov@gmail.com>
Author: Andrei Lepikhov <lepihov@gmail.com>
Co-authored-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/ed221f95-f09b-4a9c-b05b-e1fed621ec87@gmail.com
Backpatch-through: 14
2026-06-08 11:48:17 -04:00
Nathan Bossart
be176e0a6d doc: Expand on proper use of refint.
The security team has received a couple of reports about potential
SQL injection via refint's trigger arguments.  We discussed this
while preparing CVE-2026-6637 and concluded that forcibly quoting
these arguments is more likely to break working code than to
prevent exploits.  Unlike data values, the table/column names come
from trigger arguments, and there is little reason for a trigger
author to put hostile inputs into those arguments.  So, let's
document it accordingly.

Reported-by: Nikolay Samokhvalov <nik@postgres.ai>
Reported-by: Alex Young <alex000young@gmail.com>
Reported-by: Satyanarayana Narlapuram <satyanarlapuram@gmail.com>
Suggested-by: Noah Misch <noah@leadboat.com>
Reviewed-by: Noah Misch <noah@leadboat.com>
Reviewed-by: Fujii Masao <masao.fujii@oss.nttdata.com>
Reviewed-by: Christoph Berg <myon@debian.org>
Reviewed-by: Satyanarayana Narlapuram <satyanarlapuram@gmail.com>
Discussion: https://postgr.es/m/ahXP7z7nsfGPOZ3T%40nathan
Backpatch-through: 14
2026-06-08 10:33:52 -05:00
Fujii Masao
081434b0f5 ecpg: Reject multiple header items in GET/SET DESCRIPTOR
Previously, ecpg accepted multiple descriptor header items in GET DESCRIPTOR
and SET DESCRIPTOR, but generated broken C code when they were used.
Although the grammar allowed this syntax, the implementation did not actually
support it.

This commit tightens the ecpg grammar so the header form of GET/SET DESCRIPTOR
accepts only a single header item, matching the implementation and preventing
generation of broken C code.

Also update the documentation synopsis accordingly.

Backpatch to all supported versions.

Author: Masashi Kamura <kamura.masashi@fujitsu.com>
Reviewed-by: Hayato Kuroda <kuroda.hayato@fujitsu.com>
Reviewed-by: Lakshmi G <lakshmigcdac@gmail.com>
Reviewed-by: Fujii Masao <masao.fujii@gmail.com>
Discussion: https://postgr.es/m/OS9PR01MB13174AD7D1829D0644B6BB90E9447A@OS9PR01MB13174.jpnprd01.prod.outlook.com
Backpatch-through: 14
2026-06-08 17:10:25 +09:00
Michael Paquier
4154a14820 Fix memory leak in pgstat_progress_parallel_incr_param()
When called from a parallel worker, this function calls initStringInfo()
and pq_beginmessage(), causing a StringInfo allocation to happen twice.
pq_endmessage() frees only the second allocation, with each call leaking
~1 kB into the per-worker memory context.  This could cause a few
hundred megabytes worth of memory to pile up until the worker exits (the
message allocations happen in the parallel worker context), with the
situation being worse the longer a parallel worker runs.

Oversight in f1889729dd.

Author: Baji Shaik <baji.pgdev@gmail.com>
Reviewed-by: Sami Imseih <samimseih@gmail.com>
Reviewed-by: Tristan Partin <tristan@partin.io>
Discussion: https://postgr.es/m/CA+fm-RMopta1Dmq8udiU5sp+zwTvhUf4+xfbr3rZDfczH+p-xw@mail.gmail.com
Backpatch-through: 17
2026-06-08 15:29:19 +09:00
Michael Paquier
07a6c262be psql: Fix expanded aligned output
When a table's columns are narrower than the record header line, the
expanded aligned format produced misaligned output because the data
column width was not adjusted to match the record header width, leading
to output like:
+-[ RECORD 1 ]-+
| a | 10 |
| b | 20 |
+---+----+

This commit adjusts the output so that the column width matches the
header line, giving:
+-[ RECORD 1 ]-+
| a | 10       |
| b | 20       |
+---+----------+

Author: Pavel Stehule <pavel.stehule@gmail.com>
Reviewed-by: Chao Li <li.evan.chao@gmail.com>
Discussion: https://postgr.es/m/CAFj8pRCzGpsr9zTHbtTd4mGh2YPJqOEgLgt8JLiopuYA9_1xGw@mail.gmail.com
Backpatch-through: 14
2026-06-08 14:37:56 +09:00
Michael Paquier
2b09f8a911 pg_surgery: Fix off-by-one bug with heap offset
heap_force_common() declared a boolean array indexed with an
OffsetNumber for a size of MaxHeapTuplesPerPage.  OffsetNumbers are
1-based, so an input TID whose offset number equals MaxHeapTuplesPerPage
wrote one byte past the end of the stack array, crashing the server.

Like heapam_handler.c, this commit changes the array so that it uses a
0-based index, subtracting one from the OffsetNumbers.
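
The indexing pattern can be sketched in isolation as follows.  The names
and the page capacity below are hypothetical toy values, not the
definitions used by pg_surgery; the sketch only shows a 1-based offset
being range-checked and mapped onto a 0-based array.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy capacity for illustration; not the real MaxHeapTuplesPerPage. */
#define TOY_MAX_TUPLES_PER_PAGE 8

typedef unsigned short ToyOffsetNumber; /* 1-based, like OffsetNumber */

/*
 * Mark offset "off" as seen.  Valid offsets are 1..nseen and are stored at
 * index off - 1, so the largest valid offset lands on the last element
 * rather than one past the end.  Returns false for out-of-range input.
 */
static bool
mark_offset(bool *seen, size_t nseen, ToyOffsetNumber off)
{
    if (off < 1 || off > nseen)
        return false;
    seen[off - 1] = true;
    return true;
}
```

An array declared with TOY_MAX_TUPLES_PER_PAGE elements then accepts every
offset from 1 up to and including TOY_MAX_TUPLES_PER_PAGE.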

Reported-by: Wang Yuelin <violin0613@tju.edu.cn>
Reviewed-by: Ashutosh Sharma <ashu.coek88@gmail.com>
Discussion: https://postgr.es/m/20260604002256.40f1fd544@smtp.qiye.163.com
Backpatch-through: 14
2026-06-06 08:16:40 +09:00
Daniel Gustafsson
a5112c9b62 doc: Use groups instead of curves in TLS documentation
With TLS 1.3 the concept of curves was renamed to groups.  Update
our wording to use groups instead of curves to make it clear what
the underlying GUC can support.

This was extracted from a slightly larger patch which also renamed
variables to match the new terminology.  Given that we are in beta
this portion was however left as a future exercise.

Author: Evan Si <evsi@amazon.com>
Reviewed-by: Ewan Young <kdbase.hack@gmail.com>
Discussion: https://postgr.es/m/23C40DD6-1C47-46FC-A746-8A1D8530AD3E@amazon.com
Backpatch-through: 18
2026-06-05 22:16:42 +02:00
Nathan Bossart
79a506228b refint: Remove plan cache.
Presently, refint stores plans in a per-backend cache to avoid
re-preparing in each call.  This has a few problems.  For one,
check_foreign_key() embeds the new key values in its cascade-UPDATE
queries, so a cached plan reuses the values from preparation.
Also, the cache is never invalidated, so it can return stale
entries that cause other problems.  There may very well be more
bugs lurking.

We could spend a lot of time trying to address all these problems,
but this module is primarily intended as sample code, and by all
indications, it sees minimal use.  Furthermore, there is a growing
consensus for removing refint in v20.  However, since we'll need to
support it on the back-branches for a while longer, it probably
still makes sense to fix some of the more egregious bugs.

Therefore, let's just remove refint's plan cache entirely.  That
means we'll re-prepare on every call, but that seems quite unlikely
to bother anyone.  On v17 and older versions, the regression test
for triggers fails after this change, so I've borrowed pieces of
commit 8cfbdf8f4d to fix it.

Author: Ayush Tiwari <ayushtiwari.slg01@gmail.com>
Discussion: https://postgr.es/m/CAJTYsWXU%2BfhuzrEd_bnrxyGH3%2Bny8QRQC2QHf3ws6s9iki3c2Q%40mail.gmail.com
Backpatch-through: 14
2026-06-05 12:08:05 -05:00
Michael Paquier
273fe94852 Fix off-by-one with NFC recomposition for Hangul U+11A7 (TBASE)
The NFC recomposition incorrectly included TBASE as a valid T syllable,
which is incorrect based on the Unicode specification (TBASE is one
below the start of the range, range beginning at U+11A8).

This would cause the TBASE to be silently swallowed in the
normalization, leading to an incorrect result.
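
As an illustration of the boundary, here is a minimal sketch of the
LV + T step of Hangul composition.  The constants are those of the
Unicode Hangul syllable algorithm; compose_lv_t() is a hypothetical
helper and not the function used in the PostgreSQL sources.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Constants from the Unicode Hangul syllable composition algorithm. */
#define SBASE   0xAC00
#define TBASE   0x11A7
#define TCOUNT  28
#define SCOUNT  11172

/*
 * Try to compose an LV syllable with a trailing consonant.  On success,
 * store the LVT syllable in *result and return true.
 */
static bool
compose_lv_t(uint32_t lv, uint32_t t, uint32_t *result)
{
    assert(result != NULL);

    /* The first code point must be a precomposed Hangul syllable ... */
    if (lv < SBASE || lv >= SBASE + SCOUNT)
        return false;
    /* ... and it must be an LV syllable, with no trailing consonant yet. */
    if ((lv - SBASE) % TCOUNT != 0)
        return false;

    /*
     * Valid trailing consonants run from TBASE + 1 (U+11A8) up to
     * TBASE + TCOUNT - 1 (U+11C2).  TBASE itself is excluded.
     */
    if (t <= TBASE || t >= TBASE + TCOUNT)
        return false;

    *result = lv + (t - TBASE);
    return true;
}
```

With a check that accepts t == TBASE, the result would be lv + 0, so
the U+11A7 code point would vanish from the output, matching the
"silently swallowed" behavior described above.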

A couple of regression tests are added to check more patterns with
Hangul recomposition and decomposition, on top of a test to check the
problem with TBASE.  Diego has submitted the code fix, and I have
written the tests.

Author: Diego Frias <mail@dzfrias.dev>
Co-authored-by: Michael Paquier <michael@paquier.xyz>
Discussion: https://postgr.es/m/B92ED640-7D4A-4505-B09F-3548F58CBB16@dzfrias.dev
Backpatch-through: 14
2026-06-05 07:50:12 +09:00
Tom Lane
c5194139cb Improve reporting of invalid weight symbols in setweight() et al.
This commit addresses two related issues:

tsvector_filter() assumed it could print an incorrect weight value
with %c.  This could result in an invalidly-encoded error message
if the database encoding is multibyte and the char value has its
high bit set.  Weight values that are ASCII control characters
could render illegibly too.  Fix by printing such values in octal
(\ooo), similarly to how charout() would render them.
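
A minimal sketch of that rendering rule, assuming that only printable
ASCII is emitted as-is.  format_weight_char() is a hypothetical helper
written for illustration, not the code added by this commit.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Render a weight character for use in an error message.  Printable ASCII
 * is copied as-is; control characters and bytes with the high bit set are
 * written as a backslash followed by three octal digits.
 */
static void
format_weight_char(char c, char *buf, size_t buflen)
{
    unsigned char uc = (unsigned char) c;

    assert(buflen >= 5);        /* room for "\ooo" plus the terminator */

    if (uc >= 0x20 && uc < 0x7F)
        snprintf(buf, buflen, "%c", uc);
    else
        snprintf(buf, buflen, "\\%03o", (unsigned int) uc);
}
```

Because the output is always plain ASCII, the message stays validly
encoded whatever the database encoding is.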

tsvector_setweight() and tsvector_setweight_by_filter() reported
the same unrecognized-weight error condition with elog(), as though
it were an internal error.  That'd not translate, would produce an
unwanted XX000 SQLSTATE code, and also reported the bad value as a
decimal integer which seems unhelpful.  Fix by refactoring so that
all three functions share one copy of the code that interprets a
weight argument.

The invalid-encoding aspect seems to me (tgl) to justify
back-patching.

Author: Ewan Young <kdbase.hack@gmail.com>
Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/CAON2xHNaeLAUzRCXL5AmXLcXaSE_gWAVjWQRmLzc_oZ=1_Vf4Q@mail.gmail.com
Backpatch-through: 14
2026-06-04 12:24:51 -04:00
Tom Lane
0228d098ac Fix another case of indirectly casting away const.
Like 8f1791c61, this fixes a case of implicitly casting away
const by not treating the result of strrchr() on a const pointer
as const.  This was missed at the time because the machines
reporting those warnings weren't building with --with-llvm.

While here, clean up another infelicity: in the probably-
impossible case that the input string contains only one dot,
this function would call pnstrdup() with a length of -1
and thereby emit a module name equal to the function name.
It seems to me we should emit modname = NULL instead.

Also remove a useless Assert and two redundant assignments.

Back-patch, as 8f1791c61 was, so that users of back branches
don't see this warning when building with late-model gcc.

Reported-by: hubert depesz lubaczewski <depesz@depesz.com>
Author: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/aiGNJ89PBqvq2Yyz@depesz.com
Backpatch-through: 14
2026-06-04 11:37:43 -04:00
Alexander Korotkov
94c02de89c pg_dump: scope indAttNames per index in getIndexes()
getIndexes() declared indAttNames and nindAttNames in the outer
per-table loop, so the names collected for an index on expressions
were carried over to the next plain index in the same table.

This is an internal inconsistency rather than a user-facing bug.
dumpRelationStats_dumper() only walks indexes that have pg_statistic
rows, and ANALYZE only creates those for indexes with expressions,
so the second index in the affected pair is not visited and the stale
array is never consulted.

Fix by moving the two variables into the inner per-index loop so each
iteration starts with a clean slate.

Author: Maksim Melnikov <m.melnikov@postgrespro.ru>
Reviewed-by: Alexander Korotkov <aekorotkov@gmail.com>
Discussion: https://postgr.es/m/be5fc489-587e-421f-bbb8-adb43cfd50f4@postgrespro.ru
Backpatch-through: 17
2026-06-03 13:01:21 +03:00
Fujii Masao
f833c92077 Fix race in ReplicationSlotRelease() for ephemeral slots
When releasing an ephemeral replication slot, ReplicationSlotRelease()
drops the slot via ReplicationSlotDropAcquired().

However, after dropping the slot, ReplicationSlotRelease() continued
to use its local "slot" pointer, which still referenced the dropped
slot's former shared-memory entry. It could then update fields such as
effective_xmin in that entry.

Once an ephemeral slot has been dropped (via ReplicationSlotDropAcquired()),
its slot array entry can be reused immediately by another backend
creating a new slot. As a result, those updates could corrupt
the state of an unrelated replication slot.

Fix by skipping those shared-memory updates for ephemeral slots and
performing them only for non-ephemeral slots, whose shared-memory
entries remain valid after release.

Backpatch to all supported versions.

Author: Zhijie Hou <houzj.fnst@fujitsu.com>
Reviewed-by: Masao Fujii <masao.fujii@gmail.com>
Reviewed-by: Srinath Reddy Sadipiralla <srinath2133@gmail.com>
Reviewed-by: Xuneng Zhou <xunengzhou@gmail.com>
Discussion: https://postgr.es/m/TY4PR01MB177184FF9EE916F577E1F554194082@TY4PR01MB17718.jpnprd01.prod.outlook.com
Backpatch-through: 14
2026-06-03 18:46:49 +09:00
Michael Paquier
b3f13c0324 Fix copy-paste error in hash_record_extended()
The code failed to initialize the second isnull argument passed to
FunctionCallInvoke().  This is harmless for existing in-core extended
hash support functions, since FunctionCallInvoke() does not use the
value (note that all the in-core extended hash functions are strict),
examining only the argument values.  However, extension-provided
extended hash functions could be affected if they inspect
PG_ARGISNULL(1).

Oversight in 01e658fa74.

Author: Man Zeng <zengman@halodbtech.com>
Discussion: https://postgr.es/m/tencent_7818173C01E01836109848C3@qq.com
Backpatch-through: 14
2026-06-03 12:47:26 +09:00
Richard Guo
cc0819e78a Fix wrong unsafe-flag test in check_output_expressions()
The check for window functions (point 4) guarded on the wrong bit: it
tested UNSAFE_NOTIN_DISTINCTON_CLAUSE while setting
UNSAFE_NOTIN_PARTITIONBY_CLAUSE.  Each check in this loop guards on
the same bit it is about to set, as an idempotency optimization, since
unsafeFlags[] is accumulated across the arms of a set operation and
there is no point recomputing a column's status once its bit is
present.
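
The guard pattern can be sketched like this.  All names and bit values
below are hypothetical and illustrative; the real flags and checks live
in the planner sources.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Illustrative bit values only. */
#define DEMO_UNSAFE_NOTIN_DISTINCTON    (1 << 2)
#define DEMO_UNSAFE_NOTIN_PARTITIONBY   (1 << 3)

typedef bool (*demo_unsafe_check) (int colno);

/* Counts how many times the (possibly expensive) check actually ran. */
static int  demo_checks_run = 0;

/* Stand-in for a check that finds the column unsafe. */
static bool
demo_always_unsafe(int colno)
{
    (void) colno;
    demo_checks_run++;
    return true;
}

/*
 * Guard on the same bit that is about to be set: once the bit is present
 * for a column, the check is skipped on later passes.
 */
static void
flag_if_unsafe(unsigned char *flags, int colno, unsigned char bit,
               demo_unsafe_check check)
{
    assert(check != NULL);

    if ((flags[colno] & bit) == 0 && check(colno))
        flags[colno] |= bit;
}
```

Calling flag_if_unsafe() twice for the same column and bit runs the
check once.  Guarding on a different bit than the one being set loses
that property, which is the mismatch described above.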

This is not a live bug.  When UNSAFE_NOTIN_PARTITIONBY_CLAUSE is
already set but UNSAFE_NOTIN_DISTINCTON_CLAUSE is not, the guard fails
to skip targetIsInAllPartitionLists() and recomputes it, but setting
the same bit again changes nothing.  When
UNSAFE_NOTIN_DISTINCTON_CLAUSE is already set, point 4 is skipped and
UNSAFE_NOTIN_PARTITIONBY_CLAUSE is left unset; but such a column is
already unsafe for pushdown via UNSAFE_NOTIN_DISTINCTON_CLAUSE, so the
outcome is unchanged.

To fix, test UNSAFE_NOTIN_PARTITIONBY_CLAUSE, matching the bit being
set and the pattern of the surrounding checks.

Back-patch to v15, where the buggy check was introduced.

Author: Richard Guo <guofenglinux@gmail.com>
Reviewed-by: Tender Wang <tndrwang@gmail.com>
Reviewed-by: David Rowley <dgrowleyml@gmail.com>
Discussion: https://postgr.es/m/CAMbWs49Q_xnF_P2QSUyDzJ34MnrO7dh-cUAaK2HJPgSgh88NcA@mail.gmail.com
Backpatch-through: 15
2026-06-03 09:39:41 +09:00
Michael Paquier
1e9bc4074b psql: Fix issues with deferred errors in pipelines
When an error is raised while processing a Sync message in a pipeline,
like a deferred constraint violation, the error was not associated with
the piped command and was not counted in available_results.  This caused
assertion failures in discardAbortedPipelineResults(), keeping an
incorrect state at pipeline exit, because the code assumed that the
number of available and requested results would always be positive,
expecting all the counters to be 0 at the end of a pipeline.

This commit switches discardAbortedPipelineResults() and
ExecQueryAndProcessResults() to take a softer approach when consuming
and draining the results after an error.  If there are still piped syncs
in the pipeline when it ends, we now attempt to consume them before
leaving the pipeline mode.

Alexander has been able to reach two assertion failures through his
testing.  While investigating this issue further, I have bumped into two
more.  Most of these cases are covered by the regression tests added in
this commit, plus some cases with mixes of pipelines, deferred errors
and results fetched.  Some of the tests discussed (like the backend
termination one) could not be included in this commit but have been
tested manually.  Another test scenario discussed involved the injection
of an error state in the backend, that was able to trick libpq
internally and put its queue out of sync.  This scenario is not going to
happen in practice, but if we were to do something about it we would
need to make libpq understand that it needs to fail in some cases but
not block.

Reported-by: Alexander Lakhin <exclusion@gmail.com>
Author: Michael Paquier <michael@paquier.xyz>
Discussion: https://postgr.es/m/19494-97a86d84fee71c47@postgresql.org
Backpatch-through: 18
2026-06-03 08:58:29 +09:00
Jacob Champion
380a8b2ea0 doc: Correct the timeline for OAuth's shutdown_cb
During original feature development, the OAuth validator shutdown
callback was invoked via before_shmem_exit(). That was changed to use a
reset callback before commit, but I forgot to update the documentation
for validator developers.

Correct this and backport to 18, where OAuth was introduced. The
callback is invoked whenever the server is "finished" with token
validation. (We make no stronger guarantees here, in the hopes that this
API might successfully navigate future multifactor authentication
support and/or changes to the server threading model.)

Reported-by: Zsolt Parragi <zsolt.parragi@percona.com>
Reviewed-by: Zsolt Parragi <zsolt.parragi@percona.com>
Reviewed-by: Chao Li <li.evan.chao@gmail.com>
Discussion: https://postgr.es/m/CAN4CZFOuMb_gnLvCwRdMybg_k8WRNJTjcij%2BPoQkuQHDUzxGWg%40mail.gmail.com
Backpatch-through: 18
2026-05-29 14:39:03 -07:00
Heikki Linnakangas
c8b4186d6e Use term "referenced" rather than "dependent" in dependency locking
Reported-by: Kyotaro Horiguchi <horikyota.ntt@gmail.com>
Discussion: https://www.postgresql.org/message-id/20260528.114608.488039299811669368.horikyota.ntt@gmail.com
Backpatch-through: 14
2026-05-28 21:29:10 +03:00
Andres Freund
c0bf1d89df Make stack depth check work with asan's use-after-return
With address sanitizer's stack-use-after-return check, stack variables are
moved to heap allocations, to allow detecting references to the memory at a
later time. That broke our stack-depth check, which is why we had to disable
detect_stack_use_after_return in CI. Luckily __builtin_frame_address() works
correctly, even under asan, so use that.

We started using __builtin_frame_address() with de447bb8e6, however as of
that commit we just used it for the stack base address, not for the value to
compare to the base address.  Now we use it for both.

When building without __builtin_frame_address() support, we continue to use
stack variables for the stack depth determination.

Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/2kk4z4odvuyrg7qlwjd7ft4eron4cle4btb33v4qatgsdkayir@gj6e62rgsel4
Backpatch-through: 14
2026-05-28 11:34:11 -04:00
Fujii Masao
e5d019fbdc postgres_fdw, dblink: Validate use_scram_passthrough values
The use_scram_passthrough option in postgres_fdw and dblink accepts
only boolean values. However, unlike other boolean options such as
keep_connections, its value was not previously validated.

As a result, commands such as
"CREATE SERVER ... OPTIONS (use_scram_passthrough 'invalid')"
could succeed unexpectedly.

This commit updates postgres_fdw and dblink to validate that
use_scram_passthrough is assigned a valid boolean value, and throw an
error for invalid input.

Backpatch to v18, where use_scram_passthrough was introduced.

Author: Fujii Masao <masao.fujii@gmail.com>
Reviewed-by: Ayush Tiwari <ayushtiwari.slg01@gmail.com>
Reviewed-by: Matheus Alcantara <matheusssilv97@gmail.com>
Discussion: https://postgr.es/m/CAHGQGwF+-k-Ehsu5W94ZP7GxS3wiBd+mi0PfGTdJ_i2Yr0zR3g@mail.gmail.com
Backpatch-through: 18
2026-05-28 20:58:45 +09:00
Masahiko Sawada
1a9b1cc18e Fix race between ProcSignalInit() and EmitProcSignalBarrier().
Previously, ProcSignalInit() read the global barrier generation before
publishing its PID into pss_pid. This created a race condition: a
process could initialize its local generation with an older global
value, while a concurrent EmitProcSignalBarrier() might skip that
process because its pss_pid was still zero. This resulted in
WaitForProcSignalBarrier() hanging indefinitely.

Fix this by publishing pss_pid before reading psh_barrierGeneration
with a memory barrier so that the store to pss_pid is ordered before
the load. A concurrent EmitProcSignalBarrier() then either observes
the published PID and signals this slot, or completes its generation
increment before we load it.
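
The ordering can be sketched with C11 atomics.  This is only an
illustration of the publish-then-read idea: DemoSlot, demo_slot_init()
and demo_emit_barrier() are hypothetical names, the emitter merely
counts the slots it would signal, and PostgreSQL uses its own atomics
and barrier primitives rather than <stdatomic.h>.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical stand-in for a per-process signal slot. */
typedef struct DemoSlot
{
    _Atomic uint32_t pid;           /* 0 means "slot not in use" */
    _Atomic uint64_t generation;    /* last barrier generation absorbed */
} DemoSlot;

static _Atomic uint64_t demo_global_generation;

static void
demo_slot_clear(DemoSlot *slot)
{
    atomic_init(&slot->pid, 0);
    atomic_init(&slot->generation, 0);
}

/* Initializer side: publish the PID first, then read the generation. */
static void
demo_slot_init(DemoSlot *slot, uint32_t pid)
{
    uint64_t    gen;

    assert(pid != 0);

    atomic_store_explicit(&slot->pid, pid, memory_order_relaxed);
    /* Full fence: the store above is ordered before the load below. */
    atomic_thread_fence(memory_order_seq_cst);
    gen = atomic_load_explicit(&demo_global_generation, memory_order_relaxed);
    atomic_store_explicit(&slot->generation, gen, memory_order_relaxed);
}

/*
 * Emitter side: bump the generation, then look for published PIDs.
 * Returns the number of slots that would be signalled.
 */
static int
demo_emit_barrier(DemoSlot *slots, int nslots)
{
    int         signalled = 0;

    atomic_fetch_add(&demo_global_generation, 1);
    for (int i = 0; i < nslots; i++)
    {
        if (atomic_load(&slots[i].pid) != 0)
            signalled++;
    }
    return signalled;
}
```

With this ordering, a slot that the emitter skips because it still saw
a zero PID is guaranteed to load the already-incremented generation, so
it has nothing left to absorb.  Reading the generation before storing
the PID allows both sides to miss each other.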

While this race has become more visible due to recent features using
signal barriers in more places (such as online wal_level changes), the
issue is theoretically present since signal barriers were introduced
to release smgr caches (e.g., in DROP DATABASE). v14 has the
procsignal barrier infrastructure but no in-tree caller that actually
emits a barrier, so the case is unreachable there.

This issue was also reported by buildfarm member flaviventris.

Reported-by: Melanie Plageman <melanieplageman@gmail.com>
Reviewed-by: Alexander Lakhin <exclusion@gmail.com>
Reviewed-by: Matthias van de Meent <boekewurm+postgres@gmail.com>
Discussion: https://postgr.es/m/CAEze2WgAJmWReDN7Chtba8Er2YBvKCoa0KVN25-1evnTrHsLyA@mail.gmail.com
Backpatch-through: 15
2026-05-27 16:25:59 -07:00
Heikki Linnakangas
c8cd3d6976 Avoid orphaned objects dependencies
Concurrent DDL can leave behind objects referencing other objects that
no longer exist. This can happen if an object is dropped, while a new
object that depends on it is created concurrently. For example:

session 1: BEGIN; CREATE FUNCTION myschema.myfunc() ...;
session 2: DROP SCHEMA myschema;
session 1: COMMIT;

DROP SCHEMA does check that there are no objects depending on the
schema being dropped, but it does not see objects being concurrently
created by other sessions. Even if it did, this scenario would still
fail:

session 1: BEGIN; DROP SCHEMA myschema;
session 2: CREATE FUNCTION myschema.myfunc() ...;
session 1: COMMIT;

When the DROP SCHEMA runs, the schema was empty, but the new function
is created in it before the dropping transaction completes. The CREATE
FUNCTION does not see that the schema is concurrently being dropped.

In both of these scenarios, the function is left behind in the schema
that no longer exists.

To fix, acquire AccessShareLock on all referenced objects when
recording dependencies. This conflicts with the AccessExclusiveLock
taken by DROP, preventing the race. After acquiring the lock, verify
that the object still exists, and if it was dropped concurrently,
report an error. We already had such a mechanism for shared
dependencies, but for some reason we didn't do it for in-database
dependencies.

Ideally the locks would be acquired much earlier when creating a new
object, but that will require modifying a lot of callers. This check
while recording the dependency is a nice wholesale protection, and
even if we change all the CREATE commands to acquire locks earlier,
it's still good to have this as a backstop to catch any cases where we
forgot to do so.

The patch adds a few tests for some cases that left behind orphaned
objects before this. It also adds a test for roles, which already had
such protection, although that test is partially disabled because the
error message includes an OID which is not predictable.

Author: Bertrand Drouvot <bertranddrouvot.pg@gmail.com>
Reviewed-by: Heikki Linnakangas <heikki.linnakangas@iki.fi>
Discussion: https://postgr.es/m/ZiYjn0eVc7pxVY45@ip-10-97-1-34.eu-west-3.compute.internal
Backpatch-through: 14
2026-05-27 18:36:28 +03:00
Heikki Linnakangas
f9d5a52da4 Don't try to record dependency on a dropped column's datatype
When creating a relation with a dropped column, we called
recordDependencyOn() also on the datatype of the dropped column, which
is always InvalidOid. In versions 15 and above, that was harmless
because recordDependencyOn() considers InvalidOid as a pinned object,
and skips over it. On version 14, isPinnedObject() does not consider
InvalidOid as pinned, so we created a bogus pg_depend entry with
refobjectid == 0.

As far as I can tell, the only case when AddNewAttributeTuples() is
called with dropped columns is when performing a table-rewriting ALTER
TABLE command. That temporarily creates a new relation with the same
columns, including dropped ones, then swaps the relations, and drops
the newly created table again. So even on version 14, the bogus
pg_depend entry was only on the transient relation that was dropped at
the end of the ALTER TABLE command, which was harmless.

Even though this is harmless, let's be tidy, similar to commit
713bce9484. The reason I noticed this now and why I backported this,
is because the next commit will add code to acquire locks on the
referenced objects, and we don't want to acquire a lock on InvalidOid.

Discussion: https://postgr.es/m/ZiYjn0eVc7pxVY45@ip-10-97-1-34.eu-west-3.compute.internal
Backpatch-through: 14
2026-05-27 18:36:25 +03:00