Commit graph

53911 commits

Author SHA1 Message Date
Thomas Munro
9b354cfc55 jit: No backport::SectionMemoryManager for LLVM 22.
LLVM 22 has the fix that we copied into our tree in commit 9044fc1d and
a new function to reach it[1][2], so we only need to use our copy for
Aarch64 + LLVM < 22.  The only change to the final version that our copy
didn't get is a new LLVM_ABI macro, but that isn't appropriate for us.
Our copy is hopefully now frozen and would only need maintenance if bugs
are found in the upstream code.

Non-Aarch64 systems now also use the new API with LLVM 22.  It allocates
all sections with one contiguous mmap() instead of one per
section.  We could have done that earlier, but commit 9044fc1d wanted to
limit the blast radius to the affected systems.  We might as well
benefit from that small improvement everywhere now that it is available
out of the box.

We can't delete our copy until LLVM 22 is our minimum supported version,
or we switch to the newer JITLink API for at least Aarch64.

[1] https://github.com/llvm/llvm-project/pull/71968
[2] https://github.com/llvm/llvm-project/pull/174307

Backpatch-through: 14
Discussion: https://postgr.es/m/CA%2BhUKGJTumad75o8Zao-LFseEbt%3DenbUFCM7LZVV%3Dc8yg2i7dg%40mail.gmail.com
2026-04-03 15:03:10 +13:00
Thomas Munro
b4e7cd428c jit: Stop emitting lifetime.end for LLVM 22.
The lifetime.end intrinsic can now only be used for stack memory
allocated with alloca[1][2][3].  We use it to tell LLVM about the
lifetime of function arguments/isnull values that we keep in palloc'd
memory, so that it can avoid spilling registers to memory.

We might need to rearrange things and put them on the stack, but that'll
take some research.  In the meantime, unbreak the build on LLVM 22.

[1] https://github.com/llvm/llvm-project/pull/149310
[2] https://llvm.org/docs/LangRef.html#llvm-lifetime-end-intrinsic
[3] https://llvm.org/docs/LangRef.html#i-alloca

Backpatch-through: 14
Reviewed-by: Matheus Alcantara <matheusssilv97@gmail.com> (earlier attempt)
Reviewed-by: Anthonin Bonnefoy <anthonin.bonnefoy@datadoghq.com> (earlier attempt)
Reviewed-by: Andres Freund <andres@anarazel.de> (earlier attempt)
Discussion: https://postgr.es/m/CA%2BhUKGJTumad75o8Zao-LFseEbt%3DenbUFCM7LZVV%3Dc8yg2i7dg%40mail.gmail.com
2026-04-02 15:55:46 +13:00
Nathan Bossart
9a53249c5b doc: Add missing description for DROP SUBSCRIPTION IF EXISTS.
Oversight in commit 665d1fad99.

Author: Peter Smith <smithpb2250@gmail.com>
Reviewed-by: Chao Li <li.evan.chao@gmail.com>
Discussion: https://postgr.es/m/CAHut%2BPv72haFerrCdYdmF6hu6o2jKcGzkXehom%2BsP-JBBmOVDg%40mail.gmail.com
Backpatch-through: 14
2026-04-01 09:48:48 -05:00
Tom Lane
7cd23aad2a Be more careful to preserve consistency of a tuplestore.
Several places in tuplestore.c would leave the tuplestore data
structure effectively corrupt if some subroutine were to throw
an error.  Notably, if WRITETUP() failed after some number of
successful calls within dumptuples(), the tuplestore would
contain some memtuples pointers that were apparently live
entries but in fact pointed to pfree'd chunks.

In most cases this sort of thing is fine because transaction
abort cleanup is not too picky about the contents of memory that
it's going to throw away anyway.  There's at least one exception
though: if a Portal has a holdStore, we're going to call
tuplestore_end() on that, even during transaction abort.
So it's not cool if that tuplestore is corrupt, and that means
tuplestore.c has to be more careful.

This oversight demonstrably leads to crashes in v15 and before,
if a holdable cursor fails to persist its data due to an undersized
temp_file_limit setting.  Very possibly the same thing can happen in
v16 and v17 as well, though the specific test case submitted failed
to fail there (cf. 095555daf).  The failure is accidentally dodged
as of v18 because 590b045c3 got rid of tuplestore_end's retail tuple
deletion loop.  Still, it seems unwise to permit tuplestores to become
internally inconsistent in any branch, so I've applied the same fix
across the board.

Since the known test case for this is rather expensive and doesn't
fail in recent branches, I've omitted it.

Bug: #19438
Reported-by: Dmitriy Kuzmin <kuzmin.db4@gmail.com>
Author: Tom Lane <tgl@sss.pgh.pa.us>
Reviewed-by: David Rowley <dgrowleyml@gmail.com>
Discussion: https://postgr.es/m/19438-9d37b179c56d43aa@postgresql.org
Backpatch-through: 14
2026-03-30 13:59:54 -04:00
David Rowley
6ce5c310ba Fix datum_image_*()'s inability to detect sign-extension variations
Functions such as hash_numeric() are not careful to use the correct
PG_RETURN_*() macro according to the return type of that function as
defined in pg_proc.  Because that function is meant to return int32,
when the hashed value exceeds 2^31, the 64-bit Datum value won't wrap to
a negative number, which means the Datum won't have the same value as it
would have had it been cast to int32 on a two's complement machine.  This
isn't harmless as both datum_image_eq() and datum_image_hash() may receive
a Datum that's been formed and deformed from a tuple in some cases, and
not in other cases.  When formed into a tuple, the Datum value will be
coerced into an integer according to the attlen as specified by the
TupleDesc.  This can result in two Datums that should be equal being
classed as not equal, which could result in (but not limited to) an error
such as:

ERROR:  could not find memoization table entry

Here we fix this by ensuring we cast the Datum value to a signed integer
according to the typLen specified in the datum_image_eq/datum_image_hash
function call before comparing or hashing.

Author: David Rowley <dgrowleyml@gmail.com>
Reported-by: Tender Wang <tndrwang@gmail.com>
Backpatch-through: 14
Discussion: https://postgr.es/m/CAHewXNmcXVFdB9_WwA8Ez0P+m_TQy_KzYk5Ri5dvg+fuwjD_yw@mail.gmail.com
2026-03-30 16:18:13 +13:00
Heikki Linnakangas
a6d03067ff Avoid memory leak on error while parsing pg_stat_statements dump file
By using palloc() instead of raw malloc().

Reported-by: Gaurav Singh <gaurav.singh@yugabyte.com>
Reviewed-by: Lukas Fittl <lukas@fittl.com>
Reviewed-by: Daniel Gustafsson <daniel@yesql.se>
Discussion: https://www.postgresql.org/message-id/CAEcQ1bYR9s4eQLFDjzzJHU8fj-MTbmRpW-9J-r2gsCn+HEsynw@mail.gmail.com
Backpatch-through: 14
2026-03-27 12:23:38 +02:00
Fujii Masao
bf7ecf3531 Fix premature NULL lag reporting in pg_stat_replication
pg_stat_replication is documented to keep the last measured lag values for
a short time after the standby catches up, and then set them to NULL when
there is no WAL activity. However, previously lag values could become NULL
prematurely even while WAL activity was ongoing, especially in logical
replication.

This happened because the code cleared lag when two consecutive reply messages
indicated that the apply location had caught up with the send location.
It did not verify that the reported positions were unchanged, so lag could be
cleared even when positions had advanced between messages. In logical
replication, where the apply location often quickly catches up, this issue was
more likely to occur.

This commit fixes the issue by clearing lag only when the standby reports that
it has fully replayed WAL (i.e., both flush and apply locations have caught up
with the send location) and the write/flush/apply positions remain unchanged
across two consecutive reply messages.

The second message with unchanged positions typically results from
wal_receiver_status_interval, so lag values are cleared after that interval
when there is no activity. This avoids showing stale lag data while preventing
premature NULL values.

Even with this fix, lag may rarely become NULL during activity if identical
position reports are sent repeatedly. Eliminating such duplicate messages
would address this fully, but that change is considered too invasive for stable
branches and will be handled in master only later.

Backpatch to all supported branches.

Author: Shinya Kato <shinya11.kato@gmail.com>
Reviewed-by: Chao Li <li.evan.chao@gmail.com>
Reviewed-by: Fujii Masao <masao.fujii@gmail.com>
Discussion: https://postgr.es/m/CAOzEurTzcUrEzrH97DD7+Yz=HGPU81kzWQonKZvqBwYhx2G9_A@mail.gmail.com
Backpatch-through: 14
2026-03-26 20:50:43 +09:00
John Naylor
4dc9346531 Fix copy-paste error in test_ginpostinglist
The check for a mismatch on the second decoded item pointer
was an exact copy of the first item pointer check, comparing
orig_itemptrs[0] with decoded_itemptrs[0] instead of orig_itemptrs[1]
with decoded_itemptrs[1].  The error message also reported (0, 1) as
the expected value instead of (blk, off).  As a result, any decoding
error in the second item pointer (where the varbyte delta encoding
is exercised) would go undetected.

This has been wrong since commit bde7493d1, so backpatch to all
supported versions.

Author: Jianghua Yang <yjhjstz@gmail.com>
Discussion: https://postgr.es/m/CAAZLFmSOD8R7tZjRLZsmpKtJLoqjgawAaM-Pne1j8B_Q2aQK8w@mail.gmail.com
Backpatch-through: 14
2026-03-24 17:17:47 +07:00
Heikki Linnakangas
e35e466f61 Fix multixact backwards-compatibility with CHECKPOINT race condition
If a CHECKPOINT record with nextMulti N is written to the WAL before
the CREATE_ID record for N, and N happens to be the first multixid on
an offset page, the backwards compatibility logic to tolerate WAL
generated by older minor versions (before commit 789d65364c) failed to
compensate for the missing XLOG_MULTIXACT_ZERO_OFF_PAGE record. In
that case, the latest_page_number was initialized at the start of WAL
replay to the page for nextMulti from the CHECKPOINT record, even if
we had not seen the CREATE_ID record for that multixid yet, which
fooled the backwards compatibility logic to think that the page was
already initialized.

To fix, track the last XLOG_MULTIXACT_ZERO_OFF_PAGE that we've seen
separately from latest_page_number. If we haven't seen any
XLOG_MULTIXACT_ZERO_OFF_PAGE records yet, use
SimpleLruDoesPhysicalPageExist() to check if the page needs to be
initialized.

Reported-by: duankunren.dkr <duankunren.dkr@alibaba-inc.com>
Analyzed-by: duankunren.dkr <duankunren.dkr@alibaba-inc.com>
Reviewed-by: Andrey Borodin <x4mmm@yandex-team.ru>
Reviewed-by: Kirill Reshke <reshkekirill@gmail.com>
Discussion: https://www.postgresql.org/message-id/c4ef1737-8cba-458e-b6fd-4e2d6011e985.duankunren.dkr@alibaba-inc.com
Backpatch-through: 14-18
2026-03-23 12:03:30 +02:00
Jeff Davis
c6f369e585 Fix dependency on FDW handler.
ALTER FOREIGN DATA WRAPPER could drop the dependency on the handler
function if it wasn't explicitly specified.

Reviewed-by: Nathan Bossart <nathandbossart@gmail.com>
Discussion: https://postgr.es/m/35c44a4b7fb76d35418c4d66b775a88f4ce60c86.camel@j-davis.com
Backpatch-through: 14
2026-03-19 15:01:45 -07:00
Fujii Masao
f154714646 Fix WAL flush LSN used by logical walsender during shutdown
Commit 6eedb2a5fd made the logical walsender call
XLogFlush(GetXLogInsertRecPtr()) to ensure that all pending WAL is flushed,
fixing a publisher shutdown hang. However, if the last WAL record ends at
a page boundary, GetXLogInsertRecPtr() can return an LSN pointing past
the page header, which can cause XLogFlush() to report an error.

A similar issue previously existed in the GiST code. Commit b1f14c9672
introduced GetXLogInsertEndRecPtr(), which returns a safe WAL insertion end
location (returning the start of the page when the last record ends at a page
boundary), and updated the GiST code to use it with XLogFlush().

This commit fixes the issue by making the logical walsender use
XLogFlush(GetXLogInsertEndRecPtr()) when flushing pending WAL during shutdown.

Backpatch to all supported versions.

Reported-by: Andres Freund <andres@anarazel.de>
Author: Anthonin Bonnefoy <anthonin.bonnefoy@datadoghq.com>
Reviewed-by: Fujii Masao <masao.fujii@gmail.com>
Discussion: https://postgr.es/m/vzguaguldbcyfbyuq76qj7hx5qdr5kmh67gqkncyb2yhsygrdt@dfhcpteqifux
Backpatch-through: 14
2026-03-17 08:12:46 +09:00
Tomas Vondra
ff78b8fac4 Tighten asserts on ParallelWorkerNumber
The comment about ParallelWorkerNumbr in parallel.c says:

  In parallel workers, it will be set to a value >= 0 and < the number
  of workers before any user code is invoked; each parallel worker will
  get a different parallel worker number.

However asserts in various places collecting instrumentation allowed
(ParallelWorkerNumber == num_workers). That would be a bug, as the value
is used as index into an array with num_workers entries.

Fixed by adjusting the asserts accordingly. Backpatch to all supported
versions.

Discussion: https://postgr.es/m/5db067a1-2cdf-4afb-a577-a04f30b69167@vondra.me
Reviewed-by: Bertrand Drouvot <bertranddrouvot.pg@gmail.com>
Backpatch-through: 14
2026-03-14 15:34:37 +01:00
Tomas Vondra
c0ffc725f8 Use GetXLogInsertEndRecPtr in gistGetFakeLSN
The function used GetXLogInsertRecPtr() to generate the fake LSN. Most
of the time this is the same as what XLogInsert() would return, and so
it works fine with the XLogFlush() call. But if the last record ends at
a page boundary, GetXLogInsertRecPtr() returns LSN pointing after the
page header. In such case XLogFlush() fails with errors like this:

  ERROR: xlog flush request 0/01BD2018 is not satisfied --- flushed only to 0/01BD2000

Such failures are very hard to trigger, particularly outside aggressive
test scenarios.

Fixed by introducing GetXLogInsertEndRecPtr(), returning the correct LSN
without skipping the header. This is the same as GetXLogInsertRecPtr(),
except that it calls XLogBytePosToEndRecPtr().

Initial investigation by me, root cause identified by Andres Freund.

This is a long-standing bug in gistGetFakeLSN(), probably introduced by
c6b92041d3 in PG13. Backpatch to all supported versions.

Reported-by: Peter Geoghegan <pg@bowt.ie>
Reviewed-by: Andres Freund <andres@anarazel.de>
Reviewed-by: Noah Misch <noah@leadboat.com>
Discussion: https://postgr.es/m/vf4hbwrotvhbgcnknrqmfbqlu75oyjkmausvy66ic7x7vuhafx@e4rvwavtjswo
Backpatch-through: 14
2026-03-13 23:26:49 +01:00
Michael Paquier
70f7c5badc xml2: Fix failure with xslt_process() under -fsanitize=undefined
The logic of xslt_process() has never considered the fact that
xsltSaveResultToString() would return NULL for an empty string (the
upstream code has always done so, with a string length of 0).  This
would cause memcpy() to be called with a NULL pointer, something
forbidden by POSIX.

Like 46ab07ffda and similar fixes, this is backpatched down to all the
supported branches, with a test case to cover this scenario.  An empty
string has been always returned in xml2 in this case, based on the
history of the module, so this is an old issue.

Reported-by: Alexander Lakhin <exclusion@gmail.com>
Discussion: https://postgr.es/m/c516a0d9-4406-47e3-9087-5ca5176ebcf9@gmail.com
Backpatch-through: 14
2026-03-13 16:06:51 +09:00
Fujii Masao
a84a9a4260 doc: Document IF NOT EXISTS option for ALTER FOREIGN TABLE ADD COLUMN.
Commit 2cd40adb85 added the IF NOT EXISTS option to ALTER TABLE ADD COLUMN.
This also enabled IF NOT EXISTS for ALTER FOREIGN TABLE ADD COLUMN,
but the ALTER FOREIGN TABLE documentation was not updated to mention it.

This commit updates the documentation to describe the IF NOT EXISTS option for
ALTER FOREIGN TABLE ADD COLUMN.

While updating that section, also this commit clarifies that the COLUMN keyword
is optional in ALTER FOREIGN TABLE ADD/DROP COLUMN. Previously, part of
the documentation could be read as if COLUMN were required.

This commit adds regression tests covering these ALTER FOREIGN TABLE syntaxes.

Backpatch to all supported versions.

Suggested-by: Fujii Masao <masao.fujii@gmail.com>
Author: Chao Li <lic@highgo.com>
Reviewed-by: Robert Treat <rob@xzilla.net>
Reviewed-by: Fujii Masao <masao.fujii@gmail.com>
Discussion: https://postgr.es/m/CAHGQGwFk=rrhrwGwPtQxBesbT4DzSZ86Q3ftcwCu3AR5bOiXLw@mail.gmail.com
Backpatch-through: 14
2026-03-09 18:25:20 +09:00
Michael Paquier
eb11d7a915 Fix size underestimation of DSA pagemap for odd-sized segments
When make_new_segment() creates an odd-sized segment, the pagemap was
only sized based on a number of usable_pages entries, forgetting that a
segment also contains metadata pages, and that the FreePageManager uses
absolute page indices that cover the entire segment.  This
miscalculation could cause accesses to pagemap entries to be out of
bounds.  During subsequent reuse of the allocated segment, allocations
landing on pages with indices higher than usable_pages could cause
out-of-bounds pagemap reads and/or writes.  On write, 'span' pointers
are stored into the data area, corrupting the allocated objects.  On
read (aka during a dsa_free), garbage is interpreted as a span pointer,
typically crashing the server in dsa_get_address().

The normal geometric path correctly sizes the pagemap for all pages in
the segment.  The odd-sized path needs to do the same, but it works
forward from usable_pages rather than backward from total_size.

This commit fixes the sizing of the odd-sized case by adding pagemap
entries for the metadata pages after the initial metadata_bytes
calculation, using an integer ceiling division to compute the exact
number of additional entries needed in one go, avoiding any iteration in
the calculation.

An assertion is added in the code path for odd-sized segments, ensuring
that the pagemap includes the metadata area, and that the result is
appropriately sized.

This problem would show up depending on the size requested for the
allocation of a DSA segment.  The reporter has noticed this issue when a
parallel hash join makes a DSA allocation large enough to trigger the
odd-sized segment path, but it could happen for anything that does a DSA
allocation.

A regression test is added to test_dsa, down to v17 where the test
module has been introduced.  This adds a set of cheap tests to check the
problem, the new assertion being useful for this purpose.  Sami has
proposed a test that took a longer time than what I have done here; the
test committed is faster and good enough to check the odd-sized
allocation path.

Author: Paul Bunn <paul.bunn@icloud.com>
Reviewed-by: Sami Imseih <samimseih@gmail.com>
Reviewed-by: Chao Li <li.evan.chao@gmail.com>
Reviewed-by: Michael Paquier <michael@paquier.xyz>
Discussion: https://postgr.es/m/044401dcabac$fe432490$fac96db0$@icloud.com
Backpatch-through: 14
2026-03-09 13:46:38 +09:00
Fujii Masao
3bf6f22ce1 Fix publisher shutdown hang caused by logical walsender busy loop.
Previously, when logical replication was running, shutting down
the publisher could cause the logical walsender to enter a busy loop
and prevent the publisher from completing shutdown.

During shutdown, the logical walsender waits for all pending WAL
to be written out. However, some WAL records could remain unflushed,
causing the walsender to wait indefinitely.

The issue occurred because the walsender used XLogBackgroundFlush() to
flush pending WAL. This function does not guarantee that all WAL is written.
For example, WAL generated by a transaction without an assigned
transaction ID that aborts might not be flushed.

This commit fixes the bug by making the logical walsender call XLogFlush()
instead, ensuring that all pending WAL is written and preventing
the busy loop during shutdown.

Backpatch to all supported versions.

Author: Anthonin Bonnefoy <anthonin.bonnefoy@datadoghq.com>
Reviewed-by: Alexander Lakhin <exclusion@gmail.com>
Reviewed-by: Fujii Masao <masao.fujii@gmail.com>
Discussion: https://postgr.es/m/CAO6_Xqo3co3BuUVEVzkaBVw9LidBgeeQ_2hfxeLMQcXwovB3GQ@mail.gmail.com
Backpatch-through: 14
2026-03-06 16:45:05 +09:00
Fujii Masao
921f4a15c1 doc: Clarify that COLUMN is optional in ALTER TABLE ... ADD/DROP COLUMN.
In ALTER TABLE ... ADD/DROP COLUMN, the COLUMN keyword is optional. However,
part of the documentation could be read as if COLUMN were required, which may
mislead users about the command syntax.

This commit updates the ALTER TABLE documentation to clearly state that
COLUMN is optional for ADD and DROP.

Also this commit adds regression tests covering ALTER TABLE ... ADD/DROP
without the COLUMN keyword.

Backpatch to all supported versions.

Author: Chao Li <lic@highgo.com>
Reviewed-by: Robert Treat <rob@xzilla.net>
Reviewed-by: Fujii Masao <masao.fujii@gmail.com>
Discussion: https://postgr.es/m/CAEoWx2n6ShLMOnjOtf63TjjgGbgiTVT5OMsSOFmbjGb6Xue1Bw@mail.gmail.com
Backpatch-through: 14
2026-03-05 12:58:51 +09:00
Michael Paquier
1346794feb Fix rare instability in recovery TAP test 004_timeline_switch
This fixes a problem similar to ad8c86d22c.  In this case, the test
could fail under the following circumstances:
- The primary is stopped with teardown_node(), meaning that it may not
be able to send all its WAL records to standby_1 and standby_2.
- If standby_2 receives more records than standby_1, attempting to
reconnect standby_2 to the promoted standby_1 would fail because of a
timeline fork.

This race condition is fixed with a simple trick: instead of tearing
down the primary, it is stopped cleanly so as all the WAL records of the
primary are received and flushed by both standby_1 and standby_2.  Once
we do that, there is no need for a wait_for_catchup() before stopping
the node.  The test wants to check that a timeline jump can be achieved
when reconnecting a standby to a promoted standby in the same cluster,
hence an immediate stop of the primary is not required.

This failure is harder to reach than the previous instability of
009_twophase, still the buildfarm has been able to detect this failure
at least once.  I have tried Alexander Lakhin's test trick with the
bgwriter and very aggressive standby snapshots, but I could not
reproduce it directly.  It is reachable, as the buildfarm has proved.

Backpatch down to all supported branches, and this problem can lead to
spurious failures in the buildfarm.

Discussion: https://postgr.es/m/493401a8-063f-436a-8287-a235d9e065fc@gmail.com
Backpatch-through: 14
2026-03-05 10:06:08 +09:00
Álvaro Herrera
e0a9c64635
Don't malloc(0) in EventTriggerCollectAlterTSConfig
Author: Florin Irion <florin.irion@enterprisedb.com>
Discussion: https://postgr.es/m/c6fff161-9aee-4290-9ada-71e21e4d84de@gmail.com
2026-03-04 15:04:53 +01:00
Heikki Linnakangas
ffe53037df Skip prepared_xacts test if max_prepared_transactions < 2
This reduces maintenance overhead, as we no longer need to update the
dummy expected output file every time the .sql file changes.

Discussion: https://www.postgresql.org/message-id/1009073.1772551323@sss.pgh.pa.us
Backpatch-through: 14
2026-03-04 11:21:48 +02:00
Michael Paquier
bce98e49e8 Fix rare instability in recovery TAP test 009_twophase
The phase of the test where we want to check that 2PC transactions
prepared on a primary can be committed on a promoted standby relied on
an immediate stop of the primary.  This logic has a race condition: it
could be possible that some records (most likely standby snapshot
records) are generated on the primary before it finishes its shutdown,
without the promoted standby know about them.  When the primary is
recycled as new standby, the test could fail because of a timeline fork
as an effect of these extra records.

This fix takes care of the instability by doing a clean stop of the
primary instead of a teardown (aka immediate stop), so as all records
generated on the primary are sent to the promoted standby and flushed
there.  There is no need for a teardown of the primary in this test
scenario: the commit of 2PC transactions on a promoted standby do not
care about the state of the primary, only of the standby.

This race is very hard to hit in practice, even slow buildfarm members
like skink have a very low rate of reproduction.  Alexander Lakhin has
come up with a recipe to improve the reproduction rate a lot:
- Enable -DWAL_DEBUG.
- Patch the bgwriter so as standby snapshots are generated every
milliseconds.
- Run 009_twophase tests under heavy parallelism.

With this method, the failure appears after a couple of iterations.
With the fix in place, I have been able to run more than 50 iterations
of the parallel test sequence, without seeing a failure.

Issue introduced in 30820982b2, due to a copy-pasto coming from the
surrounding tests.  Thanks also to Hayato Kuroda for digging into the
details of the failure.  He has proposed a fix different than the one of
this commit.  Unfortunately, it relied on injection points, feature only
available in v17.  The solution of this commit is simpler, and can be
applied to v14~v16.

Reported-by: Alexander Lakhin <exclusion@gmail.com>
Discussion: https://postgr.es/m/b0102688-6d6c-c86a-db79-e0e91d245b1a@gmail.com
Backpatch-through: 14
2026-03-04 16:31:05 +09:00
Fujii Masao
7ab6ac1d7f doc: Clarify that empty COMMENT string removes the comment.
Clarify the documentation of COMMENT ON to state that specifying an empty
string is treated as NULL, meaning that the comment is removed.

This makes the behavior explicit and avoids possible confusion about how
empty strings are handled.

Also adds regress test cases that use empty string to remove a comment.

Backpatch to all supported versions.

Author: Chao Li <lic@highgo.com>
Reviewed-by: Ashutosh Bapat <ashutosh.bapat.oss@gmail.com>
Reviewed-by: David G. Johnston <david.g.johnston@gmail.com>
Reviewed-by: Shengbin Zhao <zshengbin91@gmail.com>
Reviewed-by: Jim Jones <jim.jones@uni-muenster.de>
Reviewed-by: zhangqiang <zhang_qiang81@163.com>
Reviewed-by: Fujii Masao <masao.fujii@gmail.com>
Discussion: https://postgr.es/m/26476097-B1C1-4BA8-AA92-0AD0B8EC7190@gmail.com
Backpatch-through: 14
2026-03-03 14:47:41 +09:00
Michael Paquier
858c83b3e6 test_custom_types: Test module with fancy custom data types
This commit adds a new test module called "test_custom_types", that can
be used to stress code paths related to custom data type
implementations.

Currently, this is used as a test suite to validate the set of fixes
done in 3b7a6fa157, that requires some typanalyze callbacks that can
force very specific backend behaviors, as of:
- typanalyze callback that returns "false" as status, to mark a failure
in computing statistics.
- typanalyze callback that returns "true" but let's the backend know
that no interesting stats could be computed, with stats_valid set to
"false".

This could be extended more in the future if more problems are found.
For simplicity, the module uses a fake int4 data type, that requires a
btree operator class to be usable with extended statistics.  The type is
created by the extension, and its properties are altered in the test.

Like 3b7a6fa157, this module is backpatched down to v14, for coverage
purposes.

Author: Michael Paquier <michael@paquier.xyz>
Reviewed-by: Chao Li <li.evan.chao@gmail.com>
Discussion: https://postgr.es/m/aaDrJsE1I5mrE-QF@paquier.xyz
Backpatch-through: 14
2026-03-02 11:10:41 +09:00
Michael Paquier
038c7d4a3b Fix set of issues with extended statistics on expressions
This commit addresses two defects regarding extended statistics on
expressions:
- When building extended statistics in lookup_var_attr_stats(), the call
to examine_attribute() did not account for the possibility of a NULL
return value.  This can happen depending on the behavior of a typanalyze
callback — for example, if the callback returns false, if no rows are
sampled, or if no statistics are computed.  In such cases, the code
attempted to build MCV, dependency, and ndistinct statistics using a
NULL pointer, incorrectly assuming valid statistics were available,
which could lead to a server crash.
- When loading extended statistics for expressions,
statext_expressions_load() did not account for NULL entries in the
pg_statistic array storing expression statistics.  Such NULL entries can
be generated when statistics collection fails for an expression, as may
occur during the final step of serialize_expr_stats().  An extended
statistics object defining N expressions requires N corresponding
elements in the pg_statistic array stored for the expressions, and some
of these elements can be NULL.  This situation is reachable when a
typanalyze callback returns true, but sets stats_valid to indicate that
no useful statistics could be computed.

While these scenarios cannot occur with in-core typanalyze callbacks, as
far as I have analyzed, they can be triggered by custom data types with
custom typanalyze implementations, at least.

No tests are added in this commit.  A follow-up commit will introduce a
test module that can be extended to cover similar edge cases if
additional issues are discovered.  This takes care of the core of the
problem.

Attribute and relation statistics already offer similar protections:
- ANALYZE detects and skips the build of invalid statistics.
- Invalid catalog data is handled defensively when loading statistics.

This issue exists since the support for extended statistics on
expressions has been added, down to v14 as of a4d75c86bf.  Backpatch
to all supported stable branches.

Author: Michael Paquier <michael@paquier.xyz>
Reviewed-by: Corey Huinker <corey.huinker@gmail.com>
Reviewed-by: Chao Li <li.evan.chao@gmail.com>
Discussion: https://postgr.es/m/aaDrJsE1I5mrE-QF@paquier.xyz
Backpatch-through: 14
2026-03-02 09:38:47 +09:00
Etsuro Fujita
e72bde9017 postgres_fdw: Fix thinko in comment for UserMappingPasswordRequired().
This commit also rephrases this comment to improve readability.

Oversight in commit 6136e94dc.

Reported-by: Etsuro Fujita <etsuro.fujita@gmail.com>
Author: Andreas Karlsson <andreas@proxel.se>
Co-authored-by: Etsuro Fujita <etsuro.fujita@gmail.com>
Discussion: https://postgr.es/m/CAPmGK16pDnM_wU3kmquPj-M9MYqG3y0BdntRZ0eytqbCaFY3WQ%40mail.gmail.com
Backpatch-through: 14
2026-02-27 17:05:06 +09:00
Jeff Davis
058710d415 Fix more multibyte issues in ltree.
Commit 84d5efa7e3 missed some multibyte issues caused by short-circuit
logic in the callers. The callers assumed that if the predicate string
is longer than the label string, then it couldn't possibly be a match,
but it can be when using case-insensitive matching (LVAR_INCASE) if
casefolding changes the byte length.

Fix by refactoring to get rid of the short-circuit logic as well as
the function pointer, and consolidate the logic in a replacement
function ltree_label_match().

Discussion: https://postgr.es/m/02c6ef6cf56a5013ede61ad03c7a26affd27d449.camel@j-davis.com
Backpatch-through: 14
2026-02-26 12:26:32 -08:00
Tom Lane
746dae69c3 Fix Solution.pm for change in pg_config.h contents.
In commits 1d97e4788 et al, I forgot that pre-v17 branches
require updating Solution.pm when changing the set of
symbols generated in pg_config.h.  Per buildfarm.
2026-02-26 12:26:52 -05:00
Tom Lane
231570d335 Use CXXFLAGS instead of CFLAGS for linking C++ code
Otherwise, this would break if using C and C++ compilers from
different families and they understand different options.  It already
used the right flags for compiling, this is only for linking.  Also,
the meson setup already did this correctly.

Back-patch of v18 commit 365b5a345 into older supported branches.
At the time we were only aware of trouble in v18, but as shown
by buildfarm member siren, older branches can hit the problem too.

Reported-by: Tom Lane <tgl@sss.pgh.pa.us>
Author: Peter Eisentraut <peter@eisentraut.org>
Discussion: https://www.postgresql.org/message-id/228700.1722717983@sss.pgh.pa.us
Discussion: https://postgr.es/m/3109540.1771698685@sss.pgh.pa.us
Backpatch-through: 14-17
2026-02-26 12:06:58 -05:00
Noah Misch
e056563066 EUC_CN, EUC_JP, EUC_KR, EUC_TW: Skip U+00A0 tests instead of failing.
Settings that ran the new test euc_kr.sql to completion would fail these
older src/pl tests.  Use alternative expected outputs, for which psql
\gset and \if have reduced the maintenance burden.  This fixes
"LANG=ko_KR.euckr LC_MESSAGES=C make check-world".  (LC_MESSAGES=C fixes
IO::Pty usage in tests 010_tab_completion and 001_password.)  That file
is new in commit c67bef3f32.  Back-patch
to v14, like that commit.

Discussion: https://postgr.es/m/20260217184758.da.noahmisch@microsoft.com
Backpatch-through: 14
2026-02-25 18:13:45 -08:00
Fujii Masao
ec84a1f16f doc: Clarify INCLUDING COMMENTS behavior in CREATE TABLE LIKE.
The documentation for the INCLUDING COMMENTS option of the LIKE clause
in CREATE TABLE was inaccurate and incomplete. It stated that comments for
copied columns, constraints, and indexes are copied, but regarding comments
on constraints in reality only comments on CHECK and NOT NULL constraints
are copied; comments on other constraints (such as primary keys) are not.
In addition, comments on extended statistics are copied, but this was not
documented.

The CREATE FOREIGN TABLE documentation had a similar omission: comments
on extended statistics are also copied, but this was not mentioned.

This commit updates the documentation to clarify the actual behavior.
The CREATE TABLE reference now specifies that comments on copied columns,
CHECK constraints, NOT NULL constraints, indexes, and extended statistics are
copied. The CREATE FOREIGN TABLE reference now notes that comments on
extended statistics are copied as well.

Backpatch to all supported versions. Documentation updates related to
CREATE FOREIGN TABLE LIKE and NOT NULL constraint comment copying are
not applied to v17 and earlier, since those features were introduced in v18.

Author: Fujii Masao <masao.fujii@gmail.com>
Reviewed-by: Matheus Alcantara <matheusssilv97@gmail.com>
Discussion: https://postgr.es/m/CAHGQGwHSOSGcaYDvHF8EYCUCfGPjbRwGFsJ23cx5KbJ1X6JouQ@mail.gmail.com
Backpatch-through: 14
2026-02-26 09:03:39 +09:00
Fujii Masao
5d2dec77ef Fix ProcWakeup() resetting wrong waitStart field.
Previously, when one process woke another that was waiting on a lock,
ProcWakeup() incorrectly cleared its own waitStart field (i.e.,
MyProc->waitStart) instead of that of the process being awakened.
As a result, the awakened process retained a stale lock-wait start timestamp.

This did not cause user-visible issues. pg_locks.waitstart was reported as
NULL for the awakened process (i.e., when pg_locks.granted is true),
regardless of the waitStart value.

This bug was introduced by commit 46d6e5f567.

This commit fixes this by resetting the waitStart field of the process
being awakened in ProcWakeup().

Backpatch to all supported branches.

Reported-by: Chao Li <li.evan.chao@gmail.com>
Author: Chao Li <li.evan.chao@gmail.com>
Reviewed-by: ji xu <thanksgreed@gmail.com>
Reviewed-by: Álvaro Herrera <alvherre@kurilemu.de>
Discussion: https://postgr.es/m/537BD852-EC61-4D25-AB55-BE8BE46D07D7@gmail.com
Backpatch-through: 14
2026-02-26 08:51:05 +09:00
Tom Lane
ff9bd96754 Allow PG_PRINTF_ATTRIBUTE to be different in C and C++ code.
Although clang claims to be compatible with gcc's printf format
archetypes, this appears to be a falsehood: it likes __syslog__
(which gcc does not, on most platforms) and doesn't accept
gnu_printf.  This means that if you try to use gcc with clang++
or clang with g++, you get compiler warnings when compiling
printf-like calls in our C++ code.  This has been true for quite
awhile, but it's gotten more annoying with the recent appearance
of several buildfarm members that are configured like this.

To fix, run separate probes for the format archetype to use with the
C and C++ compilers, and conditionally define PG_PRINTF_ATTRIBUTE
depending on __cplusplus.

(We could alternatively insist that you not mix-and-match C and
C++ compilers; but if the case works otherwise, this is a poor
reason to insist on that.)

This commit back-patches 0909380e4 into supported branches.

Discussion: https://postgr.es/m/986485.1764825548@sss.pgh.pa.us
Discussion: https://postgr.es/m/3988414.1771950285@sss.pgh.pa.us
Backpatch-through: 14-18
2026-02-25 11:57:26 -05:00
Tom Lane
244c047205 Fix some cases of indirectly casting away const.
Newest versions of gcc+glibc are able to detect cases where code
implicitly casts away const by assigning the result of strchr() or
a similar function applied to a "const char *" value to a target
variable that's just "char *".  This of course creates a hazard of
not getting a compiler warning about scribbling on a string one was
not supposed to, so fixing up such cases is good.

This patch fixes a dozen or so places where we were doing that.
Most are trivial additions of "const" to the target variable,
since no actually-hazardous change was occurring.

Thanks to Bertrand Drouvot for finding a couple more spots than
I had.

This commit back-patches relevant portions of 8f1791c61 and
9f7565c6c into supported branches.  However, there are two
places in ecpg (in v18 only) where a proper fix is more
complicated than seems appropriate for a back-patch.  I opted
to silence those two warnings by adding casts.

Author: Tom Lane <tgl@sss.pgh.pa.us>
Reviewed-by: Bertrand Drouvot <bertranddrouvot.pg@gmail.com>
Discussion: https://postgr.es/m/1324889.1764886170@sss.pgh.pa.us
Discussion: https://postgr.es/m/3988414.1771950285@sss.pgh.pa.us
Backpatch-through: 14-18
2026-02-25 11:19:50 -05:00
Tom Lane
5d311d6a84 Stabilize output of new isolation test insert-conflict-do-update-4.
The test added by commit 4b760a181 assumed that a table's physical
row order would be predictable after an UPDATE.  But a non-heap table
AM might produce some other order.  Even with heap AM, the assumption
seems risky; compare a3fd53bab for instance.  Adding an ORDER BY is
cheap insurance and doesn't break any goal of the test.

Author: Pavel Borisov <pashkin.elfe@gmail.com>
Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/CALT9ZEHcE6tpvumScYPO6pGk_ASjTjWojLkodHnk33dvRPHXVw@mail.gmail.com
Backpatch-through: 14
2026-02-25 10:51:42 -05:00
Jacob Champion
c47744ede0 pg_upgrade: Use max_protocol_version=3.0 for older servers
The grease patch in 4966bd3ed found its first problem: prior to the
February 2018 patch releases, no server knew how to negotiate protocol
versions, so pg_upgrade needs to take that into account when speaking to
those older servers.

This will be true even after the grease feature is reverted; we don't
need anyone to trip over this again in the future. Backpatch so that all
supported versions of pg_upgrade can gracefully handle an update to the
default protocol version. (This is needed for any distributions that
link older binaries against newer libpqs, such as Debian.) Branches
prior to 18 need an additional version check, for the existence of
max_protocol_version.

Per buildfarm member crake.

Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/CAOYmi%2B%3D4QhCjssfNEoZVK8LPtWxnfkwT5p-PAeoxtG9gpNjqOQ%40mail.gmail.com
Backpatch-through: 14
2026-02-24 14:01:58 -08:00
Tom Lane
966473719c Stamp 14.22. 2026-02-23 17:03:30 -05:00
Peter Eisentraut
7917762558 Translation updates
Source-Git-URL: https://git.postgresql.org/git/pgtranslation/messages.git
Source-Git-Hash: abd36b494cbc149993ae9127e3ce4d0e9c10ed7a
2026-02-23 14:06:33 +01:00
Tom Lane
f4f0be89f0 Release notes for 18.3, 17.9, 16.13, 15.17, 14.22. 2026-02-22 12:22:41 -05:00
Thomas Munro
88fec079f1 Fix test_valid_server_encoding helper function.
Commit c67bef3f32 introduced this test helper function for use by
src/test/regress/sql/encoding.sql, but its logic was incorrect.  It
confused an encoding ID for a boolean so it gave the wrong results for
some inputs, and also forgot the usual return macro.  The mistake didn't
affect values actually used in the test, so there is no change in
behavior.

Also drop it and another missed function at the end of the test, for
consistency.

Backpatch-through: 14
Author: Zsolt Parragi <zsolt.parragi@percona.com>
2026-02-17 16:22:36 +13:00
Noah Misch
a0769e74d6 Suppress new "may be used uninitialized" warning.
Various buildfarm members, having compilers like gcc 8.5 and 6.3, fail
to deduce that text_substring() variable "E" is initialized if
slice_size!=-1.  This suppression approach quiets gcc 8.5; I did not
reproduce the warning elsewhere.  Back-patch to v14, like commit
9f4fd119b2.

Reported-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/1157953.1771266105@sss.pgh.pa.us
Backpatch-through: 14
2026-02-16 18:05:03 -08:00
Michael Paquier
f604cc695c hstore: Fix NULL pointer dereference with receive function
The receive function of hstore was not able to handle correctly
duplicate key values when a new duplicate links to a NULL value, where a
pfree() could be attempted on a NULL pointer, crashing due to a pointer
dereference.

This problem would happen for a COPY BINARY, when stacking values like
that:
aa => 5
aa => null

The second key/value pair is discarded and pfree() calls are attempted
on its key and its value, leading to a pointer dereference for the value
part as the value is NULL.  The first key/value pair takes priority when
a duplicate is found.

Per offline report.

Reported-by: "Anemone" <vergissmeinnichtzh@gmail.com>
Reported-by: "A1ex" <alex000young@gmail.com>
Backpatch-through: 14
2026-02-17 08:41:37 +09:00
Heikki Linnakangas
547a8aaa7d Don't reset 'latest_page_number' when replaying multixid truncation
'latest_page_number' is set to the correct value, according to
nextOffset, early at system startup. Contrary to the comment, it hence
should be set up correctly by the time we get to WAL replay.

This fixes a failure to replay WAL generated on older minor versions,
before commit 789d65364c (18.2, 17.8, 16.12, 15.16, 14.21). The
failure occurs after a truncation record has been replayed and looks
like this:

    FATAL:  could not access status of transaction 858112
    DETAIL:  Could not read from file "pg_multixact/offsets/000D" at offset 24576: read too few bytes.
    CONTEXT:  WAL redo at 3/2A3AB408 for MultiXact/CREATE_ID: 858111 offset 6695072 nmembers 5: 1048228 (sh) 1048271 (keysh) 1048316 (sh) 1048344 (keysh) 1048370 (sh)

Reported-by: Sebastian Webber <sebastian@swebber.me>
Reviewed-by: Andrey Borodin <x4mmm@yandex-team.ru>
Reviewed-by: Kirill Reshke <reshkekirill@gmail.com>
Discussion: https://www.postgresql.org/message-id/20260214090150.GC2297@p46.dedyn.io;lightning.p46.dedyn.io
Backpatch-through: 14-18
2026-02-16 17:20:56 +02:00
Michael Paquier
c60a582044 pgcrypto: Tweak error message for incorrect session key length
The error message added in 379695d3cc referred to the public key being
too long.  This is confusing as it is in fact the session key included
in a PGP message which is too long.  This is harmless, but let's be
precise about what is wrong.

Per offline report.

Reported-by: Zsolt Parragi <zsolt.parragi@percona.com>
Backpatch-through: 14
2026-02-16 12:18:34 +09:00
Noah Misch
14b1fd6176 Fix SUBSTRING() for toasted multibyte characters.
Commit 1e7fe06c10 changed
pg_mbstrlen_with_len() to ereport(ERROR) if the input ends in an
incomplete character.  Most callers want that.  text_substring() does
not.  It detoasts the most bytes it could possibly need to get the
requested number of characters.  For example, to extract up to 2 chars
from UTF8, it needs to detoast 8 bytes.  In a string of 3-byte UTF8
chars, 8 bytes spans 2 complete chars and 1 partial char.

Fix this by replacing this pg_mbstrlen_with_len() call with a string
traversal that differs by stopping upon finding as many chars as the
substring could need.  This also makes SUBSTRING() stop raising an
encoding error if the incomplete char is past the end of the substring.
This is consistent with the general philosophy of the above commit,
which was to raise errors on a just-in-time basis.  Before the above
commit, SUBSTRING() never raised an encoding error.

SUBSTRING() has long been detoasting enough for one more char than
needed, because it did not distinguish exclusive and inclusive end
position.  For avoidance of doubt, stop detoasting extra.

Back-patch to v14, like the above commit.  For applications using
SUBSTRING() on non-ASCII column values, consider applying this to your
copy of any of the February 12, 2026 releases.

Reported-by: SATŌ Kentarō <ranvis@gmail.com>
Reviewed-by: Thomas Munro <thomas.munro@gmail.com>
Bug: #19406
Discussion: https://postgr.es/m/19406-9867fddddd724fca@postgresql.org
Backpatch-through: 14
2026-02-14 12:16:21 -08:00
Noah Misch
44fc85bbf9 pg_mblen_range, pg_mblen_with_len: Valgrind after encoding ereport.
The prior order caused spurious Valgrind errors.  They're spurious
because the ereport(ERROR) non-local exit discards the pointer in
question.  pg_mblen_cstr() ordered the checks correctly, but these other
two did not.  Back-patch to v14, like commit
1e7fe06c10.

Reviewed-by: Thomas Munro <thomas.munro@gmail.com>
Discussion: https://postgr.es/m/20260214053821.fa.noahmisch@microsoft.com
Backpatch-through: 14
2026-02-14 12:16:21 -08:00
Tom Lane
2b93d38205 Fix plpgsql's handling of "return simple_record_variable".
If the variable's value is null, exec_stmt_return() missed filling
in estate->rettype.  This is a pretty old bug, but we'd managed not
to notice because that value isn't consulted for a null result ...
unless we have to cast it to a domain.  That case led to a failure
with "cache lookup failed for type 0".

The correct way to assign the data type is known by exec_eval_datum.
While we could copy-and-paste that logic, it seems like a better
idea to just invoke exec_eval_datum, as the ROW case already does.

Reported-by: Pavel Stehule <pavel.stehule@gmail.com>
Author: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/CAFj8pRBT_ahexDf-zT-cyH8bMR_qcySKM8D5nv5MvTWPiatYGA@mail.gmail.com
Backpatch-through: 14
2026-02-11 16:53:14 -05:00
Heikki Linnakangas
82b495cdd7 Fix pg_stat_get_backend_wait_event() for aux processes
The pg_stat_activity view shows information for aux processes, but the
pg_stat_get_backend_wait_event() and
pg_stat_get_backend_wait_event_type() functions did not. To fix, call
AuxiliaryPidGetProc(pid) if BackendPidGetProc(pid) returns NULL, like
we do in pg_stat_get_activity().

In version 17 and above, it's a little silly to use those functions
when we already have the ProcNumber at hand, but it was necessary
before v17 because the backend ID was different from ProcNumber. I
have other plans for wait_event_info on master, so it doesn't seem
worth applying a different fix on different versions now.

Reviewed-by: Sami Imseih <samimseih@gmail.com>
Reviewed-by: Chao Li <li.evan.chao@gmail.com>
Reviewed-by: Kyotaro Horiguchi <horikyota.ntt@gmail.com>
Discussion: https://www.postgresql.org/message-id/c0320e04-6e85-4c49-80c5-27cfb3a58108@iki.fi
Backpatch-through: 14
2026-02-11 18:51:32 +02:00
Tom Lane
ec9f0cbe01 Further stabilize a postgres_fdw test case.
The buildfarm occasionally shows a variant row order in the output
of this UPDATE ... RETURNING, implying that the preceding INSERT
dropped one of the rows into some free space within the table rather
than appending them all at the end.  It's not entirely clear why that
happens some times and not other times, but we have established that
it's affected by concurrent activity in other databases of the
cluster.  In any case, the behavior is not wrong; the test is at fault
for presuming that a seqscan will give deterministic row ordering.
Add an ORDER BY atop the update to stop the buildfarm noise.

The buildfarm seems to have shown this only in v18 and master
branches, but just in case the cause is older, back-patch to
all supported branches.

Discussion: https://postgr.es/m/3866274.1770743162@sss.pgh.pa.us
Backpatch-through: 14
2026-02-11 11:03:01 -05:00
Dean Rasheed
3586c5e789 doc: Mention all SELECT privileges required by INSERT ... ON CONFLICT.
On the INSERT page, mention that SELECT privileges are also required
for any columns mentioned in the arbiter clause, including those
referred to by the constraint, and clarify that this applies to all
forms of ON CONFLICT, not just ON CONFLICT DO UPDATE.

Author: Dean Rasheed <dean.a.rasheed@gmail.com>
Reviewed-by: Viktor Holmberg <v@viktorh.net>
Discussion: https://postgr.es/m/CAEZATCXGwMQ+x00YY9XYG46T0kCajH=21QaYL9Xatz0dLKii+g@mail.gmail.com
Backpatch-through: 14
2026-02-11 10:53:03 +00:00