LLVM 22 has the fix that we copied into our tree in commit 9044fc1d and
a new function to reach it[1][2], so we only need to use our copy for
Aarch64 + LLVM < 22. The only change to the final version that our copy
didn't get is a new LLVM_ABI macro, but that isn't appropriate for us.
Our copy is hopefully now frozen and would only need maintenance if bugs
are found in the upstream code.
Non-Aarch64 systems now also use the new API with LLVM 22. It allocates
all sections with one contiguous mmap() instead of one per
section. We could have done that earlier, but commit 9044fc1d wanted to
limit the blast radius to the affected systems. We might as well
benefit from that small improvement everywhere now that it is available
out of the box.
We can't delete our copy until LLVM 22 is our minimum supported version,
or we switch to the newer JITLink API for at least Aarch64.
[1] https://github.com/llvm/llvm-project/pull/71968
[2] https://github.com/llvm/llvm-project/pull/174307
Backpatch-through: 14
Discussion: https://postgr.es/m/CA%2BhUKGJTumad75o8Zao-LFseEbt%3DenbUFCM7LZVV%3Dc8yg2i7dg%40mail.gmail.com
Several places in tuplestore.c would leave the tuplestore data
structure effectively corrupt if some subroutine were to throw
an error. Notably, if WRITETUP() failed after some number of
successful calls within dumptuples(), the tuplestore would
contain some memtuples pointers that were apparently live
entries but in fact pointed to pfree'd chunks.
In most cases this sort of thing is fine because transaction
abort cleanup is not too picky about the contents of memory that
it's going to throw away anyway. There's at least one exception
though: if a Portal has a holdStore, we're going to call
tuplestore_end() on that, even during transaction abort.
So it's not cool if that tuplestore is corrupt, and that means
tuplestore.c has to be more careful.
This oversight demonstrably leads to crashes in v15 and before,
if a holdable cursor fails to persist its data due to an undersized
temp_file_limit setting. Very possibly the same thing can happen in
v16 and v17 as well, though the specific test case submitted failed
to fail there (cf. 095555daf). The failure is accidentally dodged
as of v18 because 590b045c3 got rid of tuplestore_end's retail tuple
deletion loop. Still, it seems unwise to permit tuplestores to become
internally inconsistent in any branch, so I've applied the same fix
across the board.
Since the known test case for this is rather expensive and doesn't
fail in recent branches, I've omitted it.
Bug: #19438
Reported-by: Dmitriy Kuzmin <kuzmin.db4@gmail.com>
Author: Tom Lane <tgl@sss.pgh.pa.us>
Reviewed-by: David Rowley <dgrowleyml@gmail.com>
Discussion: https://postgr.es/m/19438-9d37b179c56d43aa@postgresql.org
Backpatch-through: 14
Before the major rewrite in commit c6e0fe1f2, AllocSetFree() would
typically crash when asked to free an already-free chunk. That was
an ugly but serviceable way of detecting coding errors that led to
double pfrees. But since that rewrite, double pfrees went through
just fine, because the "hdrmask" of a freed chunk isn't changed at all
when putting it on the freelist. We'd end with a corrupt freelist
that circularly links back to the doubly-freed chunk, which would
usually result in trouble later, far removed from the actual bug.
This situation is no good at all for debugging purposes. Fortunately,
we can fix it at low cost in MEMORY_CONTEXT_CHECKING builds by making
AllocSetFree() check for chunk->requested_size == InvalidAllocSize,
relying on the pre-existing code that sets it that way just below.
I investigated the alternative of changing a freed chunk's methodid
field, which would allow detection in non-MEMORY_CONTEXT_CHECKING
builds too. But that adds measurable overhead. Seeing that we didn't
notice this oversight for more than three years, it's hard to argue
that detecting this type of bug is worth any extra overhead in
production builds.
Likewise fix AllocSetRealloc() to detect repalloc() on a freed chunk,
and apply similar changes in generation.c and slab.c. (generation.c
would hit an Assert failure anyway, but it seems best to make it act
like aset.c.) bump.c doesn't need changes since it doesn't support
pfree in the first place. Ideally alignedalloc.c would receive
similar changes, but in debugging builds it's impossible to reach
AlignedAllocFree() or AlignedAllocRealloc() on a pfreed chunk, because
the underlying context's pfree would have wiped the chunk header of
the aligned chunk. But that means we should get an error of some
sort, so let's be content with that.
Per investigation of why the test case for bug #19438 didn't appear to
fail in v16 and up, even though the underlying bug was still present.
(This doesn't fix the underlying double-free bug, just cause it to
get detected.)
Bug: #19438
Reported-by: Dmitriy Kuzmin <kuzmin.db4@gmail.com>
Author: Tom Lane <tgl@sss.pgh.pa.us>
Reviewed-by: David Rowley <dgrowleyml@gmail.com>
Discussion: https://postgr.es/m/19438-9d37b179c56d43aa@postgresql.org
Backpatch-through: 16
Functions such as hash_numeric() are not careful to use the correct
PG_RETURN_*() macro according to the return type of that function as
defined in pg_proc. Because that function is meant to return int32,
when the hashed value exceeds 2^31, the 64-bit Datum value won't wrap to
a negative number, which means the Datum won't have the same value as it
would have had it been cast to int32 on a two's complement machine. This
isn't harmless as both datum_image_eq() and datum_image_hash() may receive
a Datum that's been formed and deformed from a tuple in some cases, and
not in other cases. When formed into a tuple, the Datum value will be
coerced into an integer according to the attlen as specified by the
TupleDesc. This can result in two Datums that should be equal being
classed as not equal, which could result in (but not limited to) an error
such as:
ERROR: could not find memoization table entry
Here we fix this by ensuring we cast the Datum value to a signed integer
according to the typLen specified in the datum_image_eq/datum_image_hash
function call before comparing or hashing.
Author: David Rowley <dgrowleyml@gmail.com>
Reported-by: Tender Wang <tndrwang@gmail.com>
Backpatch-through: 14
Discussion: https://postgr.es/m/CAHewXNmcXVFdB9_WwA8Ez0P+m_TQy_KzYk5Ri5dvg+fuwjD_yw@mail.gmail.com
astreamer_tar_parser_content() sent the wrong data pointer when
forwarding MEMBER_TRAILER padding to the next streamer. After
astreamer_buffer_until() buffers the padding bytes, the 'data'
pointer has been advanced past them, but the code passed 'data'
instead of bbs_buffer.data. This caused the downstream consumer
to receive bytes from after the padding rather than the padding
itself, and could read past the end of the input buffer.
astreamer_gzip_decompressor_content() only checked for
Z_STREAM_ERROR from inflate(), silently ignoring Z_DATA_ERROR
(corrupted data) and Z_MEM_ERROR (out of memory). Fix by
treating any return other than Z_OK, Z_STREAM_END, and
Z_BUF_ERROR as fatal.
astreamer_gzip_decompressor_free() missed calling inflateEnd() to
release zlib's internal decompression state.
astreamer_tar_parser_free() neglected to pfree() the streamer
struct itself, leaking it.
astreamer_extractor_content() did not check the return value of
fclose() when closing an extracted file. A deferred write error
(e.g., disk full on buffered I/O) would be silently lost.
Discussion: https://postgr.es/m/results/98c6b630-acbb-44a7-97fa-1692ce2b827c@dunslane.net
Reviewed-By: Tom Lane <tgl@sss.pgh.pa.us>
Backpatch-through: 15
pg_stat_replication is documented to keep the last measured lag values for
a short time after the standby catches up, and then set them to NULL when
there is no WAL activity. However, previously lag values could become NULL
prematurely even while WAL activity was ongoing, especially in logical
replication.
This happened because the code cleared lag when two consecutive reply messages
indicated that the apply location had caught up with the send location.
It did not verify that the reported positions were unchanged, so lag could be
cleared even when positions had advanced between messages. In logical
replication, where the apply location often quickly catches up, this issue was
more likely to occur.
This commit fixes the issue by clearing lag only when the standby reports that
it has fully replayed WAL (i.e., both flush and apply locations have caught up
with the send location) and the write/flush/apply positions remain unchanged
across two consecutive reply messages.
The second message with unchanged positions typically results from
wal_receiver_status_interval, so lag values are cleared after that interval
when there is no activity. This avoids showing stale lag data while preventing
premature NULL values.
Even with this fix, lag may rarely become NULL during activity if identical
position reports are sent repeatedly. Eliminating such duplicate messages
would address this fully, but that change is considered too invasive for stable
branches and will be handled in master only later.
Backpatch to all supported branches.
Author: Shinya Kato <shinya11.kato@gmail.com>
Reviewed-by: Chao Li <li.evan.chao@gmail.com>
Reviewed-by: Fujii Masao <masao.fujii@gmail.com>
Discussion: https://postgr.es/m/CAOzEurTzcUrEzrH97DD7+Yz=HGPU81kzWQonKZvqBwYhx2G9_A@mail.gmail.com
Backpatch-through: 14
The check for a mismatch on the second decoded item pointer
was an exact copy of the first item pointer check, comparing
orig_itemptrs[0] with decoded_itemptrs[0] instead of orig_itemptrs[1]
with decoded_itemptrs[1]. The error message also reported (0, 1) as
the expected value instead of (blk, off). As a result, any decoding
error in the second item pointer (where the varbyte delta encoding
is exercised) would go undetected.
This has been wrong since commit bde7493d1, so backpatch to all
supported versions.
Author: Jianghua Yang <yjhjstz@gmail.com>
Discussion: https://postgr.es/m/CAAZLFmSOD8R7tZjRLZsmpKtJLoqjgawAaM-Pne1j8B_Q2aQK8w@mail.gmail.com
Backpatch-through: 14
If a CHECKPOINT record with nextMulti N is written to the WAL before
the CREATE_ID record for N, and N happens to be the first multixid on
an offset page, the backwards compatibility logic to tolerate WAL
generated by older minor versions (before commit 789d65364c) failed to
compensate for the missing XLOG_MULTIXACT_ZERO_OFF_PAGE record. In
that case, the latest_page_number was initialized at the start of WAL
replay to the page for nextMulti from the CHECKPOINT record, even if
we had not seen the CREATE_ID record for that multixid yet, which
fooled the backwards compatibility logic to think that the page was
already initialized.
To fix, track the last XLOG_MULTIXACT_ZERO_OFF_PAGE that we've seen
separately from latest_page_number. If we haven't seen any
XLOG_MULTIXACT_ZERO_OFF_PAGE records yet, use
SimpleLruDoesPhysicalPageExist() to check if the page needs to be
initialized.
Reported-by: duankunren.dkr <duankunren.dkr@alibaba-inc.com>
Analyzed-by: duankunren.dkr <duankunren.dkr@alibaba-inc.com>
Reviewed-by: Andrey Borodin <x4mmm@yandex-team.ru>
Reviewed-by: Kirill Reshke <reshkekirill@gmail.com>
Discussion: https://www.postgresql.org/message-id/c4ef1737-8cba-458e-b6fd-4e2d6011e985.duankunren.dkr@alibaba-inc.com
Backpatch-through: 14-18
Send the correct amount of data to the next astreamer, not the
whole allocated buffer size. This bug escaped detection because
in present uses the next astreamer is always a tar-file parser
which is insensitive to trailing garbage. But that may not
be true in future uses.
Author: Andrew Dunstan <andrew@dunslane.net>
Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/2178517.1774064942@sss.pgh.pa.us
Backpatch-through: 15
Commit 6eedb2a5fd made the logical walsender call
XLogFlush(GetXLogInsertRecPtr()) to ensure that all pending WAL is flushed,
fixing a publisher shutdown hang. However, if the last WAL record ends at
a page boundary, GetXLogInsertRecPtr() can return an LSN pointing past
the page header, which can cause XLogFlush() to report an error.
A similar issue previously existed in the GiST code. Commit b1f14c9672
introduced GetXLogInsertEndRecPtr(), which returns a safe WAL insertion end
location (returning the start of the page when the last record ends at a page
boundary), and updated the GiST code to use it with XLogFlush().
This commit fixes the issue by making the logical walsender use
XLogFlush(GetXLogInsertEndRecPtr()) when flushing pending WAL during shutdown.
Backpatch to all supported versions.
Reported-by: Andres Freund <andres@anarazel.de>
Author: Anthonin Bonnefoy <anthonin.bonnefoy@datadoghq.com>
Reviewed-by: Fujii Masao <masao.fujii@gmail.com>
Discussion: https://postgr.es/m/vzguaguldbcyfbyuq76qj7hx5qdr5kmh67gqkncyb2yhsygrdt@dfhcpteqifux
Backpatch-through: 14
The comment about ParallelWorkerNumbr in parallel.c says:
In parallel workers, it will be set to a value >= 0 and < the number
of workers before any user code is invoked; each parallel worker will
get a different parallel worker number.
However asserts in various places collecting instrumentation allowed
(ParallelWorkerNumber == num_workers). That would be a bug, as the value
is used as index into an array with num_workers entries.
Fixed by adjusting the asserts accordingly. Backpatch to all supported
versions.
Discussion: https://postgr.es/m/5db067a1-2cdf-4afb-a577-a04f30b69167@vondra.me
Reviewed-by: Bertrand Drouvot <bertranddrouvot.pg@gmail.com>
Backpatch-through: 14
The function used GetXLogInsertRecPtr() to generate the fake LSN. Most
of the time this is the same as what XLogInsert() would return, and so
it works fine with the XLogFlush() call. But if the last record ends at
a page boundary, GetXLogInsertRecPtr() returns LSN pointing after the
page header. In such case XLogFlush() fails with errors like this:
ERROR: xlog flush request 0/01BD2018 is not satisfied --- flushed only to 0/01BD2000
Such failures are very hard to trigger, particularly outside aggressive
test scenarios.
Fixed by introducing GetXLogInsertEndRecPtr(), returning the correct LSN
without skipping the header. This is the same as GetXLogInsertRecPtr(),
except that it calls XLogBytePosToEndRecPtr().
Initial investigation by me, root cause identified by Andres Freund.
This is a long-standing bug in gistGetFakeLSN(), probably introduced by
c6b92041d3 in PG13. Backpatch to all supported versions.
Reported-by: Peter Geoghegan <pg@bowt.ie>
Reviewed-by: Andres Freund <andres@anarazel.de>
Reviewed-by: Noah Misch <noah@leadboat.com>
Discussion: https://postgr.es/m/vf4hbwrotvhbgcnknrqmfbqlu75oyjkmausvy66ic7x7vuhafx@e4rvwavtjswo
Backpatch-through: 14
The logic of xslt_process() has never considered the fact that
xsltSaveResultToString() would return NULL for an empty string (the
upstream code has always done so, with a string length of 0). This
would cause memcpy() to be called with a NULL pointer, something
forbidden by POSIX.
Like 46ab07ffda and similar fixes, this is backpatched down to all the
supported branches, with a test case to cover this scenario. An empty
string has been always returned in xml2 in this case, based on the
history of the module, so this is an old issue.
Reported-by: Alexander Lakhin <exclusion@gmail.com>
Discussion: https://postgr.es/m/c516a0d9-4406-47e3-9087-5ca5176ebcf9@gmail.com
Backpatch-through: 14
When I (rhaas) wrote the WAL summarizer code, I incorrectly believed
that XLOG_SMGR_TRUNCATE truncates all forks to the same length. In
fact, what other parts of the code do is compute the truncation length
for the FSM and VM forks from the truncation length used for the main
fork. But, because I was confused, I coded the WAL summarizer to set the
limit block for the VM fork to the same value as for the main fork.
(Incremental backup always copies FSM forks in full, so there is no
similar issue in that case.)
Doing that doesn't directly cause any data corruption, as far as I can
see. However, it does create a serious risk of consuming a large amount
of extra disk space, because pg_combinebackup's reconstruct.c believes
that the reconstructed file should always be at least as long as the
limit block value. We might want to be smarter about that at some point
in the future, because it's always safe to omit all-zeroes blocks at the
end of the last segment of a relation, and doing so could save disk
space, but the current algorithm will rarely waste enough disk space to
worry about unless we believe that a relation has been truncated to a
length much longer than its actual length on disk, which is exactly what
happens as a result of the problem mentioned in the previous paragraph.
To fix, create a new visibilitymap helper function and use it to include
the right limit block in the summary files. Incremental backups taken
with existing summary files will still have this issue, but this should
improve the situation going forward.
Diagnosed-by: Oleg Tkachenko <oatkachenko@gmail.com>
Diagnosed-by: Amul Sul <sulamul@gmail.com>
Discussion: http://postgr.es/m/CAAJ_b97PqG89hvPNJ8cGwmk94gJ9KOf_pLsowUyQGZgJY32o9g@mail.gmail.com
Discussion: http://postgr.es/m/6897DAF7-B699-41BF-A6FB-B818FCFFD585%40gmail.com
Backpatch-through: 17
Commit 2cd40adb85 added the IF NOT EXISTS option to ALTER TABLE ADD COLUMN.
This also enabled IF NOT EXISTS for ALTER FOREIGN TABLE ADD COLUMN,
but the ALTER FOREIGN TABLE documentation was not updated to mention it.
This commit updates the documentation to describe the IF NOT EXISTS option for
ALTER FOREIGN TABLE ADD COLUMN.
While updating that section, also this commit clarifies that the COLUMN keyword
is optional in ALTER FOREIGN TABLE ADD/DROP COLUMN. Previously, part of
the documentation could be read as if COLUMN were required.
This commit adds regression tests covering these ALTER FOREIGN TABLE syntaxes.
Backpatch to all supported versions.
Suggested-by: Fujii Masao <masao.fujii@gmail.com>
Author: Chao Li <lic@highgo.com>
Reviewed-by: Robert Treat <rob@xzilla.net>
Reviewed-by: Fujii Masao <masao.fujii@gmail.com>
Discussion: https://postgr.es/m/CAHGQGwFk=rrhrwGwPtQxBesbT4DzSZ86Q3ftcwCu3AR5bOiXLw@mail.gmail.com
Backpatch-through: 14
When make_new_segment() creates an odd-sized segment, the pagemap was
only sized based on a number of usable_pages entries, forgetting that a
segment also contains metadata pages, and that the FreePageManager uses
absolute page indices that cover the entire segment. This
miscalculation could cause accesses to pagemap entries to be out of
bounds. During subsequent reuse of the allocated segment, allocations
landing on pages with indices higher than usable_pages could cause
out-of-bounds pagemap reads and/or writes. On write, 'span' pointers
are stored into the data area, corrupting the allocated objects. On
read (aka during a dsa_free), garbage is interpreted as a span pointer,
typically crashing the server in dsa_get_address().
The normal geometric path correctly sizes the pagemap for all pages in
the segment. The odd-sized path needs to do the same, but it works
forward from usable_pages rather than backward from total_size.
This commit fixes the sizing of the odd-sized case by adding pagemap
entries for the metadata pages after the initial metadata_bytes
calculation, using an integer ceiling division to compute the exact
number of additional entries needed in one go, avoiding any iteration in
the calculation.
An assertion is added in the code path for odd-sized segments, ensuring
that the pagemap includes the metadata area, and that the result is
appropriately sized.
This problem would show up depending on the size requested for the
allocation of a DSA segment. The reporter has noticed this issue when a
parallel hash join makes a DSA allocation large enough to trigger the
odd-sized segment path, but it could happen for anything that does a DSA
allocation.
A regression test is added to test_dsa, down to v17 where the test
module has been introduced. This adds a set of cheap tests to check the
problem, the new assertion being useful for this purpose. Sami has
proposed a test that took a longer time than what I have done here; the
test committed is faster and good enough to check the odd-sized
allocation path.
Author: Paul Bunn <paul.bunn@icloud.com>
Reviewed-by: Sami Imseih <samimseih@gmail.com>
Reviewed-by: Chao Li <li.evan.chao@gmail.com>
Reviewed-by: Michael Paquier <michael@paquier.xyz>
Discussion: https://postgr.es/m/044401dcabac$fe432490$fac96db0$@icloud.com
Backpatch-through: 14
Previously, when logical replication was running, shutting down
the publisher could cause the logical walsender to enter a busy loop
and prevent the publisher from completing shutdown.
During shutdown, the logical walsender waits for all pending WAL
to be written out. However, some WAL records could remain unflushed,
causing the walsender to wait indefinitely.
The issue occurred because the walsender used XLogBackgroundFlush() to
flush pending WAL. This function does not guarantee that all WAL is written.
For example, WAL generated by a transaction without an assigned
transaction ID that aborts might not be flushed.
This commit fixes the bug by making the logical walsender call XLogFlush()
instead, ensuring that all pending WAL is written and preventing
the busy loop during shutdown.
Backpatch to all supported versions.
Author: Anthonin Bonnefoy <anthonin.bonnefoy@datadoghq.com>
Reviewed-by: Alexander Lakhin <exclusion@gmail.com>
Reviewed-by: Fujii Masao <masao.fujii@gmail.com>
Discussion: https://postgr.es/m/CAO6_Xqo3co3BuUVEVzkaBVw9LidBgeeQ_2hfxeLMQcXwovB3GQ@mail.gmail.com
Backpatch-through: 14
It looks like whoever wrote the astreamer (nee bbstreamer) code
thought that pg_log_error() is equivalent to elog(ERROR), but
it's not; it just prints a message. So all these places tried to
continue on after a compression or decompression error return,
with the inevitable result being garbage output and possibly
cascading error messages. We should use pg_fatal() instead.
These error conditions are probably pretty unlikely in practice,
which no doubt accounts for the lack of field complaints.
Author: Tom Lane <tgl@sss.pgh.pa.us>
Reviewed-by: Chao Li <li.evan.chao@gmail.com>
Discussion: https://postgr.es/m/1531718.1772644615@sss.pgh.pa.us
Backpatch-through: 15
This branch missed the IsolationUsesXactSnapshot() check. That led to EPQ on
repeatable read and serializable isolation levels. This commit fixes the
issue and provides a simple isolation check for that. Backpatch through v15
where MERGE statement was introduced.
Reported-by: Tender Wang <tndrwang@gmail.com>
Discussion: https://postgr.es/m/CAPpHfdvzZSaNYdj5ac-tYRi6MuuZnYHiUkZ3D-AoY-ny8v%2BS%2Bw%40mail.gmail.com
Author: Tender Wang <tndrwang@gmail.com>
Reviewed-by: Dean Rasheed <dean.a.rasheed@gmail.com>
Backpatch-through: 15
In ALTER TABLE ... ADD/DROP COLUMN, the COLUMN keyword is optional. However,
part of the documentation could be read as if COLUMN were required, which may
mislead users about the command syntax.
This commit updates the ALTER TABLE documentation to clearly state that
COLUMN is optional for ADD and DROP.
Also this commit adds regression tests covering ALTER TABLE ... ADD/DROP
without the COLUMN keyword.
Backpatch to all supported versions.
Author: Chao Li <lic@highgo.com>
Reviewed-by: Robert Treat <rob@xzilla.net>
Reviewed-by: Fujii Masao <masao.fujii@gmail.com>
Discussion: https://postgr.es/m/CAEoWx2n6ShLMOnjOtf63TjjgGbgiTVT5OMsSOFmbjGb6Xue1Bw@mail.gmail.com
Backpatch-through: 14
This fixes a problem similar to ad8c86d22c. In this case, the test
could fail under the following circumstances:
- The primary is stopped with teardown_node(), meaning that it may not
be able to send all its WAL records to standby_1 and standby_2.
- If standby_2 receives more records than standby_1, attempting to
reconnect standby_2 to the promoted standby_1 would fail because of a
timeline fork.
This race condition is fixed with a simple trick: instead of tearing
down the primary, it is stopped cleanly so as all the WAL records of the
primary are received and flushed by both standby_1 and standby_2. Once
we do that, there is no need for a wait_for_catchup() before stopping
the node. The test wants to check that a timeline jump can be achieved
when reconnecting a standby to a promoted standby in the same cluster,
hence an immediate stop of the primary is not required.
This failure is harder to reach than the previous instability of
009_twophase, still the buildfarm has been able to detect this failure
at least once. I have tried Alexander Lakhin's test trick with the
bgwriter and very aggressive standby snapshots, but I could not
reproduce it directly. It is reachable, as the buildfarm has proved.
Backpatch down to all supported branches, and this problem can lead to
spurious failures in the buildfarm.
Discussion: https://postgr.es/m/493401a8-063f-436a-8287-a235d9e065fc@gmail.com
Backpatch-through: 14
The code path in astreamer_lz4_decompressor_content() that updated
the output pointers when the output buffer isn't full was wrong.
It advanced next_out by bytes_written, which could include previous
decompression output not just that of the current cycle. The
correct amount to advance is out_size. While at it, make the
output pointer updates look more like the input pointer updates.
This bug is pretty hard to reach, as it requires consecutive
compression frames that are too small to fill the output buffer.
pg_dump could have produced such data before 66ec01dc4, but
I'm unsure whether any files we use astreamer with would be
likely to contain problematic data.
Author: Chao Li <lic@highgo.com>
Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/0594CC79-1544-45DD-8AA4-26270DE777A7@gmail.com
Backpatch-through: 15
The phase of the test where we want to check that 2PC transactions
prepared on a primary can be committed on a promoted standby relied on
an immediate stop of the primary. This logic has a race condition: it
could be possible that some records (most likely standby snapshot
records) are generated on the primary before it finishes its shutdown,
without the promoted standby know about them. When the primary is
recycled as new standby, the test could fail because of a timeline fork
as an effect of these extra records.
This fix takes care of the instability by doing a clean stop of the
primary instead of a teardown (aka immediate stop), so as all records
generated on the primary are sent to the promoted standby and flushed
there. There is no need for a teardown of the primary in this test
scenario: the commit of 2PC transactions on a promoted standby do not
care about the state of the primary, only of the standby.
This race is very hard to hit in practice, even slow buildfarm members
like skink have a very low rate of reproduction. Alexander Lakhin has
come up with a recipe to improve the reproduction rate a lot:
- Enable -DWAL_DEBUG.
- Patch the bgwriter so as standby snapshots are generated every
milliseconds.
- Run 009_twophase tests under heavy parallelism.
With this method, the failure appears after a couple of iterations.
With the fix in place, I have been able to run more than 50 iterations
of the parallel test sequence, without seeing a failure.
Issue introduced in 30820982b2, due to a copy-pasto coming from the
surrounding tests. Thanks also to Hayato Kuroda for digging into the
details of the failure. He has proposed a fix different than the one of
this commit. Unfortunately, it relied on injection points, feature only
available in v17. The solution of this commit is simpler, and can be
applied to v14~v16.
Reported-by: Alexander Lakhin <exclusion@gmail.com>
Discussion: https://postgr.es/m/b0102688-6d6c-c86a-db79-e0e91d245b1a@gmail.com
Backpatch-through: 14
Clarify the documentation of COMMENT ON to state that specifying an empty
string is treated as NULL, meaning that the comment is removed.
This makes the behavior explicit and avoids possible confusion about how
empty strings are handled.
Also adds regress test cases that use empty string to remove a comment.
Backpatch to all supported versions.
Author: Chao Li <lic@highgo.com>
Reviewed-by: Ashutosh Bapat <ashutosh.bapat.oss@gmail.com>
Reviewed-by: David G. Johnston <david.g.johnston@gmail.com>
Reviewed-by: Shengbin Zhao <zshengbin91@gmail.com>
Reviewed-by: Jim Jones <jim.jones@uni-muenster.de>
Reviewed-by: zhangqiang <zhang_qiang81@163.com>
Reviewed-by: Fujii Masao <masao.fujii@gmail.com>
Discussion: https://postgr.es/m/26476097-B1C1-4BA8-AA92-0AD0B8EC7190@gmail.com
Backpatch-through: 14
Presently, the GUC check hook for basic_archive.archive_directory
checks that the specified directory exists. Consequently, if the
directory does not exist at server startup, archiving will be stuck
indefinitely, even if it appears later. To fix, remove this check
from the hook so that archiving will resume automatically once the
directory is present. basic_archive must already be prepared to
deal with the directory disappearing at any time, so no additional
special handling is required.
Reported-by: Олег Самойлов <splarv@ya.ru>
Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Reviewed-by: Fujii Masao <masao.fujii@gmail.com>
Reviewed-by: Sergei Kornilov <sk@zsrv.org>
Discussion: https://postgr.es/m/73271769675212%40mail.yandex.ru
Backpatch-through: 15
Commit ab355e3a88 changed how the OldestMemberMXactId array is
indexed. It's no longer indexed by synthetic dummyBackendId, but with
ProcNumber. The PGPROC entries for prepared xacts come after auxiliary
processes in the allProcs array, which rendered the calculation for
MaxOldestSlot and the indexes into the array incorrect. (The
OldestVisibleMXactId array is not used for prepared xacts, and thus
never accessed with ProcNumber's greater than MaxBackends, so this
only affects the OldestMemberMXactId array.)
As a result, a prepared xact would store its value past the end of the
OldestMemberMXactId array, overflowing into the OldestVisibleMXactId
array. That could cause a transaction's row lock to appear invisible
to other backends, or other such visibility issues. With a very small
max_connections setting, the store could even go beyond the
OldestVisibleMXactId array, stomping over the first element in the
BufferDescriptor array.
To fix, calculate the array sizes more precisely, and introduce helper
functions to calculate the array indexes correctly.
Author: Yura Sokolov <y.sokolov@postgrespro.ru>
Reviewed-by: Sami Imseih <samimseih@gmail.com>
Reviewed-by: Chao Li <li.evan.chao@gmail.com>
Discussion: https://www.postgresql.org/message-id/7acc94b0-ea82-4657-b1b0-77842cb7a60c@postgrespro.ru
Backpatch-through: 17
In commits 29d75b25b et al, I made pg_dumpall's dumpRoleMembership
logic treat a dangling grantor OID the same as dangling role and
member OIDs: print a warning and skip emitting the GRANT. This wasn't
terribly well thought out; instead, we should handle the case by
emitting the GRANT without the GRANTED BY clause. When the source
database is pre-v16, such cases are somewhat expected because those
versions didn't prevent dropping the grantor role; so don't even
print a warning that we did this. (This change therefore restores
pg_dumpall's pre-v16 behavior for these cases.) The case is not
expected in >= v16, so then we do print a warning, but soldiering on
with no GRANTED BY clause still seems like a reasonable strategy.
Per complaint from Robert Haas that we were now dropping GRANTs
altogether in easily-reachable scenarios.
Reported-by: Robert Haas <robertmhaas@gmail.com>
Author: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/CA+TgmoauoiW4ydDhdrseg+DD4Kwha=+TSZp18BrJeHKx3o1Fdw@mail.gmail.com
Backpatch-through: 16
This commit adds a new test module called "test_custom_types", that can
be used to stress code paths related to custom data type
implementations.
Currently, this is used as a test suite to validate the set of fixes
done in 3b7a6fa157, that requires some typanalyze callbacks that can
force very specific backend behaviors, as of:
- typanalyze callback that returns "false" as status, to mark a failure
in computing statistics.
- typanalyze callback that returns "true" but let's the backend know
that no interesting stats could be computed, with stats_valid set to
"false".
This could be extended more in the future if more problems are found.
For simplicity, the module uses a fake int4 data type, that requires a
btree operator class to be usable with extended statistics. The type is
created by the extension, and its properties are altered in the test.
Like 3b7a6fa157, this module is backpatched down to v14, for coverage
purposes.
Author: Michael Paquier <michael@paquier.xyz>
Reviewed-by: Chao Li <li.evan.chao@gmail.com>
Discussion: https://postgr.es/m/aaDrJsE1I5mrE-QF@paquier.xyz
Backpatch-through: 14
This commit addresses two defects regarding extended statistics on
expressions:
- When building extended statistics in lookup_var_attr_stats(), the call
to examine_attribute() did not account for the possibility of a NULL
return value. This can happen depending on the behavior of a typanalyze
callback — for example, if the callback returns false, if no rows are
sampled, or if no statistics are computed. In such cases, the code
attempted to build MCV, dependency, and ndistinct statistics using a
NULL pointer, incorrectly assuming valid statistics were available,
which could lead to a server crash.
- When loading extended statistics for expressions,
statext_expressions_load() did not account for NULL entries in the
pg_statistic array storing expression statistics. Such NULL entries can
be generated when statistics collection fails for an expression, as may
occur during the final step of serialize_expr_stats(). An extended
statistics object defining N expressions requires N corresponding
elements in the pg_statistic array stored for the expressions, and some
of these elements can be NULL. This situation is reachable when a
typanalyze callback returns true, but sets stats_valid to indicate that
no useful statistics could be computed.
While these scenarios cannot occur with in-core typanalyze callbacks, as
far as I have analyzed, they can be triggered by custom data types with
custom typanalyze implementations, at least.
No tests are added in this commit. A follow-up commit will introduce a
test module that can be extended to cover similar edge cases if
additional issues are discovered. This takes care of the core of the
problem.
Attribute and relation statistics already offer similar protections:
- ANALYZE detects and skips the build of invalid statistics.
- Invalid catalog data is handled defensively when loading statistics.
This issue exists since the support for extended statistics on
expressions has been added, down to v14 as of a4d75c86bf. Backpatch
to all supported stable branches.
Author: Michael Paquier <michael@paquier.xyz>
Reviewed-by: Corey Huinker <corey.huinker@gmail.com>
Reviewed-by: Chao Li <li.evan.chao@gmail.com>
Discussion: https://postgr.es/m/aaDrJsE1I5mrE-QF@paquier.xyz
Backpatch-through: 14
Commit 84d5efa7e3 missed some multibyte issues caused by short-circuit
logic in the callers. The callers assumed that if the predicate string
is longer than the label string, then it couldn't possibly be a match,
but it can be when using case-insensitive matching (LVAR_INCASE) if
casefolding changes the byte length.
Fix by refactoring to get rid of the short-circuit logic as well as
the function pointer, and consolidate the logic in a replacement
function ltree_label_match().
Discussion: https://postgr.es/m/02c6ef6cf56a5013ede61ad03c7a26affd27d449.camel@j-davis.com
Backpatch-through: 14
Otherwise, this would break if using C and C++ compilers from
different families and they understand different options. It already
used the right flags for compiling, this is only for linking. Also,
the meson setup already did this correctly.
Back-patch of v18 commit 365b5a345 into older supported branches.
At the time we were only aware of trouble in v18, but as shown
by buildfarm member siren, older branches can hit the problem too.
Reported-by: Tom Lane <tgl@sss.pgh.pa.us>
Author: Peter Eisentraut <peter@eisentraut.org>
Discussion: https://www.postgresql.org/message-id/228700.1722717983@sss.pgh.pa.us
Discussion: https://postgr.es/m/3109540.1771698685@sss.pgh.pa.us
Backpatch-through: 14-17
Settings that ran the new test euc_kr.sql to completion would fail these
older src/pl tests. Use alternative expected outputs, for which psql
\gset and \if have reduced the maintenance burden. This fixes
"LANG=ko_KR.euckr LC_MESSAGES=C make check-world". (LC_MESSAGES=C fixes
IO::Pty usage in tests 010_tab_completion and 001_password.) That file
is new in commit c67bef3f32. Back-patch
to v14, like that commit.
Discussion: https://postgr.es/m/20260217184758.da.noahmisch@microsoft.com
Backpatch-through: 14
The documentation for the INCLUDING COMMENTS option of the LIKE clause
in CREATE TABLE was inaccurate and incomplete. It stated that comments for
copied columns, constraints, and indexes are copied, but regarding comments
on constraints in reality only comments on CHECK and NOT NULL constraints
are copied; comments on other constraints (such as primary keys) are not.
In addition, comments on extended statistics are copied, but this was not
documented.
The CREATE FOREIGN TABLE documentation had a similar omission: comments
on extended statistics are also copied, but this was not mentioned.
This commit updates the documentation to clarify the actual behavior.
The CREATE TABLE reference now specifies that comments on copied columns,
CHECK constraints, NOT NULL constraints, indexes, and extended statistics are
copied. The CREATE FOREIGN TABLE reference now notes that comments on
extended statistics are copied as well.
Backpatch to all supported versions. Documentation updates related to
CREATE FOREIGN TABLE LIKE and NOT NULL constraint comment copying are
not applied to v17 and earlier, since those features were introduced in v18.
Author: Fujii Masao <masao.fujii@gmail.com>
Reviewed-by: Matheus Alcantara <matheusssilv97@gmail.com>
Discussion: https://postgr.es/m/CAHGQGwHSOSGcaYDvHF8EYCUCfGPjbRwGFsJ23cx5KbJ1X6JouQ@mail.gmail.com
Backpatch-through: 14
Previously, when one process woke another that was waiting on a lock,
ProcWakeup() incorrectly cleared its own waitStart field (i.e.,
MyProc->waitStart) instead of that of the process being awakened.
As a result, the awakened process retained a stale lock-wait start timestamp.
This did not cause user-visible issues. pg_locks.waitstart was reported as
NULL for the awakened process (i.e., when pg_locks.granted is true),
regardless of the waitStart value.
This bug was introduced by commit 46d6e5f567.
This commit fixes this by resetting the waitStart field of the process
being awakened in ProcWakeup().
Backpatch to all supported branches.
Reported-by: Chao Li <li.evan.chao@gmail.com>
Author: Chao Li <li.evan.chao@gmail.com>
Reviewed-by: ji xu <thanksgreed@gmail.com>
Reviewed-by: Álvaro Herrera <alvherre@kurilemu.de>
Discussion: https://postgr.es/m/537BD852-EC61-4D25-AB55-BE8BE46D07D7@gmail.com
Backpatch-through: 14
Although clang claims to be compatible with gcc's printf format
archetypes, this appears to be a falsehood: it likes __syslog__
(which gcc does not, on most platforms) and doesn't accept
gnu_printf. This means that if you try to use gcc with clang++
or clang with g++, you get compiler warnings when compiling
printf-like calls in our C++ code. This has been true for quite
awhile, but it's gotten more annoying with the recent appearance
of several buildfarm members that are configured like this.
To fix, run separate probes for the format archetype to use with the
C and C++ compilers, and conditionally define PG_PRINTF_ATTRIBUTE
depending on __cplusplus.
(We could alternatively insist that you not mix-and-match C and
C++ compilers; but if the case works otherwise, this is a poor
reason to insist on that.)
This commit back-patches 0909380e4 into supported branches.
Discussion: https://postgr.es/m/986485.1764825548@sss.pgh.pa.us
Discussion: https://postgr.es/m/3988414.1771950285@sss.pgh.pa.us
Backpatch-through: 14-18
Newest versions of gcc+glibc are able to detect cases where code
implicitly casts away const by assigning the result of strchr() or
a similar function applied to a "const char *" value to a target
variable that's just "char *". This of course creates a hazard of
not getting a compiler warning about scribbling on a string one was
not supposed to, so fixing up such cases is good.
This patch fixes a dozen or so places where we were doing that.
Most are trivial additions of "const" to the target variable,
since no actually-hazardous change was occurring.
Thanks to Bertrand Drouvot for finding a couple more spots than
I had.
This commit back-patches relevant portions of 8f1791c61 and
9f7565c6c into supported branches. However, there are two
places in ecpg (in v18 only) where a proper fix is more
complicated than seems appropriate for a back-patch. I opted
to silence those two warnings by adding casts.
Author: Tom Lane <tgl@sss.pgh.pa.us>
Reviewed-by: Bertrand Drouvot <bertranddrouvot.pg@gmail.com>
Discussion: https://postgr.es/m/1324889.1764886170@sss.pgh.pa.us
Discussion: https://postgr.es/m/3988414.1771950285@sss.pgh.pa.us
Backpatch-through: 14-18
The test added by commit 4b760a181 assumed that a table's physical
row order would be predictable after an UPDATE. But a non-heap table
AM might produce some other order. Even with heap AM, the assumption
seems risky; compare a3fd53bab for instance. Adding an ORDER BY is
cheap insurance and doesn't break any goal of the test.
Author: Pavel Borisov <pashkin.elfe@gmail.com>
Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/CALT9ZEHcE6tpvumScYPO6pGk_ASjTjWojLkodHnk33dvRPHXVw@mail.gmail.com
Backpatch-through: 14
The grease patch in 4966bd3ed found its first problem: prior to the
February 2018 patch releases, no server knew how to negotiate protocol
versions, so pg_upgrade needs to take that into account when speaking to
those older servers.
This will be true even after the grease feature is reverted; we don't
need anyone to trip over this again in the future. Backpatch so that all
supported versions of pg_upgrade can gracefully handle an update to the
default protocol version. (This is needed for any distributions that
link older binaries against newer libpqs, such as Debian.) Branches
prior to 18 need an additional version check, for the existence of
max_protocol_version.
Per buildfarm member crake.
Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/CAOYmi%2B%3D4QhCjssfNEoZVK8LPtWxnfkwT5p-PAeoxtG9gpNjqOQ%40mail.gmail.com
Backpatch-through: 14