postgresql

mirror of https://github.com/postgres/postgres.git synced 2026-07-12 11:05:31 -04:00

Author	SHA1	Message	Date
David Rowley	5e02f92d9e	Doc: add missing punctuation Author: Daisuke Higuchi <higuchi.daisuke11@gmail.com> Reviewed-by: Robert Treat <rob@xzilla.net> Discussion: https://postgr.es/m/CAEVT6c-yWYstu76YZ7VOxmij2XA8vrOEvens08QLmKHTDjEPBw@mail.gmail.com Backpatch-through: 14	2026-01-04 21:14:11 +13:00
David Rowley	54f82c4aae	Fix selectivity estimation integer overflow in contrib/intarray This fixes a poorly written integer comparison function which was performing subtraction in an attempt to return a negative value when a < b and a positive value when a > b, and 0 when the values were equal. Unfortunately that didn't always work correctly due to two's complement having the INT_MIN 1 further from zero than INT_MAX. This could result in an overflow and cause the comparison function to return an incorrect result, which would result in the binary search failing to find the value being searched for. This could cause poor selectivity estimates when the statistics stored the value of INT_MAX (2147483647) and the value being searched for was large enough to result in the binary search doing a comparison with that INT_MAX value. Author: Chao Li <li.evan.chao@gmail.com> Reviewed-by: David Rowley <dgrowleyml@gmail.com> Discussion: https://postgr.es/m/CAEoWx2ng1Ot5LoKbVU-Dh---dFTUZWJRH8wv2chBu29fnNDMaQ@mail.gmail.com Backpatch-through: 14	2026-01-04 20:34:01 +13:00
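The overflow described in `54f82c4aae` can be illustrated with a minimal sketch. This is not the intarray code: `compare_int32` and `find_int32` are hypothetical names, and the comparator shown is simply one common overflow-free idiom, assumed here for illustration.

```c
#include <stdint.h>

/*
 * Hazardous pattern (shown only as a comment): "return a - b;".
 * For a = INT32_MIN and b = 1 the mathematical result does not fit in
 * int32_t, so the subtraction overflows and the sign can come out wrong.
 */

/* Overflow-free three-way comparison: returns -1, 0, or 1. */
static int
compare_int32(int32_t a, int32_t b)
{
    return (a > b) - (a < b);
}

/*
 * Binary search over a sorted array using the safe comparator.
 * Hypothetical helper; returns the index of key, or -1 if absent.
 */
static int
find_int32(const int32_t *values, int nvalues, int32_t key)
{
    int lo = 0;
    int hi = nvalues - 1;

    while (lo <= hi)
    {
        int mid = lo + (hi - lo) / 2;
        int cmp = compare_int32(values[mid], key);

        if (cmp == 0)
            return mid;
        if (cmp < 0)
            lo = mid + 1;
        else
            hi = mid - 1;
    }
    return -1;
}
```

Because the comparator never subtracts, a sorted array containing both INT32_MIN and INT32_MAX can be searched without a wrong-signed result steering the search away from the target.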
Bruce Momjian	3e77e944e7	Update copyright for 2026 Backpatch-through: 14	2026-01-01 13:24:10 -05:00
Thomas Munro	130b001c15	jit: Fix jit_profiling_support when unavailable. jit_profiling_support=true captures profile data for Linux perf. On other platforms, LLVMCreatePerfJITEventListener() returns NULL and the attempt to register the listener would crash. Fix by ignoring the setting in that case. The documentation already says that it only has an effect if perf support is present, and we already did the same for older LLVM versions that lacked support. No field reports, unsurprisingly for an obscure developer-oriented setting. Noticed in passing while working on commit `1a28b4b4`. Backpatch-through: 14 Reviewed-by: Matheus Alcantara <matheusssilv97@gmail.com> Reviewed-by: Andres Freund <andres@anarazel.de> Discussion: https://postgr.es/m/CA%2BhUKGJgB6gvrdDohgwLfCwzVQm%3DVMtb9m0vzQn%3DCwWn-kwG9w%40mail.gmail.com	2025-12-31 14:54:10 +13:00
Masahiko Sawada	8214667226	Fix a race condition in updating procArray->replication_slot_xmin. Previously, ReplicationSlotsComputeRequiredXmin() computed the oldest xmin across all slots without holding ProcArrayLock (when already_locked is false), acquiring the lock just before updating the replication slot xmin. This could lead to a race condition: if a backend creates a new slot and updates the global replication slot xmin, another backend concurrently running ReplicationSlotsComputeRequiredXmin() could overwrite that update with an invalid or stale value. This happens because the concurrent backend might have computed the aggregate xmin before the new slot was accounted for, but applied the update after the new slot had already updated the global value. In the reported failure, a walsender for an apply worker computed InvalidTransactionId as the oldest xmin and overwrote a valid replication slot xmin value computed by a walsender for a tablesync worker. Consequently, the tablesync worker computed a transaction ID via GetOldestSafeDecodingTransactionId() effectively without considering the replication slot xmin. This led to the error "cannot build an initial slot snapshot as oldest safe xid %u follows snapshot's xmin %u", which was an assertion failure prior to commit `240e0dbacd`. To fix this, we acquire ReplicationSlotControlLock in exclusive mode during slot creation to perform the initial update of the slot xmin. In ReplicationSlotsComputeRequiredXmin(), we hold ReplicationSlotControlLock in shared mode until the global slot xmin is updated in ProcArraySetReplicationSlotXmin(). This prevents concurrent computations and updates of the global xmin by other backends during the initial slot xmin update process, while still permitting concurrent calls to ReplicationSlotsComputeRequiredXmin(). Backpatch to all supported versions. 
Author: Zhijie Hou <houzj.fnst@fujitsu.com> Reviewed-by: Masahiko Sawada <sawada.mshk@gmail.com> Reviewed-by: Amit Kapila <amit.kapila16@gmail.com> Reviewed-by: Pradeep Kumar <spradeepkumar29@gmail.com> Reviewed-by: Hayato Kuroda (Fujitsu) <kuroda.hayato@fujitsu.com> Reviewed-by: Robert Haas <robertmhaas@gmail.com> Reviewed-by: Andres Freund <andres@anarazel.de> Reviewed-by: Chao Li <li.evan.chao@gmail.com> Discussion: https://postgr.es/m/CAA4eK1L8wYcyTPxNzPGkhuO52WBGoOZbT0A73Le=ZUWYAYmdfw@mail.gmail.com Backpatch-through: 14	2025-12-30 10:56:23 -08:00
Thomas Munro	dfb9ff5904	jit: Remove -Wno-deprecated-declarations in 18+. REL_18_STABLE and master have commit `ee485912`, so they always use the newer LLVM opaque pointer functions. Drop -Wno-deprecated-declarations (commit `a56e7b660`) for code under jit/llvm in those branches, to catch any new deprecation warnings that arrive in future versions of LLVM. Older branches continued to use functions marked deprecated in LLVM 14 and 15 (i.e., they switched to the newer functions only for LLVM 16+), as a precaution against unforeseen compatibility problems with bitcode already shipped. In those branches, the comment about warning suppression is updated to explain that situation better. In theory we could suppress warnings only for LLVM 14 and 15 specifically, but that isn't done here. Backpatch-through: 14 Reported-by: Tom Lane <tgl@sss.pgh.pa.us> Discussion: https://postgr.es/m/1407185.1766682319%40sss.pgh.pa.us	2025-12-30 14:32:30 +13:00
Thomas Munro	4b9ce1ef60	ci: Test Windows + Mkvcbuild.pm in REL_16_STABLE. * REL_15_STABLE introduced CI and tested Windows with Mkvcbuild.pm. * REL_16_STABLE introduced Meson and switched Windows CI to that. * REL_17_STABLE dropped Mkvcbuild.pm. That left a blind spot when testing Makefile changes back-patched into 16. Mkvcbuild.pm scrapes Makefiles and might break, so it's useful to be able to check that before hitting "hamerkop" in the build farm. Copy REL_15_STABLE's Windows task into REL_16_STABLE as a separate task, with a few small adjustments to match later task definition style. Discussion: https://postgr.es/m/CA%2BhUKG%2B-d0OyLMdMiZ%2BFtj2hhZXT%2B0HOyHfrPBecE_vZzh9rRg%40mail.gmail.com	2025-12-29 15:56:26 +13:00
Thomas Munro	80e8ec772b	Fix Mkvcbuild.pm builds of test_cloexec.c. Mkvcbuild.pm scrapes Makefile contents, but couldn't understand the change made by commit `bec2a0aa`. Revealed by BF animal hamerkop in branch REL_16_STABLE. 1. It used += instead of =, which didn't match the pattern that Mkvcbuild.pm looks for. Drop the +. 2. Mkvcbuild.pm doesn't link PROGRAM executables with libpgport. Apply a local workaround to REL_16_STABLE only (later branches dropped Mkvcbuild.pm). Backpatch-through: 16 Reported-by: Tom Lane <tgl@sss.pgh.pa.us> Discussion: https://postgr.es/m/175163.1766357334%40sss.pgh.pa.us	2025-12-29 15:31:39 +13:00
Michael Paquier	c48829ed83	Fix pg_stat_get_backend_activity() to use multi-byte truncated result pg_stat_get_backend_activity() calls pgstat_clip_activity() to ensure that the reported query string is correctly truncated when it finishes with an incomplete multi-byte sequence. However, the result returned by the function was not what pgstat_clip_activity() generated, but the non-truncated, original, contents from PgBackendStatus.st_activity_raw. Oversight in `54b6cd589a`, so backpatch all the way down. Author: Chao Li <li.evan.chao@gmail.com> Discussion: https://postgr.es/m/CAEoWx2mDzwc48q2EK9tSXS6iJMJ35wvxNQnHX+rXjy5VgLvJQw@mail.gmail.com Backpatch-through: 14	2025-12-27 17:23:54 +09:00
Michael Paquier	82a923bc61	doc: Remove duplicate word in ECPG description Author: Laurenz Albe <laurenz.albe@cybertec.at> Reviewed-by: vignesh C <vignesh21@gmail.com> Discussion: https://postgr.es/m/d6d6a800f8b503cd78d5f4fa721198e40eec1677.camel@cybertec.at Backpatch-through: 14	2025-12-26 15:26:06 +09:00
Amit Kapila	63a65adf4d	Don't advance origin during apply failure. The logical replication parallel apply worker could incorrectly advance the origin progress during an error or failed apply. This behavior risks transaction loss because such transactions will not be resent by the server. Commit `3f28b2fcac` addressed a similar issue for both the apply worker and the table sync worker by registering a before_shmem_exit callback to reset origin information. This prevents the worker from advancing the origin during transaction abortion on shutdown. This patch applies the same fix to the parallel apply worker, ensuring consistent behavior across all worker types. As with `3f28b2fcac`, we are backpatching through version 16, since parallel apply mode was introduced there and the issue only occurs when changes are applied before the transaction end record (COMMIT or ABORT) is received. Author: Hou Zhijie <houzj.fnst@fujitsu.com> Reviewed-by: Chao Li <li.evan.chao@gmail.com> Reviewed-by: Amit Kapila <amit.kapila16@gmail.com> Backpatch-through: 16 Discussion: https://postgr.es/m/TY4PR01MB169078771FB31B395AB496A6B94B4A@TY4PR01MB16907.jpnprd01.prod.outlook.com Discussion: https://postgr.es/m/TYAPR01MB5692FAC23BE40C69DA8ED4AFF5B92@TYAPR01MB5692.jpnprd01.prod.outlook.com	2025-12-24 04:19:57 +00:00
Heikki Linnakangas	7efef18ffc	Fix bug in following update chain when locking a heap tuple After waiting for a concurrent updater to finish, heap_lock_tuple() followed the update chain to lock all tuple versions. However, when stepping from the initial tuple to the next one, it failed to check that the next tuple's XMIN matches the initial tuple's XMAX. That's an important check whenever following an update chain, and the recursive part that follows the chain did it, but the initial step missed it. Without the check, if the updating transaction aborts, the updated tuple is vacuumed away and replaced by an unrelated tuple, the unrelated tuple might get incorrectly locked. Author: Jasper Smit <jasper.smit@servicenow.com> Discussion: https://www.postgresql.org/message-id/CAOG+RQ74x0q=kgBBQ=mezuvOeZBfSxM1qu_o0V28bwDz3dHxLw@mail.gmail.com Backpatch-through: 14	2025-12-23 13:37:27 +02:00
Tom Lane	ebd5696166	Add missing .gitignore for src/test/modules/test_cloexec.	2025-12-23 15:00:16 +09:00
Michael Paquier	e22e9ab0cd	Fix orphaned origin in shared memory after DROP SUBSCRIPTION Since `ce0fdbfe97`, a replication slot and an origin are created by each tablesync worker, whose information is stored in both a catalog and shared memory (once the origin is set up in the latter case). The transaction where the origin is created is the same as the one that runs the initial COPY, with the catalog state of the origin becoming visible for other sessions only once the COPY transaction has committed. The catalog state is coupled with a state in shared memory, initialized at the same time as the origin created in the catalogs. Note that the transaction doing the initial data sync can take a long time, time that depends on the amount of data to transfer from a publication node to its subscriber node. Now, when a DROP SUBSCRIPTION is executed, all its workers are stopped with the origins removed. The removal of each origin relies on a catalog lookup. A worker still running the initial COPY would fail its transaction, with the catalog state of the origin rolled back while the shared memory state remains around. The session running the DROP SUBSCRIPTION should be in charge of cleaning up the catalog and the shared memory state, but as there is no data in the catalogs the shared memory state is not removed. This issue would leave orphaned origin data in shared memory, leading to a confusing state as it would still show up in pg_replication_origin_status. Note that this shared memory data is sticky, being flushed on disk in replorigin_checkpoint at checkpoint. This prevents other origins from reusing a slot position in the shared memory data. To address this problem, the commit moves the creation of the origin at the end of the transaction that precedes the one executing the initial COPY, making the origin immediately visible in the catalogs for other sessions, giving DROP SUBSCRIPTION a way to know about it. 
A different solution would have been to clean up the shared memory state using an abort callback within the tablesync worker. The solution of this commit is more consistent with the apply worker that creates an origin in a short transaction. A test is added in the subscription test 004_sync.pl, which was able to display the problem. The test fails when this commit is reverted. Reported-by: Tenglong Gu <brucegu@amazon.com> Reported-by: Daisuke Higuchi <higudai@amazon.com> Analyzed-by: Michael Paquier <michael@paquier.xyz> Author: Hou Zhijie <houzj.fnst@fujitsu.com> Reviewed-by: Amit Kapila <amit.kapila16@gmail.com> Reviewed-by: Masahiko Sawada <sawada.mshk@gmail.com> Discussion: https://postgr.es/m/aUTekQTg4OYnw-Co@paquier.xyz Backpatch-through: 14	2025-12-23 14:32:22 +09:00
Thomas Munro	b1316b78f8	Fix printf format string warning on MinGW. This is a back-patch of `1319997d` to branches 14-17 to fix an old warning about a printf type mismatch on MinGW, in anticipation of a potential expansion of the scope of CI's CompilerWarnings checks. Though CI began in 15, BF animal fairywren also shows the warning in 14, so we might as well fix that too. Original commit message (except for new "Backpatch-through" tag): Commit `517bf2d91` changed a printf format string to placate MinGW, which at the time warned about "%lld". Current MinGW is now warning about the replacement "%I64d". Reverting the change clears the warning on the MinGW CI task, and hopefully it will clear it on build farm animal fairywren too. Backpatch-through: 14-17 Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us> Reported-by: "Hayato Kuroda (Fujitsu)" <kuroda.hayato@fujitsu.com> Discussion: https://postgr.es/m/TYAPR01MB5866A71B744BE01B3BF71791F5AEA%40TYAPR01MB5866.jpnprd01.prod.outlook.com	2025-12-21 21:14:54 +13:00
Thomas Munro	0666ccc16c	Clean up test_cloexec.c and Makefile. An unused variable caused a compiler warning on BF animal fairywren, an snprintf() call was redundant, and some buffer sizes were inconsistent. Per code review from Tom Lane. The Makefile's test ifeq ($(PORTNAME), win32) never succeeded due to a circularity, so only Meson builds were actually compiling the new test code, partially explaining why CI didn't tell us about the warning sooner (the other problem being that CompilerWarnings only makes world-bin, a problem for another commit). Simplify. Backpatch-through: 16, like commit `c507ba55` Author: Bryan Green <dbryan.green@gmail.com> Co-authored-by: Thomas Munro <tmunro@gmail.com> Reported-by: Tom Lane <tgl@sss.pgh.pa.us> Discussion: https://postgr.es/m/1086088.1765593851%40sss.pgh.pa.us	2025-12-21 17:36:04 +13:00
Fujii Masao	3853f61681	Add guard to prevent recursive memory context logging. Previously, if memory context logging was triggered repeatedly and rapidly while a previous request was still being processed, it could result in recursive calls to ProcessLogMemoryContextInterrupt(). This could lead to infinite recursion and potentially crash the process. This commit adds a guard to prevent such recursion. If ProcessLogMemoryContextInterrupt() is already in progress and logging memory contexts, subsequent calls will exit immediately, avoiding unintended recursive calls. While this scenario is unlikely in practice, it's not impossible. This change adds a safety check to prevent such failures. Back-patch to v14, where memory context logging was introduced. Reported-by: Robert Haas <robertmhaas@gmail.com> Author: Fujii Masao <masao.fujii@gmail.com> Reviewed-by: Atsushi Torikoshi <torikoshia@oss.nttdata.com> Reviewed-by: Robert Haas <robertmhaas@gmail.com> Reviewed-by: Artem Gavrilov <artem.gavrilov@percona.com> Discussion: https://postgr.es/m/CA+TgmoZMrv32tbNRrFTvF9iWLnTGqbhYSLVcrHGuwZvCtph0NA@mail.gmail.com Backpatch-through: 14	2025-12-19 12:08:20 +09:00
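The guard described in `3853f61681` follows a familiar re-entrancy pattern, sketched below under stated assumptions. All names here (`process_log_request`, `emit_memory_stats`, `logging_in_progress`, `log_passes`) are hypothetical and do not come from the PostgreSQL source; the nested call merely simulates a second request arriving while the first is still being processed.

```c
#include <stdbool.h>

static int  log_passes = 0;               /* times the logging body actually ran */
static bool logging_in_progress = false;  /* re-entrancy guard */

static void process_log_request(int nested_requests);

/* Stand-in for the real logging work; may trigger further requests. */
static void
emit_memory_stats(int nested_requests)
{
    log_passes++;

    /* Simulate another request arriving while we are still logging. */
    if (nested_requests > 0)
        process_log_request(nested_requests - 1);
}

/* Entry point: returns immediately if a logging pass is already running. */
static void
process_log_request(int nested_requests)
{
    if (logging_in_progress)
        return;

    logging_in_progress = true;
    emit_memory_stats(nested_requests);
    logging_in_progress = false;
}
```

A production version would also have to clear the flag on error paths so that a failed pass does not disable logging permanently; this sketch ignores error handling.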
Heikki Linnakangas	a5277700e4	Do not emit WAL for unlogged BRIN indexes Operations on unlogged relations should not be WAL-logged. The brin_initialize_empty_new_buffer() function didn't get the memo. The function is only called when a concurrent update to a brin page uses up space that we're just about to insert to, which makes it pretty hard to hit. If you do manage to hit it, a full-page WAL record is erroneously emitted for the unlogged index. If you then crash, crash recovery will fail on that record with an error like this: FATAL: could not create file "base/5/32819": File exists Author: Kirill Reshke <reshkekirill@gmail.com> Discussion: https://www.postgresql.org/message-id/CALdSSPhpZXVFnWjwEBNcySx_vXtXHwB2g99gE6rK0uRJm-3GgQ@mail.gmail.com Backpatch-through: 14	2025-12-18 15:09:21 +02:00
Noah Misch	2655d2e478	Update .abi-compliance-history for PrepareToInvalidateCacheTuple(). Commit `0f69beddea` (v17) anticipated this: [C] 'function void PrepareToInvalidateCacheTuple(Relation, HeapTuple, HeapTuple, void (int, typedef uint32, typedef Oid))' has some sub-type changes: parameter 5 of type 'void' was added parameter 4 of type 'void (int, typedef uint32, typedef Oid)' changed: pointer type changed from: 'void (int, typedef uint32, typedef Oid)' to: 'void (int, typedef uint32, typedef Oid, void)' Discussion: https://postgr.es/m/20240523000548.58.nmisch@google.com Backpatch-through: 14-17	2025-12-17 09:48:56 -08:00
Noah Misch	27e4fad980	Assert lack of hazardous buffer locks before possible catalog read. Commit `0bada39c83` fixed a bug of this kind, which existed in all branches for six days before detection. While the probability of reaching the trouble was low, the disruption was extreme. No new backends could start, and service restoration needed an immediate shutdown. Hence, add this to catch the next bug like it. The new check in RelationIdGetRelation() suffices to make autovacuum detect the bug in commit `243e9b40f1` that led to commit `0bada39`. This also checks in a number of similar places. It replaces each Assert(IsTransactionState()) that pertained to a conditional catalog read. Back-patch to v14 - v17. This is a back-patch of commit `f4ece891fc` (from before v18 branched) to all supported branches, to accompany the back-patch of commits `243e9b4` and `0bada39`. For catalog indexes, the bttextcmp() behavior that motivated IsCatalogTextUniqueIndexOid() was v18-specific. Hence, this back-patch doesn't need that or its correction from commit `4a4ee0c2c1`. Reported-by: Alexander Lakhin <exclusion@gmail.com> Discussion: https://postgr.es/m/20250410191830.0e.nmisch@google.com Discussion: https://postgr.es/m/10ec0bc3-5933-1189-6bb8-5dec4114558e@gmail.com Backpatch-through: 14-17	2025-12-16 16:13:55 -08:00
Noah Misch	720e9304fa	WAL-log inplace update before revealing it to other sessions. A buffer lock won't stop a reader having already checked tuple visibility. If a vac_update_datfrozenid() and then a crash happened during inplace update of a relfrozenxid value, datfrozenxid could overtake relfrozenxid. That could lead to "could not access status of transaction" errors. Back-patch to v14 - v17. This is a back-patch of commits: - `8e7e672cda` (main change, on master, before v18 branched) - `8180136652` (defect fix, on master, before v18 branched) It reverses commit `bc6bad8857`, my revert of the original back-patch. In v14, this also back-patches the assertion removal from commit `7fcf2faf9c`. Discussion: https://postgr.es/m/20240620012908.92.nmisch@google.com Backpatch-through: 14-17	2025-12-16 16:13:55 -08:00
Noah Misch	1d7b02711f	For inplace update, send nontransactional invalidations. The inplace update survives ROLLBACK. The inval didn't, so another backend's DDL could then update the row without incorporating the inplace update. In the test this fixes, a mix of CREATE INDEX and ALTER TABLE resulted in a table with an index, yet relhasindex=f. That is a source of index corruption. Back-patch to v14 - v17. This is a back-patch of commits: - `243e9b40f1` (main change, on master, before v18 branched) - `0bada39c83` (defect fix, on master, before v18 branched) - `bae8ca82fd` (cosmetics from post-commit review, on REL_18_STABLE) It reverses commit `c1099dd745`, my revert of the original back-patch of `243e9b4`. This back-patch omits the non-comment heap_decode() changes. I find those changes removed harmless code that was last necessary in v13. See discussion thread for details. The back branches aren't the place to remove such code. Like the original back-patch, this doesn't change WAL, because these branches use end-of-recovery SIResetAll(). All branches change the ABI of extern function PrepareToInvalidateCacheTuple(). No PGXN extension calls that, and there's no apparent use case in extensions. Expect ".abi-compliance-history" edits to follow. Reviewed-by: Paul A Jungwirth <pj@illuminatedcomputing.com> Reviewed-by: Surya Poondla <s_poondla@apple.com> Reviewed-by: Ilyasov Ian <ianilyasov@outlook.com> Reviewed-by: Nitin Motiani <nitinmotiani@google.com> (in earlier versions) Reviewed-by: Andres Freund <andres@anarazel.de> (in earlier versions) Discussion: https://postgr.es/m/20240523000548.58.nmisch@google.com Backpatch-through: 14-17	2025-12-16 16:13:55 -08:00
Michael Paquier	ed75434c45	Reorder two functions in inval.c This file separates public and static functions with a separator comment, but two routines were not defined in a location reflecting that, so reorder them. Back-patch commit `c2bdd2c5b1` to v15 - v16. This avoids merge conflicts in the next commit, which modifies a function this moved. Exclude v14, which is so different that the merge conflict savings would be immaterial. Author: Aleksander Alekseev <aleksander@timescale.com> Reviewed-by: Álvaro Herrera <alvherre@kurilemu.de> Reviewed-by: Michael Paquier <michael@paquier.xyz> Discussion: https://postgr.es/m/CAJ7c6TMX2dd0g91UKvcC+CVygKQYJkKJq1+ZzT4rOK42+b53=w@mail.gmail.com Backpatch-through: 15-16	2025-12-16 16:13:55 -08:00
Jeff Davis	b80227c0a5	Fix multibyte issue in ltree_strncasecmp(). Previously, the API for ltree_strncasecmp() took two inputs but only one length (that of the smaller input). It truncated the larger input to that length, but that could break a multibyte sequence. Change the API to be a check for prefix equality (possibly case-insensitive) instead, which is all that's needed by the callers. Also, provide the lengths of both inputs. Reviewed-by: Chao Li <li.evan.chao@gmail.com> Reviewed-by: Peter Eisentraut <peter@eisentraut.org> Discussion: https://postgr.es/m/5f65b85740197ba6249ea507cddf609f84a6188b.camel%40j-davis.com Backpatch-through: 14	2025-12-16 10:36:48 -08:00
Robert Haas	12c2f843cd	Switch memory contexts in ReinitializeParallelDSM. We already do this in CreateParallelContext, InitializeParallelDSM, and LaunchParallelWorkers. I suspect the reason why the matching logic was omitted from ReinitializeParallelDSM is that I failed to realize that any memory allocation was happening here -- but shm_mq_attach does allocate, which could result in a shm_mq_handle being allocated in a shorter-lived context than the ParallelContext which points to it. That could result in a crash if the shorter-lived context is freed before the parallel context is destroyed. As far as I am currently aware, there is no way to reach a crash using only code that is present in core PostgreSQL, but extensions could potentially trip over this. Fixing this in the back-branches appears low-risk, so back-patch to all supported versions. Author: Jakub Wartak <jakub.wartak@enterprisedb.com> Co-authored-by: Jeevan Chalke <jeevan.chalke@enterprisedb.com> Backpatch-through: 14 Discussion: http://postgr.es/m/CAKZiRmwfVripa3FGo06=5D1EddpsLu9JY2iJOTgbsxUQ339ogQ@mail.gmail.com	2025-12-16 10:59:02 -05:00
Michael Paquier	1aa57e9ed5	Fail recovery when missing redo checkpoint record without backup_label This commit adds an extra check at the beginning of recovery to ensure that the redo record of a checkpoint exists before attempting WAL replay, logging a PANIC if the redo record referenced by the checkpoint record could not be found. This is the same level of failure as when a checkpoint record is missing. This check is added when a cluster is started without a backup_label, after retrieving its checkpoint record. The redo LSN used for the check is retrieved from the checkpoint record successfully read. In the case where a backup_label exists, the startup process already fails if the redo record cannot be found after reading a checkpoint record at the beginning of recovery. Previously, the presence of the redo record was not checked. If the redo and checkpoint records were located on different WAL segments, it would be possible to miss an entire range of WAL records that should have been replayed but were just ignored. The consequences of missing the redo record depend on the version dealt with, these becoming worse the older the version used: - On HEAD, v18 and v17, recovery fails with a pointer dereference at the beginning of the redo loop, as the redo record is expected but cannot be found. These versions are good students, because we detect a failure before doing anything, even if the failure is misleading in the shape of a segmentation fault, giving no information that the redo record is missing. - In v16 and v15, problems show at the end of recovery within FinishWalRecovery(), the startup process using a buggy LSN to decide from where to start writing WAL. The cluster gets corrupted, still it is noisy about it. - v14 and older versions are worse: a cluster gets corrupted but it is entirely silent about the matter. The redo record missing causes the startup process to skip recovery entirely, because a missing record is the same as no redo being required at all. 
This leads to data loss, as everything is missed between the redo record and the checkpoint record. Note that I have tested that down to 9.4, reproducing the issue with a slightly modified version of the author's reproducer. The code is wrong since at least 9.2, but I did not look at the exact point of origin. This problem has been found by debugging a cluster where the WAL segment including the redo segment was missing due to an operator error, leading to a crash, based on an investigation in v15. Requesting archive recovery with the creation of a recovery.signal or a standby.signal even without a backup_label would mitigate the issue: if the record cannot be found in pg_wal/, the missing segment can be retrieved with a restore_command when checking that the redo record exists. This was already the case without this commit, where recovery would re-fetch the WAL segment that includes the redo record. The check introduced by this commit causes the segment to be retrieved earlier, to make sure that the redo record can be found. On HEAD, the code will be slightly changed in a follow-up commit to not rely on a PANIC, to include a test able to emulate the original problem. This is a minimal backpatchable fix, kept separated for clarity. Reported-by: Andres Freund <andres@anarazel.de> Analyzed-by: Andres Freund <andres@anarazel.de> Author: Nitin Jadhav <nitinjadhavpostgres@gmail.com> Discussion: https://postgr.es/m/20231023232145.cmqe73stvivsmlhs@awork3.anarazel.de Discussion: https://postgr.es/m/CAMm1aWaaJi2w49c0RiaDBfhdCL6ztbr9m=daGqiOuVdizYWYaA@mail.gmail.com Backpatch-through: 14	2025-12-16 13:29:41 +09:00
Heikki Linnakangas	7d42e2367c	Clarify comment on multixid offset wraparound check Coverity complained that offset cannot be 0 here because there's an explicit check for "offset == 0" earlier in the function, but it didn't see the possibility that offset could've wrapped around to 0. The code is correct, but clarify the comment about it. The same code exists in backbranches in the server GetMultiXactIdMembers() function and in 'master' in the pg_upgrade GetOldMultiXactIdSingleMember function. In backbranches Coverity didn't complain about it because the check was merely an assertion, but change the comment in all supported branches for consistency. Per Tom Lane's suggestion. Discussion: https://www.postgresql.org/message-id/1827755.1765752936@sss.pgh.pa.us	2025-12-15 11:48:02 +02:00
Michael Paquier	5a4dc4aabd	Fix allocation formula in llvmjit_expr.c An array of LLVMBasicBlockRef is allocated with the size used for an element being "LLVMBasicBlockRef *" rather than "LLVMBasicBlockRef". LLVMBasicBlockRef is a type that refers to a pointer, so this did not directly cause a problem because both should have the same size, still it is incorrect. This issue has been spotted while reviewing a different patch, and exists since `2a0faed9d7`, so backpatch all the way down. Discussion: https://postgr.es/m/CA+hUKGLngd9cKHtTUuUdEo2eWEgUcZ_EQRbP55MigV2t_zTReg@mail.gmail.com Backpatch-through: 14	2025-12-11 10:25:48 +09:00
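The sizing mistake described in `5a4dc4aabd` is easy to reproduce in isolation. The sketch below uses a hypothetical pointer-typed handle, `BlockRef`, as a stand-in for `LLVMBasicBlockRef`, and uses `calloc` rather than PostgreSQL's allocator; it is an illustration of the idiom, not the patched code.

```c
#include <stdlib.h>

/* Hypothetical stand-in for an opaque, pointer-typed handle. */
typedef struct OpaqueBlock *BlockRef;

/*
 * Allocate an array of nblocks handles.
 *
 * Writing the element size as sizeof(*blocks) ties it to the element
 * type (BlockRef). Writing sizeof(BlockRef *) would name the wrong type;
 * it happens to give the same number only because both are pointers.
 */
static BlockRef *
alloc_block_array(size_t nblocks)
{
    BlockRef *blocks = calloc(nblocks, sizeof(*blocks));

    return blocks;              /* caller must free(); NULL on failure */
}
```

Deriving the size from the variable being assigned keeps the formula correct even if the element type later changes to something that is not pointer-sized.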
Thomas Munro	d62a258cd4	Fix O_CLOEXEC flag handling in Windows port. PostgreSQL's src/port/open.c has always set bInheritHandle = TRUE when opening files on Windows, making all file descriptors inheritable by child processes. This meant the O_CLOEXEC flag, added to many call sites by commit `1da569ca1f` (v16), was silently ignored. The original commit included a comment suggesting that our open() replacement doesn't create inheritable handles, but it was a misunderstanding of the code path. In practice, the code was creating inheritable handles in all cases. This hasn't caused widespread problems because most child processes (archive_command, COPY PROGRAM, etc.) operate on file paths passed as arguments rather than inherited file descriptors. Even if a child wanted to use an inherited handle, it would need to learn the numeric handle value, which isn't passed through our IPC mechanisms. Nonetheless, the current behavior is wrong. It violates documented O_CLOEXEC semantics, contradicts our own code comments, and makes PostgreSQL behave differently on Windows than on Unix. It also creates potential issues with future code or security auditing tools. To fix, define O_CLOEXEC to _O_NOINHERIT in master, previously used by O_DSYNC. We use different values in the back branches to preserve existing values. In pgwin32_open_handle() we set bInheritHandle according to whether O_CLOEXEC is specified, for the same atomic semantics as POSIX in multi-threaded programs that create processes. Backpatch-through: 16 Author: Bryan Green <dbryan.green@gmail.com> Co-authored-by: Thomas Munro <thomas.munro@gmail.com> (minor adjustments) Discussion: https://postgr.es/m/e2b16375-7430-4053-bda3-5d2194ff1880%40gmail.com	2025-12-10 09:11:19 +13:00
Dean Rasheed	8348004b54	doc: Fix statement about ON CONFLICT and deferrable constraints. The description of deferrable constraints in create_table.sgml states that deferrable constraints cannot be used as conflict arbitrators in an INSERT with an ON CONFLICT DO UPDATE clause, but in fact this restriction applies to all ON CONFLICT clauses, not just those with DO UPDATE. Fix this, and while at it, change the word "arbitrators" to "arbiters", to match the terminology used elsewhere. Author: Dean Rasheed <dean.a.rasheed@gmail.com> Discussion: https://postgr.es/m/CAEZATCWsybvZP3ce8rGcVNx-QHuDOJZDz8y=p1SzqHwjRXyV4Q@mail.gmail.com Backpatch-through: 14	2025-12-09 10:49:18 +00:00
David Rowley	08e1ea3b28	Doc: fix typo in hash index documentation Plus a similar fix to the README. Backpatch as far back as the sgml issue exists. The README issue does exist in v14, but that seems unlikely to harm anyone. Author: David Geier <geidav.pg@gmail.com> Discussion: https://postgr.es/m/ed3db7ea-55b4-4809-86af-81ad3bb2c7d3@gmail.com Backpatch-through: 15	2025-12-09 14:43:03 +13:00
Heikki Linnakangas	4d689a1769	Fix setting next multixid's offset at offset wraparound In commit `789d65364c`, we started updating the next multixid's offset too when recording a multixid, so that it can always be used to calculate the number of members. I got it wrong at offset wraparound: we need to skip over offset 0. Fix that. Discussion: https://www.postgresql.org/message-id/d9996478-389a-4340-8735-bfad456b313c@iki.fi Backpatch-through: 14	2025-12-05 11:36:36 +02:00
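The wraparound case fixed in `4d689a1769` comes down to unsigned arithmetic landing on a reserved value. The sketch below assumes, for illustration only, that offsets are 32-bit unsigned, that 0 is reserved, and that landing on 0 is resolved by moving to 1; `next_member_offset` is a hypothetical name, not a function from the PostgreSQL source.

```c
#include <stdint.h>

/*
 * Compute the offset that follows a multixid starting at "offset" with
 * "nmembers" members. Unsigned addition wraps modulo 2^32; if the result
 * is exactly 0, skip over it, since 0 is assumed to be reserved.
 */
static uint32_t
next_member_offset(uint32_t offset, uint32_t nmembers)
{
    uint32_t next = offset + nmembers;

    if (next == 0)
        next = 1;

    return next;
}
```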
Michael Paquier	b38feca1ce	Show version of nodes in output of TAP tests This commit adds the version information of a node initialized by Cluster.pm, that may vary depending on the install_path given by the test. The code was written so as the node information, that includes the version number, was dumped before the version number was set. This is particularly useful for the pg_upgrade TAP tests, that may mix several versions for cross-version runs. The TAP infrastructure also allows mixing nodes with different versions, so this information can be useful for out-of-core tests. Backpatch down to v15, where Cluster.pm and the pg_upgrade TAP tests have been introduced. Author: Potapov Alexander <a.potapov@postgrespro.com> Reviewed-by: Daniel Gustafsson <daniel@yesql.se> Discussion: https://postgr.es/m/e59bb-692c0a80-5-6f987180@170377126 Backpatch-through: 15	2025-12-05 09:21:20 +09:00
Heikki Linnakangas	6351669130	Set next multixid's offset when creating a new multixid With this commit, the next multixid's offset will always be set on the offsets page, by the time that a backend might try to read it, so we no longer need the waiting mechanism with the condition variable. In other words, this eliminates "corner case 2" mentioned in the comments. The waiting mechanism was broken in a few scenarios: - When nextMulti was advanced without WAL-logging the next multixid. For example, if a later multixid was already assigned and WAL-logged before the previous one was WAL-logged, and then the server crashed. In that case the next offset would never be set in the offsets SLRU, and a query trying to read it would get stuck waiting for it. Same thing could happen if pg_resetwal was used to forcibly advance nextMulti. - In hot standby mode, a deadlock could happen where one backend waits for the next multixid assignment record, but WAL replay is not advancing because of a recovery conflict with the waiting backend. The old TAP test used carefully placed injection points to exercise the old waiting code, but now that the waiting code is gone, much of the old test is no longer relevant. Rewrite the test to reproduce the IPC/MultixactCreation hang after crash recovery instead, and to verify that previously recorded multixids stay readable. Backpatch to all supported versions. In back-branches, we still need to be able to read WAL that was generated before this fix, so in the back-branches this includes a hack to initialize the next offsets page when replaying XLOG_MULTIXACT_CREATE_ID for the last multixid on a page. On 'master', bump XLOG_PAGE_MAGIC instead to indicate that the WAL is not compatible. 
Author: Andrey Borodin <amborodin@acm.org> Reviewed-by: Dmitry Yurichev <dsy.075@yandex.ru> Reviewed-by: Álvaro Herrera <alvherre@kurilemu.de> Reviewed-by: Kirill Reshke <reshkekirill@gmail.com> Reviewed-by: Ivan Bykov <i.bykov@modernsys.ru> Reviewed-by: Chao Li <li.evan.chao@gmail.com> Discussion: https://www.postgresql.org/message-id/172e5723-d65f-4eec-b512-14beacb326ce@yandex.ru Backpatch-through: 14	2025-12-03 19:15:24 +02:00
Heikki Linnakangas	1829016268	Fix amcheck's handling of half-dead B-tree pages amcheck incorrectly reported the following error if there were any half-dead pages in the index: ERROR: mismatch between parent key and child high key in index "amchecktest_id_idx" It's expected that a half-dead page does not have a downlink in the parent level, so skip the test. Reported-by: Konstantin Knizhnik <knizhnik@garret.ru> Reviewed-by: Peter Geoghegan <pg@bowt.ie> Reviewed-by: Mihail Nikalayeu <mihailnikalayeu@gmail.com> Discussion: https://www.postgresql.org/message-id/33e39552-6a2a-46f3-8b34-3f9f8004451f@garret.ru Backpatch-through: 14	2025-12-02 21:15:48 +02:00
Heikki Linnakangas	f2a6df9fd5	Fix amcheck's handling of incomplete root splits in B-tree When the root page is being split, it's normal that root page according to the metapage is not marked BTP_ROOT. Fix bogus error in amcheck about that case. Reviewed-by: Peter Geoghegan <pg@bowt.ie> Discussion: https://www.postgresql.org/message-id/abd65090-5336-42cc-b768-2bdd66738404@iki.fi Backpatch-through: 14	2025-12-02 21:15:43 +02:00
Dean Rasheed	4d288e33b9	Avoid rewriting data-modifying CTEs more than once. Formerly, when updating an auto-updatable view, or a relation with rules, if the original query had any data-modifying CTEs, the rewriter would rewrite those CTEs multiple times as RewriteQuery() recursed into the product queries. In most cases that was harmless, because RewriteQuery() is mostly idempotent. However, if the CTE involved updating an always-generated column, it would trigger an error because any subsequent rewrite would appear to be attempting to assign a non-default value to the always-generated column. This could perhaps be fixed by attempting to make RewriteQuery() fully idempotent, but that looks quite tricky to achieve, and would probably be quite fragile, given that more generated-column-type features might be added in the future. Instead, fix by arranging for RewriteQuery() to rewrite each CTE exactly once (by tracking the number of CTEs already rewritten as it recurses). This has the advantage of being simpler and more efficient, but it does make RewriteQuery() dependent on the order in which rewriteRuleAction() joins the CTE lists from the original query and the rule action, so care must be taken if that is ever changed. Reported-by: Bernice Southey <bernice.southey@gmail.com> Author: Bernice Southey <bernice.southey@gmail.com> Author: Dean Rasheed <dean.a.rasheed@gmail.com> Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us> Reviewed-by: Kirill Reshke <reshkekirill@gmail.com> Discussion: https://postgr.es/m/CAEDh4nyD6MSH9bROhsOsuTqGAv_QceU_GDvN9WcHLtZTCYM1kA@mail.gmail.com Backpatch-through: 14	2025-11-29 12:33:04 +00:00
Tom Lane	b497766a8e	Allow indexscans on partial hash indexes with implied quals. Normally, if a WHERE clause is implied by the predicate of a partial index, we drop that clause from the set of quals used with the index, since it's redundant to test it if we're scanning that index. However, if it's a hash index (or any !amoptionalkey index), this could result in dropping all available quals for the index's first key, preventing us from generating an indexscan. It's fair to question the practical usefulness of this case. Since hash only supports equality quals, the situation could only arise if the index's predicate is "WHERE indexkey = constant", implying that the index contains only one hash value, which would make hash a really poor choice of index type. However, perhaps there are other !amoptionalkey index AMs out there with which such cases are more plausible. To fix, just don't filter the candidate indexquals this way if the index is !amoptionalkey. That's a bit hokey because it may result in testing quals we didn't need to test, but to do it more accurately we'd have to redundantly identify which candidate quals are actually usable with the index, something we don't know at this early stage of planning. Doesn't seem worth the effort. Reported-by: Sergei Glukhov <s.glukhov@postgrespro.ru> Author: Tom Lane <tgl@sss.pgh.pa.us> Reviewed-by: David Rowley <dgrowleyml@gmail.com> Discussion: https://postgr.es/m/e200bf38-6b45-446a-83fd-48617211feff@postgrespro.ru Backpatch-through: 14	2025-11-27 13:09:59 -05:00
Fujii Masao	fc6e1a0f2b	doc: Fix misleading synopsis for CREATE/ALTER PUBLICATION. The documentation for CREATE/ALTER PUBLICATION previously showed: [ ONLY ] table_name [ * ] [ ( column_name [, ... ] ) ] [ WHERE ( expression ) ] [, ... ] to indicate that the table/column specification could be repeated. However, placing [, ... ] directly after a multi-part construct was misleading and made it unclear which portion was repeatable. This commit introduces a new term, table_and_columns, to represent: [ ONLY ] table_name [ * ] [ ( column_name [, ... ] ) ] [ WHERE ( expression ) ] and updates the synopsis to use: table_and_columns [, ... ] which clearly identifies the repeatable element. Backpatched to v15, where the misleading syntax was introduced. Author: Peter Smith <smithpb2250@gmail.com> Reviewed-by: Chao Li <lic@highgo.com> Reviewed-by: Fujii Masao <masao.fujii@gmail.com> Discussion: https://postgr.es/m/CAHut+PtsyvYL3KmA6C8f0ZpXQ=7FEqQtETVy-BOF+cm9WPvfMQ@mail.gmail.com Backpatch-through: 15	2025-11-27 23:31:43 +09:00
Daniel Gustafsson	54ba4a66fd	doc: Clarify passphrase command reloading on Windows When running on Windows (or EXEC_BACKEND) the SSL configuration will be reloaded on each backend start, so the passphrase command will be reloaded along with it. This implies that passphrase command reload must be enabled on Windows for connections to work at all. Document this since it wasn't mentioned explicitly, and while there add markup for parameter value to match the rest of the docs. Backpatch to all supported versions. Author: Daniel Gustafsson <daniel@yesql.se> Reviewed-by: Chao Li <li.evan.chao@gmail.com> Reviewed-by: Álvaro Herrera <alvherre@kurilemu.de> Reviewed-by: Peter Eisentraut <peter@eisentraut.org> Discussion: https://postgr.es/m/5F301096-921A-427D-8EC1-EBAEC2A35082@yesql.se Backpatch-through: 14	2025-11-26 14:24:04 +01:00
Andres Freund	89c8a1b906	lwlock: Fix, currently harmless, bug in LWLockWakeup() Accidentally the code in LWLockWakeup() checked the list of to-be-woken up processes to see if LW_FLAG_HAS_WAITERS should be unset. That means that HAS_WAITERS would not get unset immediately, but only during the next, unnecessary, call to LWLockWakeup(). Luckily, as the code stands, this is just a small efficiency issue. However, if there were (as in a patch of mine) a case in which LWLockWakeup() would not find any backend to wake, despite the wait list not being empty, we'd wrongly unset LW_FLAG_HAS_WAITERS, leading to potentially hanging. While the consequences in the backbranches are limited, the code as-is is confusing, and it is possible that there are workloads where the additional wait list lock acquisitions hurt, therefore backpatch. Discussion: https://postgr.es/m/fvfmkr5kk4nyex56ejgxj3uzi63isfxovp2biecb4bspbjrze7@az2pljabhnff Backpatch-through: 14	2025-11-24 17:39:58 -05:00
David Rowley	14cdab0292	Fix incorrect IndexOptInfo header comment The comment incorrectly indicated that indexcollations[] stored collations for both key columns and INCLUDE columns, but in reality it only has elements for the key columns. canreturn[] didn't get a mention, so add that while we're here. Author: Junwang Zhao <zhjwpku@gmail.com> Reviewed-by: David Rowley <dgrowleyml@gmail.com> Discussion: https://postgr.es/m/CAEG8a3LwbZgMKOQ9CmZarX5DEipKivdHp5PZMOO-riL0w%3DL%3D4A%40mail.gmail.com Backpatch-through: 14	2025-11-24 17:01:34 +13:00
Thomas Munro	600acd34b0	jit: Adjust AArch64-only code for LLVM 21. LLVM 21 changed the arguments of RTDyldObjectLinkingLayer's constructor, breaking compilation with the backported SectionMemoryManager from commit `9044fc1d`. `cd585864c0` Backpatch-through: 14 Author: Holger Hoffstätte <holger@applied-asynchrony.com> Reviewed-by: Anthonin Bonnefoy <anthonin.bonnefoy@datadoghq.com> Discussion: https://postgr.es/m/d25e6e4a-d1b4-84d3-2f8a-6c45b975f53d%40applied-asynchrony.com	2025-11-22 21:23:23 +13:00
Heikki Linnakangas	890cc81b6e	Print new OldestXID value in pg_resetwal when it's being changed Commit `74cf7d46a9` added the --oldest-transaction-id option to pg_resetwal, but forgot to update the code that prints all the new values that are being set. Fix that. Reviewed-by: Bertrand Drouvot <bertranddrouvot.pg@gmail.com> Discussion: https://www.postgresql.org/message-id/5461bc85-e684-4531-b4d2-d2e57ad18cba@iki.fi Backpatch-through: 14	2025-11-19 18:06:24 +02:00
Tom Lane	1c8c3206f4	Don't allow CTEs to determine semantic levels of aggregates. The fix for bug #19055 (commit `b0cc0a71e`) allowed CTE references in sub-selects within aggregate functions to affect the semantic levels assigned to such aggregates. It turns out this broke some related cases, leading to assertion failures or strange planner errors such as "unexpected outer reference in CTE query". After experimenting with some alternative rules for assigning the semantic level in such cases, we've come to the conclusion that changing the level is more likely to break things than be helpful. Therefore, this patch undoes what `b0cc0a71e` changed, and instead installs logic to throw an error if there is any reference to a CTE that's below the semantic level that standard SQL rules would assign to the aggregate based on its contained Var and Aggref nodes. (The SQL standard disallows sub-selects within aggregate functions, so it can't reach the troublesome case and hence has no rule for what to do.) Perhaps someone will come along with a legitimate query that this logic rejects, and if so probably the example will help us craft a level-adjustment rule that works better than what `b0cc0a71e` did. I'm not holding my breath for that though, because the previous logic had been there for a very long time before bug #19055 without complaints, and that bug report sure looks to have originated from fuzzing not from real usage. Like `b0cc0a71e`, back-patch to all supported branches, though sadly that no longer includes v13. Bug: #19106 Reported-by: Kamil Monicz <kamil@monicz.dev> Author: Tom Lane <tgl@sss.pgh.pa.us> Discussion: https://postgr.es/m/19106-9dd3668a0734cd72@postgresql.org Backpatch-through: 14	2025-11-18 12:56:55 -05:00
Nathan Bossart	9a991d414e	Update .abi-compliance-history for change to CreateStatistics(). As noted in the commit message for `5e4fcbe531`, the addition of a second parameter to CreateStatistics() breaks ABI compatibility, but we are unaware of any impacted third-party code. This commit updates .abi-compliance-history accordingly. Backpatch-through: 14-18	2025-11-17 14:14:41 -06:00
Thomas Munro	a1407daded	Define PS_USE_CLOBBER_ARGV on GNU/Hurd. Until `d2ea2d310d`, the PS_USE_PS_STRINGS option was used on the GNU/Hurd. As this option got removed and PS_USE_CLOBBER_ARGV appears to work fine nowadays on the Hurd, define this one to re-enable process title changes on this platform. In the 14 and 15 branches, the existing test for __hurd__ (added 25 years ago by commit `209aa77d`, removed in 16 by the above commit) is left unchanged for now as it was activating slightly different code paths and would need investigation by a Hurd user. Author: Michael Banck <mbanck@debian.org> Discussion: https://postgr.es/m/CA%2BhUKGJMNGUAqf27WbckYFrM-Mavy0RKJvocfJU%3DJ2XcAZyv%2Bw%40mail.gmail.com Backpatch-through: 16	2025-11-17 12:48:22 +13:00
David Rowley	2791d49879	Doc: include MERGE in variable substitution command list Backpatch to 15, where MERGE was introduced. Reported-by: <emorgunov@mail.ru> Author: David Rowley <dgrowleyml@gmail.com> Discussion: https://postgr.es/m/176278494385.770.15550176063450771532@wrigleys.postgresql.org Backpatch-through: 15	2025-11-17 10:52:51 +13:00
Nathan Bossart	414e1ece9d	Add note about CreateStatistics()'s selective use of check_rights. Commit `5e4fcbe531` added a check_rights parameter to this function for use by ALTER TABLE commands that re-create statistics objects. However, we intentionally ignore check_rights when verifying relation ownership because this function's lookup could return a different answer than the caller's. This commit adds a note to this effect so that we remember it down the road. Reviewed-by: Noah Misch <noah@leadboat.com> Backpatch-through: 14	2025-11-14 13:20:09 -06:00
Dean Rasheed	8d43607cd4	doc: Improve description of RLS policies applied by command type. On the CREATE POLICY page, the "Policies Applied by Command Type" table was missing MERGE ... THEN DELETE and some of the policies applied during INSERT ... ON CONFLICT and MERGE. Fix that, and try to improve readability by listing the various MERGE cases separately, rather than together with INSERT/UPDATE/DELETE. Mention COPY ... TO along with SELECT, since it behaves in the same way. In addition, document which policy violations cause errors to be thrown, and which just cause rows to be silently ignored. Also, a paragraph above the table states that INSERT ... ON CONFLICT DO UPDATE only checks the WITH CHECK expressions of INSERT policies for rows appended to the relation by the INSERT path, which is incorrect -- all rows proposed for insertion are checked, regardless of whether they end up being inserted. Fix that, and also mention that the same applies to INSERT ... ON CONFLICT DO NOTHING. In addition, in various other places on that page, clarify how the different types of policy are applied to different commands, and whether or not errors are thrown when policy checks do not pass. Backpatch to all supported versions. Prior to v17, MERGE did not support RETURNING, and so MERGE ... THEN INSERT would never check new rows against SELECT policies. Prior to v15, MERGE was not supported at all. Author: Dean Rasheed <dean.a.rasheed@gmail.com> Reviewed-by: Viktor Holmberg <v@viktorh.net> Reviewed-by: Jian He <jian.universality@gmail.com> Discussion: https://postgr.es/m/CAEZATCWqnfeChjK=n1V_dYZT4rt4mnq+ybf9c0qXDYTVMsy8pg@mail.gmail.com Backpatch-through: 14	2025-11-13 12:03:52 +00:00

57753 commits