postgresql

mirror of https://github.com/postgres/postgres.git synced 2026-03-23 02:43:22 -04:00

Author	SHA1	Message	Date
Heikki Linnakangas	f30cebb954	Fix pointer type of ShmemAllocatorData->index This went unnoticed in commit `e2362eb2bd` because the pointer is cast to/from a void pointer.	2026-03-13 11:00:15 +02:00
Peter Eisentraut	59292f7aac	Change copyObject() to use typeof_unqual Currently, when the argument of copyObject() is const-qualified, the return type is also, because the use of typeof carries over all the qualifiers. This is incorrect, since the point of copyObject() is to make a copy to mutate. But apparently no code ran into it. The new implementation uses typeof_unqual, which drops the qualifiers, making this work correctly. typeof_unqual is standardized in C23, but all recent versions of all the usual compilers support it even in non-C23 mode, at least as __typeof_unqual__. We add a configure/meson test for typeof_unqual and __typeof_unqual__ and use it if it's available, else we use the existing fallback of just returning void *. This is the second attempt, after the first attempt in commit `4cfce4e62c` was reverted. The following two points address problems with the earlier version: We test the underscore variant first so that there is a higher chance that clang used for bitcode also supports it, since we don't test that separately. Unlike the typeof test, the typeof_unqual test also tests with a void pointer similar to how copyObject() would use it, because that is not handled by MSVC, so we want the test to fail there. Reviewed-by: David Geier <geidav.pg@gmail.com> Reviewed-by: Jelte Fennema-Nio <postgres@jeltef.nl> Discussion: https://www.postgresql.org/message-id/flat/92f9750f-c7f6-42d8-9a4a-85a3cbe808f3%40eisentraut.org	2026-03-13 07:06:57 +01:00
Andrew Dunstan	a0b6ef29a5	Enable fast default for domains with non-volatile constraints Previously, ALTER TABLE ADD COLUMN always forced a table rewrite when the column type was a domain with constraints (CHECK or NOT NULL), even if the default value satisfied those constraints. This was because contain_volatile_functions() considers CoerceToDomain immutable, so the code conservatively assumed any constrained domain might fail. Improve this by using soft error handling (ErrorSaveContext) to evaluate the CoerceToDomain expression at ALTER TABLE time. If the default value passes the domain's constraints, the value is stored as a "missing" attribute default and no table rewrite is needed. If the constraint check fails, we fall back to a table rewrite, preserving the historical behavior that constraint violations are only raised when the table actually contains rows. Domains with volatile constraint expressions always require a table rewrite since the constraint result could differ per evaluation and cannot be cached. Author: Jian He <jian.universality@gmail.com> Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us> Reviewed-by: Andrew Dunstan <andrew@dunslane.net> Reviewed-by: Viktor Holmberg <viktor.holmberg@aiven.io> Discussion: https://postgr.es/m/CACJufxE_+iZBR1i49k_AHigppPwLTJi6km8NOsC7FWvKdEmmXg@mail.gmail.com	2026-03-12 18:05:01 -04:00
Andrew Dunstan	487cf2cbd2	Extend DomainHasConstraints() to optionally check constraint volatility Add an optional bool *has_volatile output parameter to DomainHasConstraints(). When non-NULL, the function checks whether any CHECK constraint contains a volatile expression. Callers that don't need this information pass NULL and get the same behavior as before. This is needed by a subsequent commit that enables the fast default optimization for domains with non-volatile constraints: we can safely evaluate such constraints once at ALTER TABLE time, but volatile constraints require a full table rewrite. Author: Jian He <jian.universality@gmail.com> Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us> Reviewed-by: Andrew Dunstan <andrew@dunslane.net> Reviewed-by: Viktor Holmberg <viktor.holmberg@aiven.io> Discussion: https://postgr.es/m/CACJufxE_+iZBR1i49k_AHigppPwLTJi6km8NOsC7FWvKdEmmXg@mail.gmail.com	2026-03-12 18:04:16 -04:00
Peter Geoghegan	a367c433ad	Use simplehash for backend-private buffer pin refcounts. Replace dynahash with simplehash for the per-backend PrivateRefCountHash overflow table. Simplehash generates inlined, open-addressed lookup code, avoiding the per-call overhead of dynahash that becomes noticeable when many buffers are pinned with a CPU-bound workload. Motivated by testing of the index prefetching patch, which pins many more buffers concurrently than typical index scans. Author: Peter Geoghegan <pg@bowt.ie> Suggested-by: Andres Freund <andres@anarazel.de> Reviewed-By: Tomas Vondra <tomas@vondra.me> Reviewed-By: Andres Freund <andres@anarazel.de> Discussion: https://postgr.es/m/CAH2-Wz=g=JTSyDB4UtB5su2ZcvsS7VbP+ZMvvaG6ABoCb+s8Lw@mail.gmail.com	2026-03-12 13:26:16 -04:00
Peter Geoghegan	d071e1cfec	nbtree: Avoid allocating _bt_search stack. Avoid allocating memory for an nbtree descent stack during index scans. We only require a descent stack during inserts, when it is used to determine where to insert a new pivot tuple/downlink into the target leaf page's parent page in the event of a page split. (Page deletion's first phase also performs a _bt_search that requires a descent stack.) This optimization improves performance by minimizing palloc churn. It speeds up index scans that call _bt_search frequently/descend the index many times, especially when the cost of scanning the index dominates (e.g., with index-only skip scans). Testing has shown that the underlying issue causes performance problems for an upcoming patch that will replace btgettuple with a new btgetbatch interface to enable I/O prefetching. Author: Peter Geoghegan <pg@bowt.ie> Reviewed-By: Tomas Vondra <tomas@vondra.me> Discussion: https://postgr.es/m/CAH2-Wzmy7NMba9k8m_VZ-XNDZJEUQBU8TeLEeL960-rAKb-+tQ@mail.gmail.com	2026-03-12 13:22:36 -04:00
Robert Haas	5883ff30b0	Add pg_plan_advice contrib module. Provide a facility that (1) can be used to stabilize certain plan choices so that the planner cannot reverse course without authorization and (2) can be used by knowledgeable users to insist on plan choices contrary to what the planner believes best. In both cases, terrible outcomes are possible: users should think twice and perhaps three times before constraining the planner's ability to do as it thinks best; nevertheless, there are problems that are much more easily solved with these facilities than without them. This patch takes the approach of analyzing a finished plan to produce textual output, which we call "plan advice", that describes key decisions made during plan; if that plan advice is provided during future planning cycles, it will force those key decisions to be made in the same way. Not all planner decisions can be controlled using advice; for example, decisions about how to perform aggregation are currently out of scope, as is choice of sort order. Plan advice can also be edited by the user, or even written from scratch in simple cases, making it possible to generate outcomes that the planner would not have produced. Partial advice can be provided to control some planner outcomes but not others. Currently, plan advice is focused only on specific outcomes, such as the choice to use a sequential scan for a particular relation, and not on estimates that might contribute to those outcomes, such as a possibly-incorrect selectivity estimate. While it would be useful to users to be able to provide plan advice that affects selectivity estimates or other aspects of costing, that is out of scope for this commit. Reviewed-by: Lukas Fittl <lukas@fittl.com> Reviewed-by: Jakub Wartak <jakub.wartak@enterprisedb.com> Reviewed-by: Greg Burd <greg@burd.me> Reviewed-by: Jacob Champion <jacob.champion@enterprisedb.com> Reviewed-by: Haibo Yan <tristan.yim@gmail.com> Reviewed-by: Dian Fay <di@nmfay.com> Reviewed-by: Ajay Pal <ajay.pal.k@gmail.com> Reviewed-by: John Naylor <johncnaylorls@gmail.com> Reviewed-by: Alexandra Wang <alexandra.wang.oss@gmail.com> Discussion: http://postgr.es/m/CA+TgmoZ-Jh1T6QyWoCODMVQdhTUPYkaZjWztzP1En4=ZHoKPzw@mail.gmail.com	2026-03-12 13:00:43 -04:00
Michael Paquier	6c228755ad	Use streaming read for VACUUM cleanup of GIN This commit replace the synchronous ReadBufferExtended() loop done in ginvacuumcleanup() with the streaming read equivalent, to improve I/O efficiency during GIN index vacuum cleanup operations. With dm_delay to emulate some latency and debug_io_direct=data to force synchronous writes and force the read path to be exercised, the author has noticed a 5x improvement in runtime, with a substantial reduction in IO stats numbers. I have reproduced similar numbers while running similar tests, with improvements becoming better with more tuples and more pages manipulated. Author: Xuneng Zhou <xunengzhou@gmail.com> Reviewed-by: Michael Paquier <michael@paquier.xyz> Reviewed-by: Nazir Bilal Yavuz <byavuz81@gmail.com> Discussion: https://postgr.es/m/CABPTF7VrqfbcDXqGrdLQ2xaQ=K0RzExNuw6U_GGqzSJu32wfdQ@mail.gmail.com	2026-03-12 11:48:31 +09:00
Richard Guo	383eb21ebf	Convert NOT IN sublinks to anti-joins when safe The planner has historically been unable to convert "x NOT IN (SELECT y ...)" sublinks into anti-joins. This is because standard SQL semantics for NOT IN require that if the comparison "x = y" returns NULL, the "NOT IN" expression evaluates to NULL (effectively false), causing the row to be discarded. In contrast, an anti-join preserves the row if no match is found. Due to this semantic mismatch regarding NULL handling, the conversion was previously considered unsafe. However, if we can prove that neither side of the comparison can yield NULL values, and further that the operator itself cannot return NULL for non-null inputs, the behavior of NOT IN and anti-join becomes identical. Enabling this conversion allows the planner to treat the sublink as a first-class relation rather than an opaque SubPlan filter. This unlocks global join ordering optimization and permits the selection of the most efficient join algorithm based on cost, often yielding significant performance improvements for large datasets. This patch verifies that neither side of the comparison can be NULL and that the operator is safe regarding NULL results before performing the conversion. To verify operator safety, we require that the operator be a member of a B-tree or Hash operator family. This serves as a proxy for standard boolean behavior, ensuring the operator does not return NULL on valid non-null inputs, as doing so would break index integrity. For operand non-nullability, this patch makes use of several existing mechanisms. It leverages the outer-join-aware-Var infrastructure to verify that a Var does not come from the nullable side of an outer join, and consults the NOT-NULL-attnums hash table to efficiently verify schema-level NOT NULL constraints. Additionally, it employs find_nonnullable_vars to identify Vars forced non-nullable by qual clauses, and expr_is_nonnullable to deduce non-nullability for other expression types. The logic for verifying the non-nullability of the subquery outputs was adapted from prior work by David Rowley and Tom Lane. Author: Richard Guo <guofenglinux@gmail.com> Reviewed-by: wenhui qiu <qiuwenhuifx@gmail.com> Reviewed-by: Zhang Mingli <zmlpostgres@gmail.com> Reviewed-by: Japin Li <japinli@hotmail.com> Discussion: https://postgr.es/m/CAMbWs495eF=-fSa5CwJS6B-BaEi3ARp0UNb4Lt3EkgUGZJwkAQ@mail.gmail.com	2026-03-12 09:45:18 +09:00
Andres Freund	6322a028fa	bufmgr: Fix use of wrong variable in GetPrivateRefCountEntrySlow() Unfortunately, in `30df61990c`, I made GetPrivateRefCountEntrySlow() set a wrong cache hint when moving entries from the hash table to the faster array. There are no correctness concerns due to this, just an unnecessary loss of performance. Noticed while testing the index prefetching patch. Discussion: https://postgr.es/m/CAH2-Wz=g=JTSyDB4UtB5su2ZcvsS7VbP+ZMvvaG6ABoCb+s8Lw@mail.gmail.com	2026-03-11 17:52:21 -04:00
Jeff Davis	547c15f9f8	Fix use of volatile. Commit `8185bb5347` misused volatile. Fix it. See also `6307b096e2`. Reported-by: Peter Eisentraut <peter@eisentraut.org> Discussion: https://postgr.es/m/1bb21c7d-885f-4f07-a3ed-21b60d7c92c6@eisentraut.org	2026-03-11 14:27:58 -07:00
Andrew Dunstan	342051d73b	Add support for altering CHECK constraint enforceability This commit adds support for ALTER TABLE ALTER CONSTRAINT ... [NOT] ENFORCED for CHECK constraints. Previously, only foreign key constraints could have their enforceability altered. When changing from NOT ENFORCED to ENFORCED, the operation not only updates catalog information but also performs a full table scan in Phase 3 to validate that existing data satisfies the constraint. For partitioned tables and inheritance hierarchies, the operation recurses to all child tables. When changing to NOT ENFORCED, we must recurse even if the parent is already NOT ENFORCED, since child constraints may still be ENFORCED. Author: Jian He <jian.universality@gmail.com> Reviewed-by: Robert Treat <rob@xzilla.net> Reviewed-by: Kirill Reshke <reshkekirill@gmail.com> Reviewed-by: Amul Sul <sulamul@gmail.com> Reviewed-by: Zsolt Parragi <zsolt.parragi@cybertec.at> Reviewed-by: Andrew Dunstan <andrew@dunslane.net> Discussion: https://postgr.es/m/CACJufxHCh_FU-FsEwsCvg9mN6-5tzR6H9ntn+0KUgTCaerDOmg@mail.gmail.com	2026-03-11 16:15:35 -04:00
Andrew Dunstan	a9747153e1	rename alter constraint enforceability related functions The functions AlterConstrEnforceabilityRecurse and ATExecAlterConstrEnforceability are being renamed to AlterFKConstrEnforceabilityRecurse and ATExecAlterFKConstrEnforceability, respectively. The current alter constraint functions only handle Foreign Key constraints. Renaming them to be more explicit about the constraint type is necessary; otherwise, it will cause confusion when we later introduce the ability to alter the enforceability of other constraints. Author: Jian He <jian.universality@gmail.com> Reviewed-by: Amul Sul <sulamul@gmail.com> Reviewed-by: Kirill Reshke <reshkekirill@gmail.com> Reviewed-by: Robert Treat <rob@xzilla.net> Discussion: https://postgr.es/m/CACJufxHCh_FU-FsEwsCvg9mN6-5tzR6H9ntn+0KUgTCaerDOmg@mail.gmail.com	2026-03-11 16:14:58 -04:00
Andres Freund	a766125efd	bufmgr: Switch to standard order in MarkBufferDirtyHint() When we were updating hint bits with just a share lock MarkBufferDirtyHint() had to use a non-standard order of operations, i.e. WAL log the buffer before marking the buffer dirty. This was required because the lock level used to set hints did not conflict with the lock level that was used to flush pages, which would have allowed flushing the page out before the WAL record. The non-standard order in turn required preventing the checkpoint from starting between writing the WAL record and flushing out the page. Now that setting hints and writing out buffers use share-exclusive, we can revert back to the normal order of operations. Reviewed-by: Heikki Linnakangas <hlinnaka@iki.fi> Discussion: https://postgr.es/m/5ubipyssiju5twkb7zgqwdr7q2vhpkpmuelxfpanetlk6ofnop@hvxb4g2amb2d	2026-03-11 14:58:29 -04:00
Andres Freund	b0f4ff3c92	bufmgr: Remove the, now obsolete, BM_JUST_DIRTIED Due to the recent changes to use a share-exclusive mode for setting hint bits and for flushing pages - instead of using share mode as before - a buffer cannot be dirtied while the flush is ongoing. The reason we needed JUST_DIRTIED was to handle the case where the buffer was dirtied while IO was ongoing - which is not possible anymore. Discussion: https://postgr.es/m/5ubipyssiju5twkb7zgqwdr7q2vhpkpmuelxfpanetlk6ofnop@hvxb4g2amb2d	2026-03-11 14:58:29 -04:00
Melanie Plageman	11e0824bd9	Avoid WAL flush checks for unlogged buffers in GetVictimBuffer() GetVictimBuffer() rejects a victim buffer if it is from a bulkread strategy ring and reusing it would require flushing WAL. Unlogged table buffers can have fake LSNs (e.g. unlogged GiST pages) and calling XLogNeedsFlush() on a fake LSN is meaningless. This is a bit of future-proofing because currently the bulkread strategy is not used for relations with fake LSNs. Author: Melanie Plageman <melanieplageman@gmail.com> Reported-by: Andres Freund <andres@anarazel.de> Reviewed-by: Andres Freund <andres@anarazel.de> Earlier version reviewed-by: Chao Li <li.evan.chao@gmail.com> Discussion: https://postgr.es/m/flat/fmkqmyeyy7bdpvcgkheb6yaqewemkik3ls6aaveyi5ibmvtxnd%40nu2kvy5rq3a6	2026-03-11 14:50:50 -04:00
Tomas Vondra	943e881733	Do not lock in BufferGetLSNAtomic() on archs with 8 byte atomic reads On platforms where we can read or write the whole LSN atomically, we do not need to lock the buffer header to prevent torn LSNs. We can do this only on platforms with PG_HAVE_8BYTE_SINGLE_COPY_ATOMICITY, and when the pd_lsn field is properly aligned. For historical reasons the PageXLogRecPtr was defined as a struct with two uint32 fields. This replaces it with a single uint64 value, to make the intent clearer. To prevent issues with weak typedefs the value is still wrapped in a struct. This also adjusts heapfuncs() in pageinspect, to ensure proper alignment when reading the LSN from a page on alignment-sensitive hardware. Idea by Andres Freund. Initial patch by Andreas Karlsson, improved by Peter Geoghegan. Minor tweaks by me. Author: Andreas Karlsson <andreas@proxel.se> Author: Peter Geoghegan <pg@bowt.ie> Reviewed-by: Andres Freund <andres@anarazel.de> Reviewed-by: Tomas Vondra <tomas@vondra.me> Discussion: https://postgr.es/m/b6610c3b-3f59-465a-bdbb-8e9259f0abc4@proxel.se	2026-03-11 19:46:08 +01:00
Tomas Vondra	b6eb8dde6b	Fix indentation from commit `29a0fb2157` Per buildfarm animal koel	2026-03-11 15:14:46 +01:00
Tomas Vondra	29a0fb2157	Conditional locking in pgaio_worker_submit_internal With io_method=worker, there's a single I/O submission queue. With enough workers, the backends and workers may end up spending a lot of time competing for the AioWorkerSubmissionQueueLock lock. This can happen with workloads that keep the queue full, in which case it's impossible to add requests to the queue. Increasing the number of I/O workers increases the pressure on the lock, worsening the issue. This change improves the situation in two ways: * If AioWorkerSubmissionQueueLock can't be acquired without waiting, the I/O is performed synchronously (as if the queue was full). * When an entry can't be added to a full queue, stop trying to add more entries. All remaining entries are handled as synchronous I/O. The regression was reported by Alexandre Felipe. Investigation and patch by me, based on an idea by Andres Freund. Reported-by: Alexandre Felipe <o.alexandre.felipe@gmail.com> Author: Tomas Vondra <tomas@vondra.me> Reviewed-by: Andres Freund <andres@anarazel.de> Discussion: https://postgr.es/m/CAE8JnxOn4+xUAnce+M7LfZWOqfrMMxasMaEmSKwiKbQtZr65uA@mail.gmail.com	2026-03-11 13:40:23 +01:00
Peter Eisentraut	9c05f152b5	Fixes for C++ typeof implementation This fixes two bugs in commit `1887d822f1`. First, if we are using the fallback C++ implementation of typeof, then we need to include the C++ header <type_traits> for std::remove_reference_t. This header is also likely to be used for other C++ implementations of type tricks, so we'll put it into the global includes. Second, for the case that the C compiler supports typeof in a spelling that is not "typeof" (for example, __typeof__), then we need to #undef typeof in the C++ section to avoid warnings about duplicate macro definitions. Reviewed-by: Jelte Fennema-Nio <postgres@jeltef.nl> Discussion: https://www.postgresql.org/message-id/flat/92f9750f-c7f6-42d8-9a4a-85a3cbe808f3%40eisentraut.org	2026-03-11 11:54:10 +01:00
Peter Eisentraut	d4a080b8a1	Remove Int8GetDatum function We have no uses of Int8GetDatum in our tree and did not have for a long time (or never), and the inverse does not exist either. Author: Kirill Reshke <reshkekirill@gmail.com> Suggested-by: Tom Lane <tgl@sss.pgh.pa.us> Discussion: https://www.postgresql.org/message-id/flat/CALdSSPhFyb9qLSHee73XtZm1CBWJNo9+JzFNf-zUEWCRW5yEiQ@mail.gmail.com	2026-03-11 10:46:08 +01:00
Peter Eisentraut	d537f59fbb	Sort out table_open vs. relation_open in rewriter table_open() is a wrapper around relation_open() that checks that the relkind is table-like and gives a user-facing error message if not. It is best used in directly user-facing areas to check that the user used the right kind of command for the relkind. In internal uses where the relkind was previously checked from the user's perspective, table_open() is not necessary and might even be confusing if it were to give out-of-context error messages. In rewriteHandler.c, there were several such table_open() calls, which this changes to relation_open(). This currently doesn't make a difference, but there are plans to have other relkinds that could appear in the rewriter but that shouldn't be accessible via table-specific commands, and this clears the way for that. Reviewed-by: Ashutosh Bapat <ashutosh.bapat.oss@gmail.com> Discussion: https://www.postgresql.org/message-id/flat/6d3fef19-a420-4e11-8235-8ea534bf2080%40eisentraut.org Discussion: https://www.postgresql.org/message-id/flat/a855795d-e697-4fa5-8698-d20122126567@eisentraut.org	2026-03-11 09:22:11 +01:00
Andres Freund	82467f627b	Require share-exclusive lock to set hint bits and to flush At the moment hint bits can be set with just a share lock on a page (and, until `45f658dacb`, in one case even without any lock). Because of this we need to copy pages while writing them out, as otherwise the checksum could be corrupted. The need to copy the page is problematic to implement AIO writes: 1) Instead of just needing a single buffer for a copied page we need one for each page that's potentially undergoing I/O 2) To be able to use the "worker" AIO implementation the copied page needs to reside in shared memory It also causes problems for using unbuffered/direct-IO, independent of AIO: Some filesystems, raid implementations, ... do not tolerate the data being written out to change during the write. E.g. they may compute internal checksums that can be invalidated by concurrent modifications, leading e.g. to filesystem errors (as the case with btrfs). It also just is plain odd to allow modifications of buffers that are just share locked. To address these issues, this commit changes the rules so that modifications to pages are not allowed anymore while holding a share lock. Instead the new share-exclusive lock (introduced in `fcb9c977aa`) allows at most one backend to modify a buffer while other backends have the same page share locked. An existing share-lock can be upgraded to a share-exclusive lock, if there are no conflicting locks. For that BufferBeginSetHintBits()/BufferFinishSetHintBits() and BufferSetHintBits16() have been introduced. To prevent hint bits from being set while the buffer is being written out, writing out buffers now requires a share-exclusive lock. The use of share-exclusive to gate setting hint bits means that from now on only one backend can set hint bits at a time. To allow multiple backends to set hint bits would require more complicated locking: For setting hint bits we'd need to store the count of backends currently setting hint bits and we would need another lock-level for I/O conflicting with the lock-level to set hint bits. Given that the share-exclusive lock for setting hint bits is only held for a short time, that backends would often just set the same hint bits and that the cost of occasionally not setting hint bits in hotly accessed pages is fairly low, this seems like an acceptable tradeoff. The biggest change to adapt to this is in heapam. To avoid performance regressions for sequential scans that need to set a lot of hint bits, we need to amortize the cost of BufferBeginSetHintBits() for cases where hint bits are set at a high frequency. To that end HeapTupleSatisfiesMVCCBatch() uses the new SetHintBitsExt(), which defers BufferFinishSetHintBits() until all hint bits on a page have been set. Conversely, to avoid regressions in cases where we can't set hint bits in bulk (because we're looking only at individual tuples), use BufferSetHintBits16() when setting hint bits without batching. Several other places also need to be adapted, but those changes are comparatively simpler. After this we do not need to copy buffers to write them out anymore. That change is done separately however. Reviewed-by: Melanie Plageman <melanieplageman@gmail.com> Reviewed-by: Heikki Linnakangas <hlinnaka@iki.fi> Reviewed-by: Chao Li <li.evan.chao@gmail.com> Discussion: https://postgr.es/m/fvfmkr5kk4nyex56ejgxj3uzi63isfxovp2biecb4bspbjrze7@az2pljabhnff Discussion: https://postgr.es/m/stj36ea6yyhoxtqkhpieia2z4krnam7qyetc57rfezgk4zgapf%40gcnactj4z56m	2026-03-10 19:32:13 -04:00
Melanie Plageman	4c7362c553	Remove unused PruneState member frz_conflict_horizon `c2a23dcf9e` removed use of PruneState.frz_conflict_horizon but neglected to actually remove the member. Do that now.	2026-03-10 18:31:00 -04:00
Heikki Linnakangas	138592d1b0	Don't clear pendingRecoveryConflicts at end of transaction Commit `17f51ea818` introduced a new pendingRecoveryConflicts field in PGPROC to replace the various ProcSignals. The new field was cleared in ProcArrayEndTransaction(), which makes sense for conflicts with e.g. locks or buffer pins which are gone at end of transaction. But it is not appropriate for conflicts on a database, or a logical slot. Because of this, the 035_standby_logical_decoding.pl test was occasionally getting stuck in the buildfarm. It happens if the startup process signals recovery conflict with the logical slot just when the walsender process using the slot calls ProcArrayEndTransaction(). To fix, don't clear pendingRecoveryConflicts in ProcArrayEndTransaction(). We could still clear certain conflict flags, like conflicts on locks, but we didn't try to do that before commit `17f51ea818` either. In the passing, fix a misspelled comment, and make InitAuxiliaryProcess() to also clear pendingRecoveryConflicts. I don't think aux processes can have recovery conflicts, but it seems best to initialize the field and keep InitAuxiliaryProcess() as close to InitProcess() as possible. Analyzed-by: Alexander Lakhin <exclusion@gmail.com> Discussion: https://www.postgresql.org/message-id/3e07149d-060b-48a0-8f94-3d5e4946ae45@gmail.com	2026-03-11 00:06:09 +02:00
Melanie Plageman	c2a23dcf9e	Use the newest to-be-frozen xid as the conflict horizon for freezing Previously WAL records that froze tuples used OldestXmin as the snapshot conflict horizon, or the visibility cutoff if the page would become all-frozen. Both are newer than (or equal to) the newst XID actually frozen on the page. Track the newest XID that will be frozen and use that as the snapshot conflict horizon instead. This yields an older horizon resulting in fewer query cancellations on standbys. Author: Melanie Plageman <melanieplageman@gmail.com> Reviewed-by: Peter Geoghegan <pg@bowt.ie> Discussion: https://postgr.es/m/CAAKRu_bbaUV8OUjAfVa_iALgKnTSfB4gO3jnkfpcFgrxEpSGJQ%40mail.gmail.com	2026-03-10 15:24:39 -04:00
Álvaro Herrera	ac58465e06	Introduce the REPACK command REPACK absorbs the functionality of VACUUM FULL and CLUSTER in a single command. Because this functionality is completely different from regular VACUUM, having it separate from VACUUM makes it easier for users to understand; as for CLUSTER, the term is heavily overloaded in the IT world and even in Postgres itself, so it's good that we can avoid it. We retain those older commands, but de-emphasize them in the documentation, in favor of REPACK; the difference between VACUUM FULL and CLUSTER (namely, the fact that tuples are written in a specific ordering) is neatly absorbed as two different modes of REPACK. This allows us to introduce further functionality in the future that works regardless of whether an ordering is being applied, such as (and especially) a concurrent mode. Author: Antonin Houska <ah@cybertec.at> Reviewed-by: Mihail Nikalayeu <mihailnikalayeu@gmail.com> Reviewed-by: Álvaro Herrera <alvherre@kurilemu.de> Reviewed-by: Robert Treat <rob@xzilla.net> Reviewed-by: Euler Taveira <euler@eulerto.com> Reviewed-by: Matheus Alcantara <matheusssilv97@gmail.com> Reviewed-by: Junwang Zhao <zhjwpku@gmail.com> Reviewed-by: jian he <jian.universality@gmail.com> Discussion: https://postgr.es/m/82651.1720540558@antos Discussion: https://postgr.es/m/202507262156.sb455angijk6@alvherre.pgsql	2026-03-10 19:56:39 +01:00
Masahiko Sawada	a596d27d80	Fix grammar in short description of effective_wal_level. Align with the convention of using third-person singular (e.g., "Shows" instead of "Show") for GUC parameter descriptions. Author: Kyotaro Horiguchi <horikyota.ntt@gmail.com> Discussion: https://postgr.es/m/20260210.143752.1113524465620875233.horikyota.ntt@gmail.com	2026-03-10 11:36:38 -07:00
Andres Freund	f4a4ce52c0	heapam: Don't mimic MarkBufferDirtyHint() in inplace updates Previously heap_inplace_update_and_unlock() used an operation order similar to MarkBufferDirty(), to reduce the number of different approaches used for updating buffers. However, in an upcoming patch, MarkBufferDirtyHint() will switch to using the update protocol used by most other places (enabled by hint bits only being set while holding a share-exclusive lock). Luckily it's pretty easy to adjust heap_inplace_update_and_unlock(). As a comment already foresaw, we can use the normal order, with the slight change of updating the buffer contents after WAL logging. Reviewed-by: Heikki Linnakangas <hlinnaka@iki.fi> Reviewed-by: Noah Misch <noah@leadboat.com> Discussion: https://postgr.es/m/5ubipyssiju5twkb7zgqwdr7q2vhpkpmuelxfpanetlk6ofnop@hvxb4g2amb2d	2026-03-10 11:58:06 -04:00
Álvaro Herrera	a198c26ded	pg_dumpall: simplify coding of dropDBs() There's no need for a StringInfo when all you want is a string being constructed in a single pass. Author: Álvaro Herrera <alvherre@kurilemu.de> Reported-by: Ranier Vilela <ranier.vf@gmail.com> Reviewed-by: Yang Yuanzhuo <1197620467@qq.com> Reviewed-by: Michael Paquier <michael@paquier.xyz> Reviewed-by: Andrew Dunstan <andrew@dunslane.net> Discussion: https://postgr.es/m/CAEudQAq2wyXZRdsh+wVHcOrungPU+_aQeQU12wbcgrmE0bQovA@mail.gmail.com	2026-03-10 16:00:19 +01:00
Fujii Masao	59bae23435	Remove duplicate initialization in initialize_brin_buildstate(). Commit `dae761a` added initialization of some BrinBuildState fields in initialize_brin_buildstate(). Later, commit `b437571` inadvertently added the same initialization again. This commit removes that redundant initialization. No behavioral change is intended. Author: Chao Li <lic@highgo.com> Reviewed-by: Shinya Kato <shinya11.kato@gmail.com> Discussion: https://postgr.es/m/CAEoWx2nmrca6-9SNChDvRYD6+r==fs9qg5J93kahS7vpoq8QVg@mail.gmail.com	2026-03-10 22:55:11 +09:00
Peter Eisentraut	8080f44f96	Rename grammar nonterminal to simplify reuse A list of expressions with optional AS-labels is useful in a few different places. Right now, this is available as xml_attribute_list because it was first used in the XMLATTRIBUTES construct, but it is already used elsewhere, and there are other possible future uses. To reduce possible confusion going forward, rename it to labeled_expr_list (like existing expr_list plus ColLabel). Discussion: https://www.postgresql.org/message-id/flat/a855795d-e697-4fa5-8698-d20122126567@eisentraut.org	2026-03-10 14:09:09 +01:00
Robert Haas	0fbfd37cef	Allow extensions to mark an individual index as disabled. Up until now, the only way for a loadable module to disable the use of a particular index was to use build_simple_rel_hook (or, previous to yesterday's commit, get_relation_info_hook) to remove it from the index list. While that works, it has some disadvantages. First, the index becomes invisible for all purposes, and can no longer be used for optimizations such as self-join elimination or left join removal, which can severely degrade the resulting plan. Second, if the module attempts to compel the use of a certain index by removing all other indexes from the index list and disabling other scan types, but the planner is unable to use the chosen index for some reason, it will fall back to a sequential scan, because that is only disabled, whereas the other indexes are, from the planner's point of view, completely gone. While this situation ideally shouldn't occur, it's hard for a loadable module to be completely sure whether the planner will view a certain index as usable for a certain query. If it isn't, it may be better to fall back to a scan using a disabled index rather than falling back to an also-disabled sequential scan. Reviewed-by: Alexandra Wang <alexandra.wang.oss@gmail.com> Discussion: http://postgr.es/m/CA%2BTgmoYS4ZCVAF2jTce%3DbMP0Oq_db_srocR4cZyO0OBp9oUoGg%40mail.gmail.com	2026-03-10 08:33:55 -04:00
Michael Paquier	03facc1211	Switch to FATAL error for missing checkpoint record without backup_label Crash recovery started without a backup_label previously crashed with a PANIC if the checkpoint record could not be found. This commit lowers the report generated to be a FATAL instead. With recovery methods being more imaginative these days, this should provide more flexibility when handling PostgreSQL recovery processing in the event of a driver error, similarly to `15f68cebdc`. An extra benefit of this change is that it becomes possible to add a test to check that a FATAL is hit with an expected error message pattern. With the recovery code becoming more complicated over the last couple of years, I suspect that this will be benefitial to cover in the long-term. The original PANIC behavior has been introduced in the early days of crash recovery, as of `4d14fe0048` (PANIC did not exist yet, the code used STOP). Author: Nitin Jadhav <nitinjadhavpostgres@gmail.com> Discussion: https://postgr.es/m/CAMm1aWZbQ-Acp_xAxC7mX9uZZMH8+NpfepY9w=AOxbBVT9E=uA@mail.gmail.com	2026-03-10 12:00:05 +09:00
Michael Paquier	6307b096e2	Fix misuse of "volatile" in xml.c What should be used is not "volatile foo ptr" but "foo volatile ptr", The incorrect (former) style means that what the pointer variable points to is volatile. The correct (latter) style means that the pointer variable itself needs to be treated as volatile. The latter style is required to ensure a consistent treatment of these variables after a longjmp with the TRY/CATCH blocks. Some casts can be removed thanks to this change. Issue introduced by `2e94721747`, so no backpatch is required. A similar set of issues has been fixed in `93001888d8` for contrib/xml2/. Author: ChangAo Chen <cca5507@qq.com> Discussion: https://postgr.es/m/tencent_5BE8DAD985EE140ED62EA728C8D4E1311F0A@qq.com	2026-03-10 07:05:32 +09:00
Nathan Bossart	7c8280eeb5	pg_{dump,restore}: Refactor handling of conflicting options. This commit makes use of the function added by commit `b2898baaf7` for these applications' handling of conflicting options. It doesn't fix any bugs, but it does trim several lines of code. Author: Jian He <jian.universality@gmail.com> Reviewed-by: Steven Niu <niushiji@gmail.com> Reviewed-by: Zsolt Parragi <zsolt.parragi@percona.com> Discussion: https://postgr.es/m/CACJufxHDYn%2B3-2jR_kwYB0U7UrNP%2B0EPvAWzBBD5EfUzzr1uiw%40mail.gmail.com	2026-03-09 11:37:46 -05:00
Robert Haas	91f33a2ae9	Replace get_relation_info_hook with build_simple_rel_hook. For a long time, PostgreSQL has had a get_relation_info_hook which plugins can use to editorialize on the information that get_relation_info obtains from the catalogs. However, this hook is only called for baserels of type RTE_RELATION, and there is potential utility in a similar call back for other types of RTEs. This might have had utility even before commit `4020b370f2` added pgs_mask to RelOptInfo, but it certainly has utility now. So, move the callback up one level, deleting get_relation_info_hook and adding build_simple_rel_hook instead. The new callback is called just slightly later than before and with slightly different arguments, but it should be fairly straightforward to adjust existing code that currently uses get_relation_info_hook: the values previously available as relationObjectId and inhparent are now available via rte->relid and rte->inh, and calls where rte->rtekind != RTE_RELATION can be ignored if desired. Reviewed-by: Alexandra Wang <alexandra.wang.oss@gmail.com> Discussion: http://postgr.es/m/CA%2BTgmoYg8uUWyco7Pb3HYLMBRQoO6Zh9hwgm27V39Pb6Pdf%3Dug%40mail.gmail.com	2026-03-09 09:48:26 -04:00
Robert Haas	8300d3ad4a	Consider startup cost as a figure of merit for partial paths. Previously, the comments stated that there was no purpose to considering startup cost for partial paths, but this is not the case: it's perfectly reasonable to want a fast-start path for a plan that involves a LIMIT (perhaps over an aggregate, so that there is enough data being processed to justify parallel query but yet we don't want all the result rows). Accordingly, rewrite add_partial_path and add_partial_path_precheck to consider startup costs. This also fixes an independent bug in add_partial_path_precheck: commit `e222534679` failed to update it to do anything with the new disabled_nodes field. That bug fix is formally separate from the rest of this patch and could be committed separately, but I think it makes more sense to fix both issues together, because then we can (as this commit does) just make add_partial_path_precheck do the cost comparisons in the same way as compare_path_costs_fuzzily, which hopefully reduces the chances of ending up with something that's still incorrect. This patch is based on earlier work on this topic by Tomas Vondra, but I have rewritten a great deal of it. Co-authored-by: Robert Haas <rhaas@postgresql.org> Co-authored-by: Tomas Vondra <tomas@vondra.me> Discussion: http://postgr.es/m/CA+TgmobRufbUSksBoxytGJS1P+mQY4rWctCk-d0iAUO6-k9Wrg@mail.gmail.com	2026-03-09 08:16:30 -04:00
Robert Haas	ffc226ab64	Prevent restore of incremental backup from bloating VM fork. When I (rhaas) wrote the WAL summarizer code, I incorrectly believed that XLOG_SMGR_TRUNCATE truncates all forks to the same length. In fact, what other parts of the code do is compute the truncation length for the FSM and VM forks from the truncation length used for the main fork. But, because I was confused, I coded the WAL summarizer to set the limit block for the VM fork to the same value as for the main fork. (Incremental backup always copies FSM forks in full, so there is no similar issue in that case.) Doing that doesn't directly cause any data corruption, as far as I can see. However, it does create a serious risk of consuming a large amount of extra disk space, because pg_combinebackup's reconstruct.c believes that the reconstructed file should always be at least as long as the limit block value. We might want to be smarter about that at some point in the future, because it's always safe to omit all-zeroes blocks at the end of the last segment of a relation, and doing so could save disk space, but the current algorithm will rarely waste enough disk space to worry about unless we believe that a relation has been truncated to a length much longer than its actual length on disk, which is exactly what happens as a result of the problem mentioned in the previous paragraph. To fix, create a new visibilitymap helper function and use it to include the right limit block in the summary files. Incremental backups taken with existing summary files will still have this issue, but this should improve the situation going forward. Diagnosed-by: Oleg Tkachenko <oatkachenko@gmail.com> Diagnosed-by: Amul Sul <sulamul@gmail.com> Discussion: http://postgr.es/m/CAAJ_b97PqG89hvPNJ8cGwmk94gJ9KOf_pLsowUyQGZgJY32o9g@mail.gmail.com Discussion: http://postgr.es/m/6897DAF7-B699-41BF-A6FB-B818FCFFD585%40gmail.com Backpatch-through: 17	2026-03-09 06:45:32 -04:00
Amit Kapila	06d8302262	Remove trailing period from errmsg in subscriptioncmds.c. Author: Sahitya Chandra <sahityajb@gmail.com> Discussion: https://postgr.es/m/20260308142806.181309-1-sahityajb@gmail.com	2026-03-09 15:10:03 +05:30
Fujii Masao	173aa8c5e8	doc: Document IF NOT EXISTS option for ALTER FOREIGN TABLE ADD COLUMN. Commit `2cd40adb85` added the IF NOT EXISTS option to ALTER TABLE ADD COLUMN. This also enabled IF NOT EXISTS for ALTER FOREIGN TABLE ADD COLUMN, but the ALTER FOREIGN TABLE documentation was not updated to mention it. This commit updates the documentation to describe the IF NOT EXISTS option for ALTER FOREIGN TABLE ADD COLUMN. While updating that section, also this commit clarifies that the COLUMN keyword is optional in ALTER FOREIGN TABLE ADD/DROP COLUMN. Previously, part of the documentation could be read as if COLUMN were required. This commit adds regression tests covering these ALTER FOREIGN TABLE syntaxes. Backpatch to all supported versions. Suggested-by: Fujii Masao <masao.fujii@gmail.com> Author: Chao Li <lic@highgo.com> Reviewed-by: Robert Treat <rob@xzilla.net> Reviewed-by: Fujii Masao <masao.fujii@gmail.com> Discussion: https://postgr.es/m/CAHGQGwFk=rrhrwGwPtQxBesbT4DzSZ86Q3ftcwCu3AR5bOiXLw@mail.gmail.com Backpatch-through: 14	2026-03-09 18:23:36 +09:00
Michael Paquier	4da2afd01f	Fix size underestimation of DSA pagemap for odd-sized segments When make_new_segment() creates an odd-sized segment, the pagemap was only sized based on a number of usable_pages entries, forgetting that a segment also contains metadata pages, and that the FreePageManager uses absolute page indices that cover the entire segment. This miscalculation could cause accesses to pagemap entries to be out of bounds. During subsequent reuse of the allocated segment, allocations landing on pages with indices higher than usable_pages could cause out-of-bounds pagemap reads and/or writes. On write, 'span' pointers are stored into the data area, corrupting the allocated objects. On read (aka during a dsa_free), garbage is interpreted as a span pointer, typically crashing the server in dsa_get_address(). The normal geometric path correctly sizes the pagemap for all pages in the segment. The odd-sized path needs to do the same, but it works forward from usable_pages rather than backward from total_size. This commit fixes the sizing of the odd-sized case by adding pagemap entries for the metadata pages after the initial metadata_bytes calculation, using an integer ceiling division to compute the exact number of additional entries needed in one go, avoiding any iteration in the calculation. An assertion is added in the code path for odd-sized segments, ensuring that the pagemap includes the metadata area, and that the result is appropriately sized. This problem would show up depending on the size requested for the allocation of a DSA segment. The reporter has noticed this issue when a parallel hash join makes a DSA allocation large enough to trigger the odd-sized segment path, but it could happen for anything that does a DSA allocation. A regression test is added to test_dsa, down to v17 where the test module has been introduced. This adds a set of cheap tests to check the problem, the new assertion being useful for this purpose. Sami has proposed a test that took a longer time than what I have done here; the test committed is faster and good enough to check the odd-sized allocation path. Author: Paul Bunn <paul.bunn@icloud.com> Reviewed-by: Sami Imseih <samimseih@gmail.com> Reviewed-by: Chao Li <li.evan.chao@gmail.com> Reviewed-by: Michael Paquier <michael@paquier.xyz> Discussion: https://postgr.es/m/044401dcabac$fe432490$fac96db0$@icloud.com Backpatch-through: 14	2026-03-09 13:46:27 +09:00
Michael Paquier	ccd7abaa45	Refactor tests for catalog diff comparisons in stats_import.sql The tests of stats_import.sql include a set of queries to do differential checks of the three statistics catalog relations, based on the comparison of a source relation and a target relation, used for the copy of the stats data with the restore functions: - pg_statistic - pg_stats_ext - pg_stats_ext_exprs This commit refactors the tests to reduce the bloat of such differential queries, by creating a set of objects that make the differential queries smaller: - View for a base relation type. - First function to retrieve stats data, that returns a type based on the view previously created. - Second function that checks the difference, based on two calls of the first function. This change leads to a nice reduction of stats_import.sql, with a larger effect on the output file. While on it, this adds some sanity checks for the three catalogs, to warn developers that the stats import facilities may need to be updated if any of the three catalogs change. These are rare in practice, see `918eee0c49` as one example. Another stylistic change is the use of the extended output format for the differential queries, so as we avoid long lines of output if a diff is caught. Author: Corey Huinker <corey.huinker@gmail.com> Discussion: https://postgr.es/m/CADkLM=eEhxJpSUP+eC=eMGZZsVOpnfKDvVkuCbsFg9CajYwDsA@mail.gmail.com	2026-03-09 08:46:06 +09:00
Michael Paquier	9e8193a262	Fix typo in stats_import.sql The test mentioned pg_stat_ext_exprs, but the correct catalog name is pg_stats_ext_exprs. Thinko in `ba97bf9cb7`. Discussion: https://postgr.es/m/CADkLM=eEhxJpSUP+eC=eMGZZsVOpnfKDvVkuCbsFg9CajYwDsA@mail.gmail.com	2026-03-09 07:15:26 +09:00
Álvaro Herrera	eb2c867b0a	Fix invalid boolean if-test We were testing the truth value of the array of booleans (which is always true) instead of the boolean element specific to the affected table column. This causes a binary-upgrade dump fail to omit the name of a constraint; that is, the correct constraint name is always printed, even when it's not needed. The affected case is a binary-upgrade dump of a not-null constraint in an inherited column, which must in addition have no comment. Another point is that in order for this to make a difference, the constraint must have the default name in the child table. That is, the constraint must have been created _in the parent table_ with the name that it would have in the child table, like so: CREATE TABLE parent (a int CONSTRAINT child_a_not_null NOT NULL); CREATE TABLE child () INHERITS (parent); Otherwise, the correct name must be printed by binary-upgrade pg_dump anyway, since it wouldn't match the name produced at the parent. Moreover, when it does hit, the pre-18-compatibility code (which has to work with a constraint that has no name) gets involved and uses the UPDATE on pg_constraint using the conkey instead of column name ... and so everything ends up working correctly AFAICS. I think it might cause a problem if the table and column names are overly long, but I didn't want to spend time investigating further. Still, it's wrong code, and static analyzers have twice complained about it, so fix it by adding the array index accessor that was obviously meant. Reported-by: Ranier Vilela <ranier.vf@gmail.com> Reported-by: George Tarasov <george.v.tarasov@gmail.com> Backpatch-through: 18 Discussion: https://postgr.es/m/CAEudQAo7ah=4TDheuEjtb0dsv6bHoK7uBNqv53Tsub2h-xBSJw@mail.gmail.com Discussion: https://postgr.es/m/f3029f25-acc9-4cb9-a74f-fe93bcfb3a27@gmail.com	2026-03-07 14:28:16 +01:00
Jacob Champion	e982331b52	libpq: Introduce PQAUTHDATA_OAUTH_BEARER_TOKEN_V2 For the libpq-oauth module to eventually make use of the PGoauthBearerRequest API, it needs some additional functionality: the derived Issuer ID for the authorization server needs to be provided, and error messages need to be built without relying on PGconn internals. These features seem useful for application hooks, too, so that they don't each have to reinvent the wheel. The original plan was for additions to PGoauthBearerRequest to be made without a version bump to the PGauthData type. Applications would simply check a LIBPQ_HAS_* macro at compile time to decide whether they could use the new features. That theoretically works for applications linked against libpq, since it's not safe to downgrade libpq from the version you've compiled against. We've since found that this strategy won't work for plugins, due to a complication first noticed during the libpq-oauth module split: it's normal for a plugin on disk to be newer than the libpq that's loading it, because you might have upgraded your installation while an application was running. (In other words, a plugin architecture causes the compile-time and run-time dependency arrows to point in opposite directions, so plugins won't be able to rely on the LIBPQ_HAS_* macros to determine what APIs are available to them.) Instead, extend the original PGoauthBearerRequest (now retroactively referred to as "v1" in the code) with a v2 subclass-style struct. When an application implements and accepts PQAUTHDATA_OAUTH_BEARER_TOKEN_V2, it may safely cast the base request pointer it receives in its callbacks to v2 in order to make use of the new functionality. libpq will query the application for a v2 hook first, then v1 to maintain backwards compatibility, before giving up and using the builtin flow. Reviewed-by: Chao Li <li.evan.chao@gmail.com> Reviewed-by: Zsolt Parragi <zsolt.parragi@percona.com> Discussion: https://postgr.es/m/CAOYmi%2BmrGg%2Bn_X2MOLgeWcj3v_M00gR8uz_D7mM8z%3DdX1JYVbg%40mail.gmail.com	2026-03-06 12:05:51 -08:00
Nathan Bossart	b2898baaf7	pg_dumpall: Fix handling of conflicting options. pg_dumpall is missing checks for some conflicting options, including those passed through to pg_dump. To fix, introduce a new function that checks whether mutually exclusive options are set, and use that in pg_dumpall. A similar change could likely be made for pg_dump and pg_restore, but that is left as a future exercise. This is arguably a bug fix, but since this might break existing scripts, no back-patch for now. Author: Jian He <jian.universality@gmail.com> Co-authored-by: Nathan Bossart <nathandbossart@gmail.com> Reviewed-by: Wang Peng <215722532@qq.com> Reviewed-by: Zsolt Parragi <zsolt.parragi@percona.com> Reviewed-by: Chao Li <li.evan.chao@gmail.com> Discussion: https://postgr.es/m/CACJufxFf5%3DwSv2MsuO8iZOvpLZQ1-meAMwhw7JX5gNvWo5PDug%40mail.gmail.com	2026-03-06 14:00:04 -06:00
Masahiko Sawada	50ea4e09b6	Use palloc_object() and palloc_array() in more areas of the logical replication. The idea is to encourage the use of newer routines across the tree, as these offer stronger type-safety guarantees than raw palloc(). Similar work has been done in commits `1b105f9472`, `0c3c5c3b06`, `31d3847a37`, and `4f7dacc5b8`. This commit extends those changes to more locations within src/backend/replication/logical/. Author: Peter Smith <smithpb2250@gmail.com> Reviewed-by: Chao Li <li.evan.chao@gmail.com> Reviewed-by: Hayato Kuroda <kuroda.hayato@fujitsu.com> Reviewed-by: Masahiko Sawada <sawada.mshk@gmail.com> Discussion: https://postgr.es/m/CAHut+Pv4N7Vpxo18+NAR1r9RGvR8b0BtwTkoeCE2PfFoXgmR6A@mail.gmail.com	2026-03-06 10:49:50 -08:00
Tom Lane	415100aa62	Support grouping-expression references and GROUPING() in subqueries. Until now, substitute_grouped_columns and its predecessor check_ungrouped_columns intentionally did not cope with references to GROUP BY expressions (anything more complex than a Var) within subqueries of the query having GROUP BY. Because they didn't try to match subexpressions of subqueries to the GROUP BY list, they'd drill down to raw Vars of the grouping level and then fail with "subquery uses ungrouped column from outer query". There have been remarkably few complaints about this deficiency, so nobody ever did anything about it. The reason for not wanting to deal with it is that within a subquery, Vars will have varlevelsup different from zero and will thus not be equal() to the expressions seen in the outer query. We recognized this at least as far back as `96ca8ffeb`, although I think the comment I added about it then was just documenting a pre-existing deficiency. It looks like at the time, the solutions I considered were (1) write a version of equal() that permits an offset in varlevelsup, or (2) dynamically apply IncrementVarSublevelsUp at each subexpression. (1) would require an amount of new code that seems rather out of proportion to the benefit, while (2) would add an exponential amount of cost to the matching process. But rethinking it now, what seems attractive is (3) apply IncrementVarSublevelsUp to the groupingClause list not the subexpressions, and do so only once per subquery depth level. Then we can still use plain equal() to check for matches, and we're not incurring cost proportional to some power of the subquery's complexity. This patch continues to use the old logic when the GROUP BY list is all Vars. We could discard the special comparison logic for that and always do it the more general way, but that would be a good deal slower. (Micro-benchmarking just parse analysis suggests it's about 50% slower than the Vars-only path. But we've not heard complaints about the speed of matching within the main query, so I doubt that applying the same matching logic within subqueries will be a problem.) The lack of complaints suggests strongly that this is a very minority use-case, so I don't want to make the typical case slower to fix it. While testing that, I was surprised to discover a nearby bug: GROUPING() within a subquery fails to match GROUP BY Vars that are join alias Vars. It tries to apply flatten_join_alias_vars to make such cases work, but that fails to work inside a subquery because varlevelsup is wrong. Therefore, this patch invents a new entry point flatten_join_alias_for_parser() that allows specification of a sublevels_up offset. (It seems cleaner to give the parser its own entry point rather than abuse the planner's conventions even further.) While this is pretty clearly a bug fix, I'm hesitant to take the risk of back-patching, seeing that the existing behavior has stood for so long with so few complaints. Maybe we can reconsider once this patch has baked awhile in master. Reported-by: PALAYRET Jacques <jacques.palayret@meteo.fr> Author: Tom Lane <tgl@sss.pgh.pa.us> Discussion: https://postgr.es/m/531183.1772058731@sss.pgh.pa.us	2026-03-06 13:40:55 -05:00
Jeff Davis	8185bb5347	CREATE SUBSCRIPTION ... SERVER. Allow CREATE SUBSCRIPTION to accept a foreign server using the SERVER clause instead of a raw connection string using the CONNECTION clause. * Enables a user with sufficient privileges to create a subscription using a foreign server by name without specifying the connection details. * Integrates with user mappings (and other FDW infrastructure) using the subscription owner. * Provides a layer of indirection to manage multiple subscriptions to the same remote server more easily. Also add CREATE FOREIGN DATA WRAPPER ... CONNECTION clause to specify a connection_function. To be eligible for a subscription, the foreign server's foreign data wrapper must specify a connection_function. Add connection_function support to postgres_fdw, and bump postgres_fdw version to 1.3. Bump catversion. Reviewed-by: Ashutosh Bapat <ashutosh.bapat.oss@gmail.com> Reviewed-by: Shlok Kyal <shlok.kyal.oss@gmail.com> Reviewed-by: Masahiko Sawada <sawada.mshk@gmail.com> Reviewed-by: Amit Kapila <amit.kapila16@gmail.com> Discussion: https://postgr.es/m/61831790a0a937038f78ce09f8dd4cef7de7456a.camel@j-davis.com	2026-03-06 08:27:56 -08:00

1 2 3 4 5 ...

47322 commits