postgresql

mirror of https://github.com/postgres/postgres.git synced 2026-04-26 08:37:12 -04:00

Author	SHA1	Message	Date
Daniel Gustafsson	970cd5c62b	Fix macro placement in pg_config.h.in Commit `274bbced85` accidentally placed the pg_config.h.in for SSL_CTX_set_num_tickets on the wrong line wrt where autoheader places it. Fix by re-arranging and backpatch to the same level as the original commit. Reported-by: Marina Polyakova <m.polyakova@postgrespro.ru> Discussion: https://postgr.es/m/48cebe8c3eaf308bae253b1dbf4e4a75@postgrespro.ru Backpatch-through: v12	2024-07-26 14:16:40 +02:00
Daniel Gustafsson	118ec331bf	Disable all TLS session tickets OpenSSL supports two types of session tickets for TLSv1.3, stateless and stateful. The option we've used only turns off stateless tickets leaving stateful tickets active. Use the new API introduced in 1.1.1 to disable all types of tickets. Backpatch to all supported versions. Reviewed-by: Heikki Linnakangas <hlinnaka@iki.fi> Reported-by: Andres Freund <andres@anarazel.de> Discussion: https://postgr.es/m/20240617173803.6alnafnxpiqvlh3g@awork3.anarazel.de Backpatch-through: v12	2024-07-26 11:09:45 +02:00
Melanie Plageman	dc6354c670	Ensure vacuum removes all visibly dead tuples older than OldestXmin If vacuum fails to remove a tuple with xmax older than VacuumCutoffs->OldestXmin and younger than GlobalVisState->maybe_needed, it will loop infinitely in lazy_scan_prune(), which compares tuples' visibility information to OldestXmin. Starting in version 14, which uses GlobalVisState for visibility testing during pruning, it is possible for GlobalVisState->maybe_needed to precede OldestXmin if maybe_needed is forced to go backward while vacuum is running. This can happen if a disconnected standby with a running transaction older than VacuumCutoffs->OldestXmin reconnects to the primary after vacuum initially calculates GlobalVisState and OldestXmin. Fix this by having vacuum always remove tuples older than OldestXmin during pruning. This is okay because the standby won't replay the tuple removal until the tuple is removable. Thus, the worst that can happen is a recovery conflict. Fixes BUG# 17257 Back-patched in versions 14-17 Author: Melanie Plageman Reviewed-by: Noah Misch, Peter Geoghegan, Robert Haas, Andres Freund, and Heikki Linnakangas Discussion: https://postgr.es/m/CAAKRu_Y_NJzF4-8gzTTeaOuUL3CcGoXPjXcAHbTTygT8AyVqag%40mail.gmail.com	2024-07-19 12:05:51 -04:00
Masahiko Sawada	aee8c2b954	Fix possibility of logical decoding partial transaction changes. When creating and initializing a logical slot, the restart_lsn is set to the latest WAL insertion point (or the latest replay point on standbys). Subsequently, WAL records are decoded from that point to find the start point for extracting changes in the DecodingContextFindStartpoint() function. Since the initial restart_lsn could be in the middle of a transaction, the start point must be a consistent point where we won't see the data for partial transactions. Previously, when not building a full snapshot, serialized snapshots were restored, and the SnapBuild jumps to the consistent state even while finding the start point. Consequently, the slot's restart_lsn and confirmed_flush could be set to the middle of a transaction. This could lead to various unexpected consequences. Specifically, there were reports of logical decoding decoding partial transactions, and assertion failures occurred because only subtransactions were decoded without decoding their top-level transaction until decoding the commit record. To resolve this issue, the changes prevent restoring the serialized snapshot and jumping to the consistent state while finding the start point. On v17 and HEAD, a flag indicating whether snapshot restores should be skipped has been added to the SnapBuild struct, and SNAPBUILD_VERSION has been bumpded. On backbranches, the flag is stored in the LogicalDecodingContext instead, preserving on-disk compatibility. Backpatch to all supported versions. Reported-by: Drew Callahan Reviewed-by: Amit Kapila, Hayato Kuroda Discussion: https://postgr.es/m/2444AA15-D21B-4CCE-8052-52C7C2DAFE5C%40amazon.com Backpatch-through: 12	2024-07-11 22:48:16 +09:00
Thomas Munro	467d77bb16	Cope with <regex.h> name clashes. macOS 15's SDK pulls in headers related to <regex.h> when we include <xlocale.h>. This causes our own regex_t implementation to clash with the OS's regex_t implementation. Luckily our function names already had pg_ prefixes, but the macros and typenames did not. Include <regex.h> explicitly on all POSIX systems, and fix everything that breaks. Then we can prove that we are capable of fully hiding and replacing the system regex API with our own. 1. Deal with standard-clobbering macros by undefining them all first. POSIX says they are "symbolic constants". If they are macros, this allows us to redefine them. If they are enums or variables, our macros will hide them. 2. Deal with standard-clobbering types by giving our types pg_ prefixes, and then using macros to redirect xxx_t -> pg_xxx_t. After including our "regex/regex.h", the system <regex.h> is hidden, because we've replaced all the standard names. The PostgreSQL source tree and extensions can continue to use standard prefix-less type and macro names, but reach our implementation, if they included our "regex/regex.h" header. Back-patch to all supported branches, so that macOS 15's tool chain can build them. Reported-by: Stan Hu <stanhu@gmail.com> Suggested-by: Tom Lane <tgl@sss.pgh.pa.us> Tested-by: Aleksander Alekseev <aleksander@timescale.com> Discussion: https://postgr.es/m/CAMBWrQnEwEJtgOv7EUNsXmFw2Ub4p5P%2B5QTBEgYwiyjy7rAsEQ%40mail.gmail.com	2024-07-06 10:53:13 +12:00
Noah Misch	2ca8ca4827	Remove comment about xl_heap_inplace "AT END OF STRUCT". Commit `2c03216d83` moved the tuple data from there to the buffer-0 data. Back-patch to v12 (all supported versions), the plan for the next change to this struct. Discussion: https://postgr.es/m/20240523000548.58.nmisch@google.com	2024-06-27 19:21:11 -07:00
Noah Misch	b08a4b6163	Cope with inplace update making catcache stale during TOAST fetch. This extends `ad98fb1422` to invals of inplace updates. Trouble requires an inplace update of a catalog having a TOAST table, so only pg_database was at risk. (The other catalog on which core code performs inplace updates, pg_class, has no TOAST table.) Trouble would require something like the inplace-inval.spec test. Consider GRANT ... ON DATABASE fetching a stale row from cache and discarding a datfrozenxid update that vac_truncate_clog() has already relied upon. Back-patch to v12 (all supported versions). Reviewed (in an earlier version) by Robert Haas. Discussion: https://postgr.es/m/20240114201411.d0@rfd.leadboat.com Discussion: https://postgr.es/m/20240512232923.aa.nmisch@google.com	2024-06-27 19:21:11 -07:00
Noah Misch	a338e41374	Lock before setting relhassubclass on RELKIND_PARTITIONED_INDEX. Commit `5b562644fe` added a comment that SetRelationHasSubclass() callers must hold this lock. When commit `17f206fbc8` extended use of this column to partitioned indexes, it didn't take the lock. As the latter commit message mentioned, we currently never reset a partitioned index to relhassubclass=f. That largely avoids harm from the lock omission. The cause for fixing this now is to unblock introducing a rule about locks required to heap_update() a pg_class row. This might cause more deadlocks. It gives minor user-visible benefits: - If an ALTER INDEX SET TABLESPACE runs concurrently with ALTER TABLE ATTACH PARTITION or CREATE PARTITION OF, one transaction blocks instead of failing with "tuple concurrently updated". (Many cases of DDL concurrency still fail that way.) - Match ALTER INDEX ATTACH PARTITION in choosing to lock the index. While not user-visible today, we'll need this if we ever make something set the flag to false for a partitioned index, like ANALYZE does today for tables. Back-patch to v12 (all supported versions), the plan for the commit relying on the new rule. In back branches, add LockOrStrongerHeldByMe() instead of adding a LockHeldByMe() parameter. Reviewed (in an earlier version) by Robert Haas. Discussion: https://postgr.es/m/20240611024525.9f.nmisch@google.com	2024-06-27 19:21:10 -07:00
Heikki Linnakangas	0e2f3d78bf	Fix MVCC bug with prepared xact with subxacts on standby We did not recover the subtransaction IDs of prepared transactions when starting a hot standby from a shutdown checkpoint. As a result, such subtransactions were considered as aborted, rather than in-progress. That would lead to hint bits being set incorrectly, and the subtransactions suddenly becoming visible to old snapshots when the prepared transaction was committed. To fix, update pg_subtrans with prepared transactions's subxids when starting hot standby from a shutdown checkpoint. The snapshots taken from that state need to be marked as "suboverflowed", so that we also check the pg_subtrans. Backport to all supported versions. Discussion: https://www.postgresql.org/message-id/6b852e98-2d49-4ca1-9e95-db419a2696e0@iki.fi	2024-06-27 21:10:31 +03:00
Tom Lane	f550833193	Fix insertion of SP-GiST REDIRECT tuples during REINDEX CONCURRENTLY. Reconstruction of an SP-GiST index by REINDEX CONCURRENTLY may insert some REDIRECT tuples. This will typically happen in a transaction that lacks an XID, which leads either to assertion failure in spgFormDeadTuple or to insertion of a REDIRECT tuple with zero xid. The latter's not good either, since eventually VACUUM will apply GlobalVisTestIsRemovableXid() to the zero xid, resulting in either an assertion failure or a garbage answer. In practice, since REINDEX CONCURRENTLY locks out index scans till it's done, it doesn't matter whether it inserts REDIRECTs or PLACEHOLDERs; and likewise it doesn't matter how soon VACUUM reduces such a REDIRECT to a PLACEHOLDER. So in non-assert builds there's no observable problem here, other than perhaps a little index bloat. But it's not behaving as intended. To fix, remove the failing Assert in spgFormDeadTuple, acknowledging that we might sometimes insert a zero XID; and guard VACUUM's GlobalVisTestIsRemovableXid() call with a test for valid XID, ensuring that we'll reduce such a REDIRECT the first time VACUUM sees it. (Versions before v14 use TransactionIdPrecedes here, which won't fail on zero xid, so they really have no bug at all in non-assert builds.) Another solution could be to not create REDIRECTs at all during REINDEX CONCURRENTLY, making the relevant code paths treat that case like index build (which likewise knows that no concurrent index scans can be happening). That would allow restoring the Assert in spgFormDeadTuple, but we'd still need the VACUUM change because redirection tuples with zero xid may be out there already. But there doesn't seem to be a nice way for spginsert() to tell that it's being called in REINDEX CONCURRENTLY without some API changes, so we'll leave that as a possible future improvement. In HEAD, also rename the SpGistState.myXid field to redirectXid, which seems less misleading (since it might not in fact be our transaction's XID) and is certainly less uninformatively generic. Per bug #18499 from Alexander Lakhin. Back-patch to all supported branches. Discussion: https://postgr.es/m/18499-8a519c280f956480@postgresql.org	2024-06-17 14:30:59 -04:00
Tom Lane	4ac385adc5	Account for optimized MinMax aggregates during SS_finalize_plan. We are capable of optimizing MIN() and MAX() aggregates on indexed columns into subqueries that exploit the index, rather than the normal thing of scanning the whole table. When we do this, we replace the Aggref node(s) with Params referencing subquery outputs. Such Params really ought to be included in the per-plan-node extParam/allParam sets computed by SS_finalize_plan. However, we've never done so up to now because of an ancient implementation choice to perform that substitution during set_plan_references, which runs after SS_finalize_plan, so that SS_finalize_plan never sees these Params. The cleanest fix would be to perform a separate tree walk to do these substitutions before SS_finalize_plan runs. That seems unattractive, first because a whole-tree mutation pass is expensive, and second because we lack infrastructure for visiting expression subtrees in a Plan tree, so that we'd need a new function knowing as much as SS_finalize_plan knows about that. I also considered swapping the order of SS_finalize_plan and set_plan_references, but that fell foul of various assumptions that seem tricky to fix. So the approach adopted here is to teach SS_finalize_plan itself to check for such Aggrefs. I refactored things a bit in setrefs.c to avoid having three copies of the code that does that. Back-patch of v17 commits `d0d44049d` and `779ac2c74`. When `d0d44049d` went in, there was no evidence that it was fixing a reachable bug, so I refrained from back-patching. Now we have such evidence. Per bug #18465 from Hal Takahara. Back-patch to all supported branches. Discussion: https://postgr.es/m/18465-2fae927718976b22@postgresql.org Discussion: https://postgr.es/m/2391880.1689025003@sss.pgh.pa.us	2024-05-18 14:31:35 -04:00
David Rowley	52f21f9287	Ensure we allocate NAMEDATALEN bytes for names in Index Only Scans As an optimization, we store "name" columns as cstrings in btree indexes. Here we modify it so that Index Only Scans convert these cstrings back to names with NAMEDATALEN bytes rather than storing the cstring in the tuple slot, as was happening previously. Bug: #17855 Reported-by: Alexander Lakhin Reviewed-by: Alexander Lakhin, Tom Lane Discussion: https://postgr.es/m/17855-5f523e0f9769a566@postgresql.org Backpatch-through: 12, all supported versions	2024-05-01 13:22:16 +12:00
Andres Freund	dcb7cf945c	simplehash: Free collisions array in SH_STAT While SH_STAT() is only used for debugging, the allocated array can be large, and therefore should be freed. It's unclear why coverity started warning now. Reported-by: Tom Lane <tgl@sss.pgh.pa.us> Reported-by: Coverity Discussion: https://postgr.es/m/3005248.1712538233@sss.pgh.pa.us Backpatch: 12-	2024-04-07 19:09:04 -07:00
Alexander Korotkov	c2faf48fa3	Fix the parameters order for TableAmRoutine.relation_copy_for_cluster() Specify OldTable first, NewTable second as used by table_relation_copy_for_cluster() and as implemented in heapam_relation_copy_for_cluster(). Backpatch to PostgreSQL 12, where TableAmRoutine was introduced. Discussion: https://postgr.es/m/ME3P282MB3166860D4911AE82F92DF7C5B63F2%40ME3P282MB3166.AUSP282.PROD.OUTLOOK.COM Author: Japin Li Reviewed-by: Pavel Borisov Backpatch-through: 12	2024-04-03 21:31:06 +03:00
Tom Lane	4fb56a734d	Avoid deadlock during orphan temp table removal. If temp tables have dependencies (such as sequences) then it's possible for autovacuum's cleanup of orphan temp tables to deadlock against an incoming backend that's trying to clean out the temp namespace for its own use. That can happen because RemoveTempRelations' performDeletion call can visit objects within the namespace in an order different from the order in which a per-table deletion will visit them. To fix, observe that performDeletion will begin by taking an exclusive lock on the temp namespace (even though it won't actually delete it). So, if we can get a shared lock on the namespace, we can be sure we're not running concurrently with RemoveTempRelations, while also not conflicting with ordinary use of the namespace. This requires introducing a conditional version of LockDatabaseObject, but that's no big deal. (It's surprising we've got along without that this long.) Report and patch by Mikhail Zhilin. Back-patch to all supported branches. Discussion: https://postgr.es/m/c43ce028-2bc2-4865-9b89-3f706246eed5@postgrespro.ru	2024-04-02 14:59:04 -04:00
Tom Lane	6f66fadad9	Fix confusion about the return rowtype of SQL-language procedures. There is a very ancient hack in check_sql_fn_retval that allows a single SELECT targetlist entry of composite type to be taken as supplying all the output columns of a function returning composite. (This is grotty and fundamentally ambiguous, but it's really hard to do nested composite-returning functions without it.) As far as I know, that doesn't cause any problems in ordinary functions. It's disastrous for procedures however. All procedures that have any output parameters are labeled with prorettype RECORD, and the CALL code expects it will get back a record with one column per output parameter, regardless of whether any of those parameters is composite. Doing something else leads to an assertion failure or core dump. This is simple enough to fix: we just need to not apply that rule when considering procedures. However, that requires adding another argument to check_sql_fn_retval, which at least in principle might be getting called by external callers. Therefore, in the back branches convert check_sql_fn_retval into an ABI-preserving wrapper around a new function check_sql_fn_retval_ext. Per report from Yahor Yuzefovich. This has been broken since we implemented procedures, so back-patch to all supported branches. Discussion: https://postgr.es/m/CABz5gWHSjj2df6uG0NRiDhZ_Uz=Y8t0FJP-_SVSsRsnrQT76Gg@mail.gmail.com	2024-03-12 18:16:10 -04:00
Michael Paquier	e82b64a92c	Revert "Fix parallel-safety check of expressions and predicate for index builds" This reverts commit `eae7be600b`, following a discussion with Tom Lane, due to concerns that this impacts the decisions made by the planner for the number of workers spawned based on the inlining and const-folding of index expressions and predicate for cases that would have worked until this commit. Discussion: https://postgr.es/m/162802.1709746091@sss.pgh.pa.us Backpatch-through: 12	2024-03-07 08:31:04 +09:00
Michael Paquier	50b5913a87	Fix parallel-safety check of expressions and predicate for index builds As coded, the planner logic that calculates the number of parallel workers to use for a parallel index build uses expressions and predicates from the relcache, which are flattened for the planner by eval_const_expressions(). As reported in the bug, an immutable parallel-unsafe function flattened in the relcache would become a Const, which would be considered as parallel-safe, even if the predicate or the expressions including the function are not safe in parallel workers. Depending on the expressions or predicate used, this could cause the parallel build to fail. Tests are included that check parallel index builds with parallel-unsafe predicate and expressions. Two routines are added to lsyscache.h to be able to retrieve expressions and predicate of an index from its pg_index data. Reported-by: Alexander Lakhin Author: Tender Wang Reviewed-by: Jian He, Michael Paquier Discussion: https://postgr.es/m/CAHewXN=UaAaNn9ruHDH3Os8kxLVmtWqbssnf=dZN_s9=evHUFA@mail.gmail.com Backpatch-through: 12	2024-03-06 17:24:06 +09:00
Noah Misch	8fa4a1ac61	Sync PG_VERSION file in CREATE DATABASE. An OS crash could leave PG_VERSION empty or missing. The same symptom appeared in a backup by block device snapshot, taken after the next checkpoint and before the OS flushes the PG_VERSION blocks. Device snapshots are not a documented backup method, however. Back-patch to v15, where commit `9c08aea6a3` introduced STRATEGY=WAL_LOG and made it the default. Discussion: https://postgr.es/m/20240130195003.0a.nmisch@google.com	2024-02-01 13:44:23 -08:00
Nathan Bossart	3726c1cb0e	Move is_valid_ascii() to ascii.h. This function requires simd.h, which is a rather large dependency for a widely-used header file like pg_wchar.h. Furthermore, there is a report of a third-party tool that is struggling to use pg_wchar.h due to its dependence on simd.h (presumably because simd.h uses several intrinsics). Moving the function to the much less popular ascii.h resolves these issues for now. This commit is back-patched for the benefit of the aforementioned third-party tool. The simd.h dependency was only added in v16, but we've opted to back-patch to v15 so that is_valid_ascii() lives in the same file for all versions where it exists. This could break existing third-party code that uses the function, but we couldn't find any examples of such code. It should be possible to fix any code that this commit breaks by including ascii.h in the file that uses is_valid_ascii(). Author: Jubilee Young Reviewed-by: Tom Lane, John Naylor, Andres Freund, Eric Ridge Discussion: https://postgr.es/m/CAPNHn3oKJJxMsYq%2BqLYzVJOFrUcOr4OF1EC-KtFT-qh8nOOOtQ%40mail.gmail.com Backpatch-through: 15	2024-01-29 12:09:08 -06:00
Michael Paquier	1cf2dba84b	Add try_index_open(), conditional variant of index_open() try_index_open() is able to open an index if its relkind fits, except that it would return NULL instead of generated an error if the relation does not exist. This new routine will be used by an upcoming patch to make REINDEX on partitioned relations more robust when an index in a partition tree is dropped. Extracted from a larger patch by the same author. Author: Fei Changhong Discussion: https://postgr.es/m/tencent_6A52106095ACDE55333E3AD33F304C0C3909@qq.com Backpatch-through: 14	2024-01-18 15:04:35 +09:00
Andres Freund	f374fb4aab	lwlock: Fix quadratic behavior with very long wait lists Until now LWLockDequeueSelf() sequentially searched the list of waiters to see if the current proc is still is on the list of waiters, or has already been removed. In extreme workloads, where the wait lists are very long, this leads to a quadratic behavior. #backends iterating over a list #backends long. Additionally, the likelihood of needing to call LWLockDequeueSelf() in the first place also increases with the increased length of the wait queue, as it becomes more likely that a lock is released while waiting for the wait list lock, which is held for longer during lock release. Due to the exponential back-off in perform_spin_delay() this is surprisingly hard to detect. We should make that easier, e.g. by adding a wait event around the pg_usleep() - but that's a separate patch. The fix is simple - track whether a proc is currently waiting in the wait list or already removed but waiting to be woken up in PGPROC->lwWaiting. In some workloads with a lot of clients contending for a small number of lwlocks (e.g. WALWriteLock), the fix can substantially increase throughput. This has been originally fixed for 16~ with `a4adc31f69` without a backpatch, and we have heard complaints from users impacted by this quadratic behavior in older versions as well. Author: Andres Freund <andres@anarazel.de> Reviewed-by: Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com> Discussion: https://postgr.es/m/20221027165914.2hofzp4cvutj6gin@awork3.anarazel.de Discussion: https://postgr.es/m/CALj2ACXktNbG=K8Xi7PSqbofTZozavhaxjatVc14iYaLu4Maag@mail.gmail.com Backpatch-through: 12	2024-01-18 11:12:31 +09:00
Tom Lane	5dd30bb54b	Use BIO_{get,set}_app_data instead of BIO_{get,set}_data. We should have done it this way all along, but we accidentally got away with using the wrong BIO field up until OpenSSL 3.2. There, the library's BIO routines that we rely on use the "data" field for their own purposes, and our conflicting use causes assorted weird behaviors up to and including core dumps when SSL connections are attempted. Switch to using the approved field for the purpose, i.e. app_data. While at it, remove our configure probes for BIO_get_data as well as the fallback implementation. BIO_{get,set}_app_data have been there since long before any OpenSSL version that we still support, even in the back branches. Also, update src/test/ssl/t/001_ssltests.pl to allow for a minor change in an error message spelling that evidently came in with 3.2. Tristan Partin and Bo Andreson. Back-patch to all supported branches. Discussion: https://postgr.es/m/CAN55FZ1eDDYsYaL7mv+oSLUij2h_u6hvD4Qmv-7PK7jkji0uyQ@mail.gmail.com	2023-11-28 12:34:03 -05:00
Heikki Linnakangas	2873fbfe0d	Fix assertions with RI triggers in heap_update and heap_delete. If the tuple being updated is not visible to the crosscheck snapshot, we return TM_Updated but the assertions would not hold in that case. Move them to before the cross-check. Fixes bug #17893. Backpatch to all supported versions. Author: Alexander Lakhin Backpatch-through: 12 Discussion: https://www.postgresql.org/message-id/17893-35847009eec517b5%40postgresql.org	2023-11-28 11:59:50 +02:00
Daniel Gustafsson	aef521849b	llvmjit: Use explicit LLVMContextRef for inlining When performing inlining LLVM unfortunately "leaks" types (the types survive and are usable, but a new round of inlining will recreate new structurally equivalent types). This accumulation will over time amount to a memory leak which for some queries can be large enough to trigger the OOM process killer. To avoid accumulation of types, all IR related data is stored in an LLVMContextRef which is dropped and recreated in order to release all types. Dropping and recreating incurs overhead, so it will be done only after 100 queries. This is a heuristic which might be revisited, but until we can get the size of the context from LLVM we are flying a bit blind. This issue has been reported several times, there may be more references to it in the archives on top of the threads linked below. This is a backpatch of `9dce22033d` to all supported branches. Reported-By: Justin Pryzby <pryzby@telsasoft.com> Reported-By: Kurt Roeckx <kurt@roeckx.be> Reported-By: Jaime Casanova <jcasanov@systemguards.com.ec> Reported-By: Lauri Laanmets <pcspets@gmail.com> Author: Andres Freund and Daniel Gustafsson Discussion: https://postgr.es/m/7acc8678-df5f-4923-9cf6-e843131ae89d@www.fastmail.com Discussion: https://postgr.es/m/20201218235607.GC30237@telsasoft.com Discussion: https://postgr.es/m/CAPH-tTxLf44s3CvUUtQpkDr1D8Hxqc2NGDzGXS1ODsfiJ6WSqA@mail.gmail.com Backpatch-through: v12	2023-11-17 10:21:34 +01:00
Tom Lane	9057ddbefe	Ensure we preprocess expressions before checking their volatility. contain_mutable_functions and contain_volatile_functions give reliable answers only after expression preprocessing (specifically eval_const_expressions). Some places understand this, but some did not get the memo --- which is not entirely their fault, because the problem is documented only in places far away from those functions. Introduce wrapper functions that allow doing the right thing easily, and add commentary in hopes of preventing future mistakes from copy-and-paste of code that's only conditionally safe. Two actual bugs of this ilk are fixed here. We failed to preprocess column GENERATED expressions before checking mutability, so that the code could fail to detect the use of a volatile function default-argument expression, or it could reject a polymorphic function that is actually immutable on the datatype of interest. Likewise, column DEFAULT expressions weren't preprocessed before determining if it's safe to apply the attmissingval mechanism. A false negative would just result in an unnecessary table rewrite, but a false positive could allow the attmissingval mechanism to be used in a case where it should not be, resulting in unexpected initial values in a new column. In passing, re-order the steps in ComputePartitionAttrs so that its checks for invalid column references are done before applying expression_planner, rather than after. The previous coding would not complain if a partition expression contains a disallowed column reference that gets optimized away by constant folding, which seems to me to be a behavior we do not want. Per bug #18097 from Jim Keener. Back-patch to all supported versions. Discussion: https://postgr.es/m/18097-ebb179674f22932f@postgresql.org	2023-11-16 10:05:14 -05:00
Nathan Bossart	18f47989ec	Fix fallback implementation for pg_atomic_test_set_flag(). The fallback implementation of pg_atomic_test_set_flag() that uses atomic-exchange gives pg_atomic_exchange_u32_impl() an extra argument. This issue has been present since the introduction of the atomics API in commit `b64d92f1a5`. Reviewed-by: Andres Freund Discussion: https://postgr.es/m/20231114035439.GA1809032%40nathanxps13 Backpatch-through: 12	2023-11-15 15:04:34 -06:00
David Rowley	456d697bae	Ensure we use the correct spelling of "ensure" We seem to have accidentally used "insure" in a few places. Correct that. Author: Peter Smith Discussion: https://postgr.es/m/CAHut+Pv0biqrhA3pMhu40aDsj343mTsD75khKnHsLqR8P04f=Q@mail.gmail.com Backpatch-through: 12, oldest supported version	2023-11-10 00:17:07 +13:00
Dean Rasheed	308a69a987	Fix corner-case 64-bit integer subtraction bug on some platforms. When computing "0 - INT64_MIN", most platforms would report an overflow error, which is correct. However, platforms without integer overflow builtins or 128-bit integers would fail to spot the overflow, and incorrectly return INT64_MIN. Back-patch to all supported branches. Patch be me. Thanks to Jian He for initial investigation, and Laurenz Albe and Tom Lane for review. Discussion: https://postgr.es/m/CAEZATCUNK-AZSD0jVdgkk0N%3DNcAXBWeAEX-QU9AnJPensikmdQ%40mail.gmail.com	2023-11-09 09:54:22 +00:00
Tom Lane	3bc6bc3ee2	Detect integer overflow while computing new array dimensions. array_set_element() and related functions allow an array to be enlarged by assigning to subscripts outside the current array bounds. While these places were careful to check that the new bounds are allowable, they neglected to consider the risk of integer overflow in computing the new bounds. In edge cases, we could compute new bounds that are invalid but get past the subsequent checks, allowing bad things to happen. Memory stomps that are potentially exploitable for arbitrary code execution are possible, and so is disclosure of server memory. To fix, perform the hazardous computations using overflow-detecting arithmetic routines, which fortunately exist in all still-supported branches. The test cases added for this generate (after patching) errors that mention the value of MaxArraySize, which is platform-dependent. Rather than introduce multiple expected-files, use psql's VERBOSITY parameter to suppress the printing of the message text. v11 psql lacks that parameter, so omit the tests in that branch. Our thanks to Pedro Gallegos for reporting this problem. Security: CVE-2023-5869	2023-11-06 10:56:43 -05:00
Tom Lane	ae33659d42	Be more wary about NULL values for GUC string variables. get_explain_guc_options() crashed if a string GUC marked GUC_EXPLAIN has a NULL boot_val. Nosing around found a couple of other places that seemed insufficiently cautious about NULL string values, although those are likely unreachable in practice. Add some commentary defining the expectations for NULL values of string variables, in hopes of forestalling future additions of more such bugs. Xing Guo, Aleksander Alekseev, Tom Lane Discussion: https://postgr.es/m/CACpMh+AyDx5YUpPaAgzVwC1d8zfOL4JoD-uyFDnNSa1z0EsDQQ@mail.gmail.com	2023-11-02 11:47:33 -04:00
Tom Lane	1268e73781	Fix problems when a plain-inheritance parent table is excluded. When an UPDATE/DELETE/MERGE's target table is an old-style inheritance tree, it's possible for the parent to get excluded from the plan while some children are not. (I believe this is only possible if we can prove that a CHECK ... NO INHERIT constraint on the parent contradicts the query WHERE clause, so it's a very unusual case.) In such a case, ExecInitModifyTable mistakenly concluded that the first surviving child is the target table, leading to at least two bugs: 1. The wrong table's statement-level triggers would get fired. 2. In v16 and up, it was possible to fail with "invalid perminfoindex 0 in RTE with relid nnnn" due to the child RTE not having permissions data included in the query plan. This was hard to reproduce reliably because it did not occur unless the update triggered some non-HOT index updates. In v14 and up, this is easy to fix by defining ModifyTable.rootRelation to be the parent RTE in plain inheritance as well as partitioned cases. While the wrong-triggers bug also appears in older branches, the relevant code in both the planner and executor is quite a bit different, so it would take a good deal of effort to develop and test a suitable patch. Given the lack of field complaints about the trigger issue, I'll desist for now. (Patching v11 for this seems unwise anyway, given that it will have no more releases after next month.) Per bug #18147 from Hans Buschmann. Amit Langote and Tom Lane Discussion: https://postgr.es/m/18147-6fc796538913ee88@postgresql.org	2023-10-24 14:48:34 -04:00
Thomas Munro	b2e0977886	jit: Supply LLVMGlobalGetValueType() for LLVM < 8. Commit `37d5babb` used this C API function while adding support for LLVM 16 and opaque pointers, but it's not available in LLVM 7 and older. Provide it in our own llvmjit_wrap.cpp. It just calls a C++ function that pre-dates LLVM 3.9, our minimum target. Back-patch to 12, like `37d5babb`. Discussion: https://postgr.es/m/CA%2BhUKGKnLnJnWrkr%3D4mSGhE5FuTK55FY15uULR7%3Dzzc%3DwX4Nqw%40mail.gmail.com	2023-10-19 03:03:27 +13:00
Thomas Munro	eed1feb3fe	jit: Support opaque pointers in LLVM 16. Remove use of LLVMGetElementType() and provide the type of all pointers to LLVMBuildXXX() functions when emitting IR, as required by modern LLVM versions[1]. * For LLVM <= 14, we'll still use the old LLVMBuildXXX() functions. * For LLVM == 15, we'll continue to do the same, explicitly opting out of opaque pointer mode. * For LLVM >= 16, we'll use the new LLVMBuildXXX2() functions that take the extra type argument. The difference is hidden behind some new IR emitting wrapper functions l_load(), l_gep(), l_call() etc. The change is mostly mechanical, except that at each site the correct type had to be provided. In some places we needed to do some extra work to get functions types, including some new wrappers for C++ APIs that are not yet exposed by in LLVM's C API, and some new "example" functions in llvmjit_types.c because it's no longer possible to start from the function pointer type and ask for the function type. Back-patch to 12, because it's a little tricker in 11 and we agreed not to put the latest LLVM support into the upcoming final release of 11. [1] https://llvm.org/docs/OpaquePointers.html Reviewed-by: Dmitry Dolgov <9erthalion6@gmail.com> Reviewed-by: Ronan Dunklau <ronan.dunklau@aiven.io> Reviewed-by: Andres Freund <andres@anarazel.de> Discussion: https://postgr.es/m/CA%2BhUKGKNX_%3Df%2B1C4r06WETKTq0G4Z_7q4L4Fxn5WWpMycDj9Fw%40mail.gmail.com	2023-10-18 22:59:46 +13:00
Nathan Bossart	c9265ae80b	Avoid calling proc_exit() in processes forked by system(). The SIGTERM handler for the startup process immediately calls proc_exit() for the duration of the restore_command, i.e., a call to system(). This system() call forks a new process to execute the shell command, and this child process inherits the parent's signal handlers. If both the parent and child processes receive SIGTERM, both will attempt to call proc_exit(). This can end badly. For example, both processes will try to remove themselves from the PGPROC shared array. To fix this problem, this commit adds a check in StartupProcShutdownHandler() to see whether MyProcPid == getpid(). If they match, this is the parent process, and we can proc_exit() like before. If they do not match, this is a child process, and we just emit a message to STDERR (in a signal safe manner) and _exit(), thereby skipping any problematic exit callbacks. This commit also adds checks in proc_exit(), ProcKill(), and AuxiliaryProcKill() that verify they are not being called within such child processes. Suggested-by: Andres Freund Reviewed-by: Thomas Munro, Andres Freund Discussion: https://postgr.es/m/Y9nGDSgIm83FHcad%40paquier.xyz Discussion: https://postgr.es/m/20230223231503.GA743455%40nathanxps13 Backpatch-through: 11	2023-10-17 10:42:06 -05:00
Noah Misch	782be0f712	Dissociate btequalimage() from interval_ops, ending its deduplication. Under interval_ops, some equal values are distinguishable. One such pair is '24:00:00' and '1 day'. With that being so, btequalimage() breaches the documented contract for the "equalimage" btree support function. This can cause incorrect results from index-only scans. Users should REINDEX any btree indexes having interval-type columns. After updating, pg_amcheck will report an error for almost all such indexes. This fix makes interval_ops simply omit the support function, like numeric_ops does. Back-pack to v13, where btequalimage() first appeared. In back branches, for the benefit of old catalog content, btequalimage() code will return false for type "interval". Going forward, back-branch initdb will include the catalog change. Reviewed by Peter Geoghegan. Discussion: https://postgr.es/m/20231011013317.22.nmisch@google.com	2023-10-14 16:33:54 -07:00
Dean Rasheed	3c1a1af91d	Fix EvalPlanQual rechecking during MERGE. Under some circumstances, concurrent MERGE operations could lead to inconsistent results, that varied according the plan chosen. This was caused by a lack of rowmarks on the source relation, which meant that EvalPlanQual rechecking was not guaranteed to return the same source tuples when re-running the join query. Fix by ensuring that preprocess_rowmarks() sets up PlanRowMarks for all non-target relations used in MERGE, in the same way that it does for UPDATE and DELETE. Per bug #18103. Back-patch to v15, where MERGE was introduced. Dean Rasheed, reviewed by Richard Guo. Discussion: https://postgr.es/m/18103-c4386baab8e355e3%40postgresql.org	2023-09-30 10:55:24 +01:00
Peter Geoghegan	cac37c1a1b	Fix btmarkpos/btrestrpos array key wraparound bug. nbtree's mark/restore processing failed to correctly handle an edge case involving array key advancement and related search-type scan key state. Scans with ScalarArrayScalarArrayOpExpr quals requiring mark/restore processing (for a merge join) could incorrectly conclude that an affected array/scan key must not have advanced during the time between marking and restoring the scan's position. As a result of all this, array key handling within btrestrpos could skip a required call to _bt_preprocess_keys(). This confusion allowed later primitive index scans to overlook tuples matching the true current array keys. The scan's search-type scan keys would still have spurious values corresponding to the final array element(s) -- not values matching the first/now-current array element(s). To fix, remember that "array key wraparound" has taken place during the ongoing btrescan in a flag variable stored in the scan's state, and use that information at the point where btrestrpos decides if another call to _bt_preprocess_keys is required. Oversight in commit `70bc5833`, which taught nbtree to handle array keys during mark/restore processing, but missed this subtlety. That commit was itself a bug fix for an issue in commit `9e8da0f7`, which taught nbtree to handle ScalarArrayOpExpr quals natively. Author: Peter Geoghegan <pg@bowt.ie> Discussion: https://postgr.es/m/CAH2-WzkgP3DDRJxw6DgjCxo-cu-DKrvjEv_ArkP2ctBJatDCYg@mail.gmail.com Backpatch: 11- (all supported branches).	2023-09-28 16:29:32 -07:00
Tom Lane	8700851352	Avoid unnecessary plancache revalidation of utility statements. Revalidation of a plancache entry (after a cache invalidation event) requires acquiring a snapshot. Normally that is harmless, but not if the cached statement is one that needs to run without acquiring a snapshot. We were already aware of that for TransactionStmts, but for some reason hadn't extrapolated to the other statements that PlannedStmtRequiresSnapshot() knows mustn't set a snapshot. This can lead to unexpected failures of commands such as SET TRANSACTION ISOLATION LEVEL. We can fix it in the same way, by excluding those command types from revalidation. However, we can do even better than that: there is no need to revalidate for any statement type for which parse analysis, rewrite, and plan steps do nothing interesting, which is nearly all utility commands. To mechanize this, invent a parser function stmt_requires_parse_analysis() that tells whether parse analysis does anything beyond wrapping a CMD_UTILITY Query around the raw parse tree. If that's what it does, then rewrite and plan will just skip the Query, so that it is not possible for the same raw parse tree to produce a different plan tree after cache invalidation. stmt_requires_parse_analysis() is basically equivalent to the existing function analyze_requires_snapshot(), except that for obscure reasons that function omits ReturnStmt and CallStmt. It is unclear whether those were oversights or intentional. I have not been able to demonstrate a bug from not acquiring a snapshot while analyzing these commands, but at best it seems mighty fragile. It seems safer to acquire a snapshot for parse analysis of these commands too, which allows making stmt_requires_parse_analysis and analyze_requires_snapshot equivalent. In passing this fixes a second bug, which is that ResetPlanCache would exclude ReturnStmts and CallStmts from revalidation. That's surely not safe, since they contain parsable expressions. Per bug #18059 from Pavel Kulakov. Back-patch to all supported branches. Discussion: https://postgr.es/m/18059-79c692f036b25346@postgresql.org	2023-08-24 12:02:40 -04:00
Peter Eisentraut	908f711d23	Update DECLARE_INDEX documentation Update source code comment changes belonging to the changes in `6a6389a08b`. Discussion: https://www.postgresql.org/message-id/flat/75ae5875-3abc-dafc-8aec-73247ed41cde@eisentraut.org	2023-08-24 14:00:54 +02:00
Etsuro Fujita	9f495270cc	Update comments on CustomPath struct. Commit `e7cb7ee14` allowed custom scan providers to create CustomPath paths for join relations as well, but missed updating the comments. Back-patch to all supported branches. Discussion: https://postgr.es/m/CAPmGK15ODkN%2B%3DhkBCufj1HBW0x5OTb65Xuy7ryXchMdiCMpx_g%40mail.gmail.com	2023-08-03 17:15:02 +09:00
Etsuro Fujita	d1ef5631e6	Disallow replacing joins with scans in problematic cases. Commit `e7cb7ee14`, which introduced the infrastructure for FDWs and custom scan providers to replace joins with scans, failed to add support handling of pseudoconstant quals assigned to replaced joins in createplan.c, leading to an incorrect plan without a gating Result node when postgres_fdw replaced a join with such a qual. To fix, we could add the support by 1) modifying the ForeignPath and CustomPath structs to store the list of RestrictInfo nodes to apply to the join, as in JoinPaths, if they represent foreign and custom scans replacing a join with a scan, and by 2) modifying create_scan_plan() in createplan.c to use that list in that case, instead of the baserestrictinfo list, to get pseudoconstant quals assigned to the join; but #1 would cause an ABI break. So fix by modifying the infrastructure to just disallow replacing joins with such quals. Back-patch to all supported branches. Reported by Nishant Sharma. Patch by me, reviewed by Nishant Sharma and Richard Guo. Discussion: https://postgr.es/m/CADrsxdbcN1vejBaf8a%2BQhrZY5PXL-04mCd4GDu6qm6FigDZd6Q%40mail.gmail.com	2023-07-28 15:45:02 +09:00
Amit Kapila	2864eb977a	Fix the display of UNKNOWN message type in apply worker. We include the message type while displaying an error context in the apply worker. Now, while retrieving the message type string if the message type is unknown we throw an error that will hide the original error. So, instead, we need to simply return the string indicating an unknown message type. Reported-by: Ashutosh Bapat Author: Euler Taveira, Amit Kapila Reviewed-by: Ashutosh Bapat Backpatch-through: 15 Discussion: https://postgr.es/m/CAExHW5suAEDW-mBZt_qu4RVxWZ1vL54-L+ci2zreYWebpzxYsA@mail.gmail.com	2023-07-25 08:50:37 +05:30
Andres Freund	f66403749d	Handle DROP DATABASE getting interrupted Until now, when DROP DATABASE got interrupted in the wrong moment, the removal of the pg_database row would also roll back, even though some irreversible steps have already been taken. E.g. DropDatabaseBuffers() might have thrown out dirty buffers, or files could have been unlinked. But we continued to allow connections to such a corrupted database. To fix this, mark databases invalid with an in-place update, just before starting to perform irreversible steps. As we can't add a new column in the back branches, we use pg_database.datconnlimit = -2 for this purpose. An invalid database cannot be connected to anymore, but can still be dropped. Unfortunately we can't easily add output to psql's \l to indicate that some database is invalid, it doesn't fit in any of the existing columns. Add tests verifying that a interrupted DROP DATABASE is handled correctly in the backend and in various tools. Reported-by: Evgeny Morozov <postgresql3@realityexists.net> Author: Andres Freund <andres@anarazel.de> Reviewed-by: Daniel Gustafsson <daniel@yesql.se> Reviewed-by: Thomas Munro <thomas.munro@gmail.com> Discussion: https://postgr.es/m/20230509004637.cgvmfwrbht7xm7p6@awork3.anarazel.de Discussion: https://postgr.es/m/20230314174521.74jl6ffqsee5mtug@awork3.anarazel.de Backpatch: 11-, bug present in all supported versions	2023-07-13 13:04:45 -07:00
Andres Freund	d12d1a9278	Fix type of iterator variable in SH_START_ITERATE Also add comment to make the reasoning behind the Assert() more explicit (per Tom). Reported-by: Ranier Vilela Discussion: https://postgr.es/m/CAEudQAocXNJ6s1VLz+hMamLAQAiewRoW17OJ6-+9GACKfj6iPQ@mail.gmail.com Backpatch: 11-	2023-07-06 09:57:31 -07:00
Tom Lane	4729d1e8aa	Fix misbehavior of EvalPlanQual checks with multiple result relations. The idea of EvalPlanQual is that we replace the query's scan of the result relation with a single injected tuple, and see if we get a tuple out, thereby implying that the injected tuple still passes the query quals. (In join cases, other relations in the query are still scanned normally.) This logic was not updated when commit `86dc90056` made it possible for a single DML query plan to have multiple result relations, when the query target relation has inheritance or partition children. We replaced the output for the current result relation successfully, but other result relations were still scanned normally; thus, if any other result relation contained a tuple satisfying the quals, we'd think the EPQ check passed, even if it did not pass for the injected tuple itself. This would lead to update or delete actions getting performed when they should have been skipped due to a conflicting concurrent update in READ COMMITTED isolation mode. Fix by blocking all sibling result relations from emitting tuples during an EvalPlanQual recheck. In the back branches, the fix is complicated a bit by the need to not change the size of struct EPQState (else we'd have ABI-breaking changes in offsets in struct ModifyTableState). Like the back-patches of `3f7836ff6` and `4b3e37993`, add a separately palloc'd struct to avoid that. The logic is the same as in HEAD otherwise. This is only a live bug back to v14 where `86dc90056` came in. However, I chose to back-patch the test cases further, on the grounds that this whole area is none too well tested. I skipped doing so in v11 though because none of the test applied cleanly, and it didn't quite seem worth extra work for a branch with only six months to live. Per report from Ante Krešić (via Aleksander Alekseev) Discussion: https://postgr.es/m/CAJ7c6TMBTN3rcz4=AjYhLPD_w3FFT0Wq_C15jxCDn8U4tZnH1g@mail.gmail.com	2023-05-19 14:26:34 -04:00
Michael Paquier	2dd7782217	pageinspect: Fix gist_page_items() with included columns Non-leaf pages of GiST indexes contain key attributes, leaf pages contain both key and non-key attributes, and gist_page_items() ignored the handling of non-key attributes. This caused a few problems when using gist_page_items() on a GiST index with INCLUDE: - On a non-leaf page, the function would crash. - On a leaf page, the function would work, but miss to display all the values for included attributes. This commit fixes gist_page_items() to handle such cases in a more appropriate way, and now displays the values of key and non-key attributes for each item separately in a style consistent with what ruleutils.c would generate for the attribute list, depending on the page type dealt with. In a way similar to how a record is displayed, values would be double-quoted for key or non-key attributes if required. ruleutils.c did not provide a routine able to control if non-key attributes should be displayed, so an extended() routine for index definitions is added to work around the leaf and non-leaf page differences. While on it, this commit fixes a third problem related to the amount of data reported for key attributes. The code originally relied on BuildIndexValueDescription() (used for error reports on constraints) that would not print all the data stored in the index but the index opclass's input type, so this limited the amount of information available. This switch makes gist_page_items() much cheaper as there is no need to run ACL checks for each item printed, which is not an issue anyway as superuser rights are required to execute the functions of pageinspect. Opclasses whose data cannot be displayed can rely on gist_page_items_bytea(). The documentation of this function was slightly incorrect for the output results generated on HEAD and v15, so adjust it on these branches. Author: Alexander Lakhin, Michael Paquier Discussion: https://postgr.es/m/17884-cb8c326522977acb@postgresql.org Backpatch-through: 14	2023-05-19 12:38:15 +09:00
Tomas Vondra	e187693239	Fix handling of empty ranges and NULLs in BRIN BRIN indexes did not properly distinguish between summaries for empty (no rows) and all-NULL ranges, treating them as essentially the same thing. Summaries were initialized with allnulls=true, and opclasses simply reset allnulls to false when processing the first non-NULL value. This however produces incorrect results if the range starts with a NULL value (or a sequence of NULL values), in which case we forget the range contains NULL values when adding the first non-NULL value. This happens because the allnulls flag is used for two separate purposes - to mark empty ranges (not representing any rows yet) and ranges containing only NULL values. Opclasses don't know which of these cases it is, and so don't know whether to set hasnulls=true. Setting the flag in both cases would make it correct, but it would also make BRIN indexes useless for queries with IS NULL clauses. All ranges start empty (and thus allnulls=true), so all ranges would end up with either allnulls=true or hasnulls=true. The severity of the issue is somewhat reduced by the fact that it only happens when adding values to an existing summary with allnulls=true. This can happen e.g. for small tables (because a summary for the first range exists for all BRIN indexes), or for tables with large fraction of NULL values in the indexed columns. Bulk summarization (e.g. during CREATE INDEX or automatic summarization) that processes all values at once is not affected by this issue. In this case the flags were updated in a slightly different way, not forgetting the NULL values. To identify empty ranges we use a new flag, stored in an unused bit in the BRIN tuple header so the on-disk format remains the same. A matching flag is added to BrinMemTuple, into a 3B gap after bt_placeholder. That means there's no risk of ABI breakage, although we don't actually pass the BrinMemTuple to any public API. We could also skip storing index tuples for empty summaries, but then we'd have to always process such ranges - even if there are no rows in large parts of the table (e.g. after a bulk DELETE), it would still require reading the pages etc. So we store them, but ignore them when building the bitmap. Backpatch to 11. The issue exists since BRIN indexes were introduced in 9.5, but older releases are already EOL. Backpatch-through: 11 Reviewed-by: Justin Pryzby, Matthias van de Meent, Alvaro Herrera Discussion: https://postgr.es/m/402430e4-7d9d-6cf1-09ef-464d80afff3b@enterprisedb.com	2023-05-19 00:15:13 +02:00
Michael Paquier	ccd21e1cfa	Fix assertion failure when updating stats_fetch_consistency in a transaction An update of the GUC stats_fetch_consistency in a transaction would be able to trigger an assertion when doing cache->snapshot. In this case, when retrieving a pgstat entry after the switch, a new snapshot would be rebuilt, confusing pgstat_build_snapshot() because a snapshot is already cached with an unexpected mode ("cache"). In order to fix this problem, this commit adds a flag to force a snapshot clear each time this GUC is changed. Some tests are added to check, while on it. Some optimizations in avoiding the snapshot clear should be possible depending on what is cached and the current GUC value, I guess, but this solution is simple, and ensures that the state of the cache is updated each time a new pgstat entry is fetched, hence being consistent with the level wanted by the client that has set the GUC. Note that cache->none and snapshot->none would not cause issues, as fetching a pgstat entry would be retrieved from shared memory on the second attempt, however a snapshot would still be cached. Similarly, none->snapshot and none->cache would build a new snapshot on the second fetch attempt. Finally, snapshot->cache would cache a new snapshot on the second attempt. Reported-by: Alexander Lakhin Author: Kyotaro Horiguchi Discussion: https://postgr.es/m/17804-2a118cd046f2d0e5@postgresql.org backpatch-through: 15	2023-05-10 11:24:40 +09:00
Michael Paquier	b9ad73ad25	Fix crashes with CREATE SCHEMA AUTHORIZATION and schema elements CREATE SCHEMA AUTHORIZATION with appended schema elements can lead to crashes when comparing the schema name of the query with the schemas used in the qualification of some clauses in the elements' queries. The origin of the problem is that the transformation routine for the elements listed in a CREATE SCHEMA query uses as new, expected, schema name the one listed in CreateSchemaStmt itself. However, depending on the query, CreateSchemaStmt.schemaname may be NULL, being computed instead from the role specification of the query given by the AUTHORIZATION clause, that could be either: - A user name string, with the new schema name being set to the same value as the role given. - Guessed from CURRENT_ROLE, SESSION_ROLE or CURRENT_ROLE, with a new schema name computed from the security context where CREATE SCHEMA is running. Regression tests are added for CREATE SCHEMA with some appended elements (some of them with schema qualifications), covering also some role specification patterns. While on it, this simplifies the context structure used during the transformation of the elements listed in a CREATE SCHEMA query by removing the fields for the role specification and the role type. They were not used, and for the role specification this could be confusing as the schema name may by extracted from that at the beginning of CreateSchemaCommand(). This issue exists for a long time, so backpatch down to all the versions supported. Reported-by: Song Hongyu Author: Michael Paquier Reviewed-by: Richard Guo Discussion: https://postgr.es/m/17909-f65c12dfc5f0451d@postgresql.org Backpatch-through: 11	2023-04-28 19:29:36 +09:00
Michael Paquier	5c32549460	Fix detection of unseekable files for fseek() and ftello() with MSVC Calling fseek() or ftello() on a handle to a non-seeking device such as a pipe or a communications device is not supported. Unfortunately, MSVC's flavor of these routines, _fseeki64() and _ftelli64(), do not return an error when given a pipe as handle. Some of the logic of pg_dump and restore relies on these routines to check if a handle is seekable, causing failures when passing the contents of pg_dump to pg_restore through a pipe, for example. This commit introduces wrappers for fseeko() and ftello() on MSVC so as any callers are able to properly detect the cases of non-seekable handles. This relies mainly on GetFileType(), sharing a bit of code with the MSVC port for fstat(). The code in charge of getting a file type is refactored into a new file called win32common.c, shared by win32stat.c and the new win32fseek.c. It includes the MSVC ports for fseeko() and ftello(). Like `765f5df`, this is backpatched down to 14, where the fstat() implementation for MSVC is able to understand about files larger than 4GB in size. Using a TAP test for that is proving to be tricky as IPC::Run handles the pipes by itself, still I have been able to check the fix manually. Reported-by: Daniel Watzinger Author: Juan José Santamaría Flecha, Michael Paquier Discussion: https://postgr.es/m/CAC+AXB26a4EmxM2suXxPpJaGrqAdxracd7hskLg-zxtPB50h7A@mail.gmail.com Backpatch-through: 14	2023-04-12 09:09:53 +09:00
Thomas Munro	c03c6e8cf6	Fix race in parallel hash join batch cleanup, take II. With unlucky timing and parallel_leader_participation=off (not the default), PHJ could attempt to access per-batch shared state just as it was being freed. There was code intended to prevent that by checking for a cleared pointer, but it was racy. Fix, by introducing an extra barrier phase. The new phase PHJ_BUILD_RUNNING means that it's safe to access the per-batch state to find a batch to help with, and PHJ_BUILD_DONE means that it is too late. The last to detach will free the array of per-batch state as before, but now it will also atomically advance the phase, so that late attachers can avoid the hazard. This mirrors the way per-batch hash tables are freed (see phases PHJ_BATCH_PROBING and PHJ_BATCH_DONE). An earlier attempt to fix this (commit `3b8981b6`, later reverted) missed one special case. When the inner side is empty (the "empty inner optimization), the build barrier would only make it to PHJ_BUILD_HASHING_INNER phase before workers attempted to detach from the hashtable. In that case, fast-forward the build barrier to PHJ_BUILD_RUNNING before proceeding, so that our later assertions hold and we can still negotiate who is cleaning up. Revealed by build farm failures, where BarrierAttach() failed a sanity check assertion, because the memory had been clobbered by dsa_free(). In non-assert builds, the result could be a segmentation fault. Back-patch to all supported releases. Author: Thomas Munro <thomas.munro@gmail.com> Author: Melanie Plageman <melanieplageman@gmail.com> Reported-by: Michael Paquier <michael@paquier.xyz> Reported-by: David Geier <geidav.pg@gmail.com> Tested-by: David Geier <geidav.pg@gmail.com> Discussion: https://postgr.es/m/20200929061142.GA29096%40paquier.xyz	2023-03-21 14:32:14 +13:00
Jeff Davis	8b87e92919	Fix t_isspace(), etc., when datlocprovider=i and datctype=C. Check whether the datctype is C to determine whether t_isspace() and related functions use isspace() or iswspace(). Previously, t_isspace() checked whether the database default collation was C; which is incorrect when the default collation uses the ICU provider. Discussion: https://postgr.es/m/79e4354d9eccfdb00483146a6b9f6295202e7890.camel@j-davis.com Reviewed-by: Peter Eisentraut Backpatch-through: 15	2023-03-17 12:07:47 -07:00
Dean Rasheed	7d9a75713a	Fix concurrent update issues with MERGE. If MERGE attempts an UPDATE or DELETE on a table with BEFORE ROW triggers, or a cross-partition UPDATE (with or without triggers), and a concurrent UPDATE or DELETE happens, the merge code would fail. In some cases this would lead to a crash, while in others it would cause the wrong merge action to be executed, or no action at all. The immediate cause of the crash was the trigger code calling ExecGetUpdateNewTuple() as part of the EPQ mechanism, which fails because during a merge ri_projectNew is NULL, since merge has its own per-action projection information, which ExecGetUpdateNewTuple() knows nothing about. Fix by arranging for the trigger code to exit early, returning the TM_Result and TM_FailureData information, if a concurrent modification is detected, allowing the merge code to do the necessary EPQ handling in its own way. Similarly, prevent the cross-partition update code from doing any EPQ processing for a merge, allowing the merge code to work out what it needs to do. This leads to a number of simplifications in nodeModifyTable.c. Most notably, the ModifyTableContext->GetUpdateNewTuple() callback is no longer needed, and mergeGetUpdateNewTuple() can be deleted, since there is no longer any requirement for get-update-new-tuple during a merge. Similarly, ModifyTableContext->cpUpdateRetrySlot is no longer needed. Thus ExecGetUpdateNewTuple() and the retry_slot handling of ExecCrossPartitionUpdate() can be restored to how they were in v14, before the merge code was added, and ExecMergeMatched() no longer needs any special-case handling for cross-partition updates. While at it, tidy up ExecUpdateEpilogue() a bit, making it handle recheckIndexes locally, rather than passing it in as a parameter, ensuring that it is freed properly. This dates back to when it was split off from ExecUpdate() to support merge. Per bug #17809 from Alexander Lakhin, and follow-up investigation of bug #17792, also from Alexander Lakhin. Back-patch to v15, where MERGE was introduced, taking care to preserve backwards-compatibility of the trigger API in v15 for any extensions that might use it. Discussion: https://postgr.es/m/17809-9e6650bef133f0fe%40postgresql.org https://postgr.es/m/17792-0f89452029662c36%40postgresql.org	2023-03-13 10:23:42 +00:00
Tom Lane	f61e60102f	Avoid failure when altering state of partitioned foreign-key triggers. Beginning in v15, if you apply ALTER TABLE ENABLE/DISABLE TRIGGER to a partitioned table, it also affects the partitions' cloned versions of the affected trigger(s). The initial implementation of this located the clones by name, but that fails on foreign-key triggers which have names incorporating their own OIDs. We can fix that, and also make the behavior more bulletproof in the face of user-initiated trigger renames, by identifying the cloned triggers by tgparentid. Following the lead of earlier commits in this area, I took care not to break ABI in the v15 branch, even though I rather doubt there are any external callers of EnableDisableTrigger. While here, update the documentation, which was not touched when the semantics were changed. Per bug #17817 from Alan Hodgson. Back-patch to v15; older versions do not have this behavior. Discussion: https://postgr.es/m/17817-31dfb7c2100d9f3d@postgresql.org	2023-03-04 13:32:35 -05:00
Tom Lane	a033f9165c	Fix MULTIEXPR_SUBLINK with partitioned target tables, yet again. We already tried to fix this in commits `3f7323cbb` et al (and follow-on fixes), but now it emerges that there are still unfixed cases; moreover, these cases affect all branches not only pre-v14. I thought we had eliminated all cases of making multiple clones of an UPDATE's target list when we nuked inheritance_planner. But it turns out we still do that in some partitioned-UPDATE cases, notably including INSERT ... ON CONFLICT UPDATE, because ExecInitPartitionInfo thinks it's okay to clone and modify the parent's targetlist. This fix is based on a suggestion from Andres Freund: let's stop abusing the ParamExecData.execPlan mechanism, which was only ever meant to handle initplans, and instead solve the execution timing problem by having the expression compiler move MULTIEXPR_SUBLINK steps to the front of their expression step lists. This is feasible because (a) all branches still in support compile the entire targetlist of an UPDATE into a single ExprState, and (b) we know that all MULTIEXPR_SUBLINKs do need to be evaluated --- none could be buried inside a CASE, for example. There is a minor semantics change concerning the order of execution of the MULTIEXPR's subquery versus other parts of the parent targetlist, but that seems like something we can get away with. By doing that, we no longer need to worry about whether different clones of a MULTIEXPR_SUBLINK share output Params; their usage of that data structure won't overlap. Per bug #17800 from Alexander Lakhin. Back-patch to all supported branches. In v13 and earlier, we can revert `3f7323cbb` and follow-on fixes; however, I chose to keep the SubPlan.subLinkId field added in `ccbb54c72`. We don't need that anymore in the core code, but it's cheap enough to fill, and removing a plan node field in a minor release seems like it'd be asking for trouble. Andres Freund and Tom Lane Discussion: https://postgr.es/m/17800-ff90866b3906c964@postgresql.org	2023-02-25 14:44:14 -05:00
Michael Paquier	5fd61055ea	Fix handling of SCRAM-SHA-256's channel binding with RSA-PSS certificates OpenSSL 1.1.1 and newer versions have added support for RSA-PSS certificates, which requires the use of a specific routine in OpenSSL to determine which hash function to use when compiling it when using channel binding in SCRAM-SHA-256. X509_get_signature_nid(), that is the original routine the channel binding code has relied on, is not able to determine which hash algorithm to use for such certificates. However, X509_get_signature_info(), new to OpenSSL 1.1.1, is able to do it. This commit switches the channel binding logic to rely on X509_get_signature_info() over X509_get_signature_nid(), which would be the choice when building with 1.1.1 or newer. The error could have been triggered on the client or the server, hence libpq and the backend need to have their related code paths patched. Note that attempting to load an RSA-PSS certificate with OpenSSL 1.1.0 or older leads to a failure due to an unsupported algorithm. The discovery of relying on X509_get_signature_info() comes from Jacob, the tests have been written by Heikki (with few tweaks from me), while I have bundled the whole together while adding the bits needed for MSVC and meson. This issue exists since channel binding exists, so backpatch all the way down. Some tests are added in 15~, triggered if compiling with OpenSSL 1.1.1 or newer, where the certificate and key files can easily be generated for RSA-PSS. Reported-by: Gunnar "Nick" Bluth Author: Jacob Champion, Heikki Linnakangas Discussion: https://postgr.es/m/17760-b6c61e752ec07060@postgresql.org Backpatch-through: 11	2023-02-15 10:12:31 +09:00
Robert Haas	ecb01e6ebb	Un-revert "Disable STARTUP_PROGRESS_TIMEOUT in standby mode." This reverts commit `1eadfbdd7e` and thus reinstates commit `98e7234242`. It's a better time to commit this now that the release is over. Discussion: http://postgr.es/m/3509384.1675878203@sss.pgh.pa.us	2023-02-10 16:27:05 -05:00
Robert Haas	1eadfbdd7e	Revert "Disable STARTUP_PROGRESS_TIMEOUT in standby mode." This reverts commit `98e7234242`. I forgot that we're about to wrap a release, and this fix isn't critical enough to justify committing it right before we wrap a release. Discussion: http://postgr.es/m/2676424.1675700113@sss.pgh.pa.us	2023-02-06 11:16:03 -05:00
Robert Haas	98e7234242	Disable STARTUP_PROGRESS_TIMEOUT in standby mode. In standby mode, we don't actually report progress of recovery, but up until now, startup_progress_timeout_handler() nevertheless got called every log_startup_progress_interval seconds. That's an unnecessary expense, so avoid it. Report by Thomas Munro. Patch by Bharath Rupireddy, reviewed by Simon Riggs, Thomas Munro, and me. Back-patch to v15, where the problem was introduced. Discussion: https://www.postgresql.org/message-id/CA%2BhUKGKCHSffAj8zZJKJvNX7ygnQFxVD6wm1d-2j3fVw%2BMafPQ%40mail.gmail.com	2023-02-06 10:55:42 -05:00
Tom Lane	9a40a03119	Make our back branches build under -fkeep-inline-functions. Add "#ifndef FRONTEND" where necessary to make pg_waldump build on compilers that don't elide unused static-inline functions. This back-patches relevant parts of commit `3e9ca5260`, fixing build breakage from `dc7420c2c` and back-patching of `f10f0ae42`. Per recently-resurrected buildfarm member castoroides. We aren't expecting castoroides to build anything newer than v11, but we might as well clean up the intermediate branches while at it.	2023-01-20 11:58:12 -05:00
Tom Lane	a8f7687a0b	Make new GENERATED-expressions code more bulletproof. In commit `8bf6ec3ba` I assumed that no code path could reach ExecGetExtraUpdatedCols without having gone through ExecInitStoredGenerated. That turns out not to be the case in logical replication: if there's an ON UPDATE trigger on the target table, trigger.c will call this code before anybody has set up its generated columns. Having seen that, I don't have a lot of faith in there not being other such paths. ExecGetExtraUpdatedCols can call ExecInitStoredGenerated for itself, as long as we are willing to assume that it is only called in CMD_UPDATE operations, which on the whole seems like a safer leap of faith. Per report from Vitaly Davydov. Discussion: https://postgr.es/m/d259d69652b8c2ff50e14cda3c236c7f@postgrespro.ru	2023-01-15 14:06:46 -05:00
Tom Lane	3706cc97aa	Fix calculation of which GENERATED columns need to be updated. We were identifying the updatable generated columns of inheritance children by transposing the calculation made for their parent. However, there's nothing that says a traditional-inheritance child can't have generated columns that aren't there in its parent, or that have different dependencies than are in the parent's expression. (At present it seems that we don't enforce that for partitioning either, which is likely wrong to some degree or other; but the case clearly needs to be handled with traditional inheritance.) Hence, drop the very-klugy-anyway "extraUpdatedCols" RTE field in favor of identifying which generated columns depend on updated columns during executor startup. In HEAD we can remove extraUpdatedCols altogether; in back branches, it's still there but always empty. Another difference between the HEAD and back-branch versions of this patch is that in HEAD we can add the new bitmap field to ResultRelInfo, but that would cause an ABI break in back branches. Like `4b3e37993`, add a List field at the end of struct EState instead. Back-patch to v13. The bogus calculation is also being made in v12, but it doesn't have the same visible effect because we don't use it to decide which generated columns to recalculate; as a consequence of which the patch doesn't apply easily. I think that there might still be a demonstrable bug associated with trigger firing conditions, but that's such a weird corner-case usage that I'm content to leave it unfixed in v12. Amit Langote and Tom Lane Discussion: https://postgr.es/m/CA+HiwqFshLKNvQUd1DgwJ-7tsTp=dwv7KZqXC4j2wYBV1aCDUA@mail.gmail.com Discussion: https://postgr.es/m/2793383.1672944799@sss.pgh.pa.us	2023-01-05 14:12:17 -05:00
Andres Freund	c6e75e4c27	perl: Hide warnings inside perl.h when using gcc compatible compiler New versions of perl trigger warnings within perl.h with our compiler flags. At least -Wdeclaration-after-statement, -Wshadow=compatible-local are known to be problematic. To avoid these warnings, conditionally use #pragma GCC system_header before including plperl.h. Alternatively, we could add the include paths for problematic headers with -isystem, but that is a larger hammer and is harder to search for. A more granular alternative would be to use #pragma GCC diagnostic push/ignored/pop, but gcc warns about unknown warnings being ignored, so every to-be-ignored-temporarily compiler warning would require its own pg_config.h symbol and #ifdef. As the warnings are voluminous, it makes sense to backpatch this change. But don't do so yet, we first want gather buildfarm coverage - it's e.g. possible that some compiler claiming to be gcc compatible has issues with the pragma. Author: Andres Freund <andres@anarazel.de> Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us> Discussion: Discussion: https://postgr.es/m/20221228182455.hfdwd22zztvkojy2@awork3.anarazel.de	2023-01-02 15:49:33 -08:00
Tom Lane	18431ee6f5	Rethink handling of [Prevent\|Is]InTransactionBlock in pipeline mode. Commits `f92944137` et al. made IsInTransactionBlock() set the XACT_FLAGS_NEEDIMMEDIATECOMMIT flag before returning "false", on the grounds that that kept its API promises equivalent to those of PreventInTransactionBlock(). This turns out to be a bad idea though, because it allows an ANALYZE in a pipelined series of commands to cause an immediate commit, which is unexpected. Furthermore, if we return "false" then we have another issue, which is that ANALYZE will decide it's allowed to do internal commit-and-start-transaction sequences, thus possibly unexpectedly committing the effects of previous commands in the pipeline. To fix the latter situation, invent another transaction state flag XACT_FLAGS_PIPELINING, which explicitly records the fact that we have executed some extended-protocol command and not yet seen a commit for it. Then, require that flag to not be set before allowing InTransactionBlock() to return "false". Having done that, we can remove its setting of NEEDIMMEDIATECOMMIT without fear of causing problems. This means that the API guarantees of IsInTransactionBlock now diverge from PreventInTransactionBlock, which is mildly annoying, but it seems OK given the very limited usage of IsInTransactionBlock. (In any case, a caller preferring the old behavior could always set NEEDIMMEDIATECOMMIT for itself.) For consistency also require XACT_FLAGS_PIPELINING to not be set in PreventInTransactionBlock. This too is meant to prevent commands such as CREATE DATABASE from silently committing previous commands in a pipeline. Per report from Peter Eisentraut. As before, back-patch to all supported branches (which sadly no longer includes v10). Discussion: https://postgr.es/m/65a899dd-aebc-f667-1d0a-abb89ff3abf8@enterprisedb.com	2022-12-13 14:23:59 -05:00
Etsuro Fujita	a0bf7a0ecc	Remove new structure member from ResultRelInfo. In commit `ffbb7e65a`, I added a ModifyTableState member to ResultRelInfo to save the owning ModifyTableState for use by nodeModifyTable.c when performing batch inserts, but as pointed out by Tom Lane, that changed the array stride of es_result_relations, and that would break any previously-compiled extension code that accesses that array. Fix by removing that member from ResultRelInfo and instead adding a List member at the end of EState to save such ModifyTableStates. Per report from Tom Lane. Back-patch to v14, like the previous commit; I chose to apply the patch to HEAD as well, to make back-patching easy. Discussion: http://postgr.es/m/4065383.1669395453%40sss.pgh.pa.us	2022-12-08 16:15:01 +09:00
Tom Lane	c959f84c2b	Fix Memoize to work with partitionwise joining. A couple of places weren't up to speed for this. By sheer good luck, we didn't fail but just selected a non-memoized join plan, at least in the test case we have. Nonetheless, it's a bug, and I'm not quite sure that it couldn't have worse consequences in other examples. So back-patch to v14 where Memoize came in. Richard Guo Discussion: https://postgr.es/m/CAMbWs48GkNom272sfp0-WeD6_0HSR19BJ4H1c9ZKSfbVnJsvRg@mail.gmail.com	2022-12-05 12:36:41 -05:00
Amit Kapila	ebf87c019c	Fix incorrect output from pgoutput when using column lists. For Updates and Deletes, we were not honoring the columns list for old tuple values while sending tuple data via pgoutput. This results in pgoutput emitting more columns than expected. This is not a problem for built-in logical replication as we simply ignore additional columns based on the relation information sent previously which didn't have those columns. However, some other users of pgoutput plugin may expect the columns as per the column list. Also, sending extra columns unnecessarily consumes network bandwidth defeating the purpose of the column list feature. Reported-by: Gunnar Morling Author: Hou Zhijie Reviewed-by: Amit Kapila Backpatch-through: 15 Discussion: https://postgr.es/m/CADGJaX9kiRZ-OH0EpWF5Fkyh1ZZYofoNRCrhapBfdk02tj5EKg@mail.gmail.com	2022-12-02 11:14:42 +05:30
Tom Lane	55fa993d7e	Improve heuristics for compressing the KnownAssignedXids array. Previously, we'd compress only when the active range of array entries reached Max(4 * PROCARRAY_MAXPROCS, 2 * pArray->numKnownAssignedXids). If max_connections is large, the first term could result in not compressing for a long time, resulting in much wastage of cycles in hot-standby backends scanning the array to take snapshots. Get rid of that term, and just bound it to 2 * pArray->numKnownAssignedXids. That however creates the opposite risk, that we might spend too much effort compressing. Hence, consider compressing only once every 128 commit records. (This frequency was chosen by benchmarking. While we only tried one benchmark scenario, the results seem stable over a fairly wide range of frequencies.) Also, force compression when processing RecoveryInfo WAL records (which should be infrequent); the old code could perform compression then, but would do so only after the same array-range check as for the transaction-commit path. Also, opportunistically run compression if the startup process is about to wait for WAL, though not oftener than once a second. This should prevent cases where we waste lots of time by leaving the array not-compressed for long intervals due to low WAL traffic. Lastly, add a simple check to keep us from uselessly compressing when the array storage is already compact. Back-patch, as the performance problem is worse in pre-v14 branches than in HEAD. Simon Riggs and Michail Nikolaev, with help from Tom Lane and Andres Freund. Discussion: https://postgr.es/m/CALdSSPgahNUD_=pB_j=1zSnDBaiOtqVfzo8Ejt5J_k7qZiU1Tw@mail.gmail.com	2022-11-29 15:43:17 -05:00
Etsuro Fujita	fc02019c09	Fix handling of pending inserts in nodeModifyTable.c. Commit `b663a4136`, which allowed FDWs to INSERT rows in bulk, added to nodeModifyTable.c code to flush pending inserts to the foreign-table result relation(s) before completing processing of the ModifyTable node, but the code failed to take into account the case where the INSERT query has modifying CTEs, leading to incorrect results. Also, that commit failed to flush pending inserts before firing BEFORE ROW triggers so that rows are visible to such triggers. In that commit we scanned through EState's es_tuple_routing_result_relations or es_opened_result_relations list to find the foreign-table result relations to which pending inserts are flushed, but that would be inefficient in some cases. So to fix, 1) add a List member to EState to record the insert-pending result relations, and 2) modify nodeModifyTable.c so that it adds the foreign-table result relation to the list in ExecInsert() if appropriate, and flushes pending inserts properly using the list where needed. While here, fix a copy-and-pasteo in a comment in ExecBatchInsert(), which was added by that commit. Back-patch to v14 where that commit appeared. Discussion: https://postgr.es/m/CAPmGK16qutyCmyJJzgQOhfBq%3DNoGDqTB6O0QBZTihrbqre%2BoxA%40mail.gmail.com	2022-11-25 17:45:01 +09:00
Tom Lane	0353db996e	Add comments and a missing CHECK_FOR_INTERRUPTS in ts_headline. I just spent an annoying amount of time reverse-engineering the 100%-undocumented API between ts_headline and the text search parser's prsheadline function. Add some commentary about that while it's fresh in mind. Also remove some unused macros in wparser_def.c. While at it, I noticed that when commit `78e73e875` added a CHECK_FOR_INTERRUPTS call in TS_execute_recurse, it missed doing so in the parallel function TS_phrase_execute, which surely needs one just as much. Back-patch because of the missing CHECK_FOR_INTERRUPTS. Might as well back-patch the rest of this too.	2022-11-21 17:07:07 -05:00
Tom Lane	2c6d43650d	Fix CREATE DATABASE so we can pg_upgrade DBs with OIDs above 2^31. Commit `aa0105141` repeated one of the oldest mistakes in our book: thinking that OID is the same as int32. It isn't of course, and unsurprisingly the first person who came along with a database OID above 2 billion broke it. Repair. Per bug #17677 from Sergey Pankov. Back-patch to v15. Discussion: https://postgr.es/m/17677-a99fa067d7ed71c9@postgresql.org	2022-11-04 10:39:52 -04:00
Tom Lane	725cd4d2e4	Add casts to simplehash.h to silence C++ warnings. Casting the result of palloc etc. to the intended type is more per project style anyway. (The fact that cpluspluscheck doesn't notice these problems is because it doesn't expand any macros, which seems like a troubling shortcoming. Don't have a good idea about improving that.) Back-patch to v13, which is as far as the patch applies cleanly; doesn't seem worth working harder. David Geier Discussion: https://postgr.es/m/aa5d88a3-71f4-3455-11cf-82de0372c941@gmail.com	2022-11-03 10:47:31 -04:00
Tom Lane	a5737e765d	Allow use of __sync_lock_test_and_set for spinlocks on any machine. If we have no special-case code in s_lock.h for the current platform, but the compiler has __sync_lock_test_and_set, use that instead of failing. It's unlikely that anybody's __sync_lock_test_and_set would be so awful as to be worse than our semaphore-based fallback, but if it is, they can (continue to) use --disable-spinlocks. This allows removal of the RISC-V special case installed by commit `c32fcac56`, which generated exactly the same code but only on that platform. Usefully, the RISC-V buildfarm animals should now test at least the int variant of this patch. I've manually tested both variants on ARM by dint of removing the ARM-specific stanza. We don't want to drop that, because it already has some special knowledge and is likely to grow more over time. Likewise, this is not meant to preclude installing special cases for other arches if that proves worthwhile. Per discussion of a request to install the same code for loongarch64. Like the previous patch, we might as well back-patch to supported branches. Discussion: https://postgr.es/m/761ac43d44b84d679ba803c2bd947cc0@HSMAILSVR04.hs.handsome.com.cn	2022-11-02 17:37:26 -04:00
David Rowley	23f4427612	Fix outdated comment in tuplesort.h This was outdated by `77bae396d`. Backpatch-through: 15, where `77bae396d` was added	2022-11-02 15:29:49 +13:00
Alvaro Herrera	fb2a83b2b7	Update some comments that should've covered MERGE Oversight in `7103ebb7aa`. Backpatch to 15. Author: Richard Guo <guofenglinux@gmail.com> Discussion: https://postgr.es/m/CAMbWs48gnDjZXq3-b56dVpQCNUJ5hD9kdtWN4QFwKCEapspNsA@mail.gmail.com	2022-10-24 12:52:43 +02:00
Robert Haas	5c013e620c	pg_basebackup: Fix cross-platform tablespace relocation. Specifically, when pg_basebackup is invoked with -Tx=y, don't error out if x could plausibly be an absolute path either on Windows or on non-Windows systems. We don't know whether the remote system is running the same OS as the local system, so it's not appropriate to assume that our local rule about absolute pathnames is the same as the rule on the remote system. Patch by me, reviewed by Tom Lane, Andrew Dunstan, and Davinder Singh. Discussion: http://postgr.es/m/CA+TgmoY+jC3YiskomvYKDPK3FbrmsDU7_8+wMHt02HOdJeRb0g@mail.gmail.com	2022-10-21 08:31:39 -04:00
Michael Paquier	f2f7e509e6	Rename SetSingleFuncCall() to InitMaterializedSRF() Per discussion, the existing routine name able to initialize a SRF function with materialize mode is unpopular, so rename it. Equally, the flags of this function are renamed, as of: - SRF_SINGLE_USE_EXPECTED -> MAT_SRF_USE_EXPECTED_DESC - SRF_SINGLE_BLESS -> MAT_SRF_BLESS The previous function and flags introduced in `9e98583` are kept around for compatibility purposes, so as any extension code already compiled with v15 continues to work as-is. The declarations introduced here for compatibility will be removed from HEAD in a follow-up commit. The new names have been suggested by Andres Freund and Melanie Plageman. Discussion: https://postgr.es/m/20221013194820.ciktb2sbbpw7cljm@awork3.anarazel.de Backpatch-through: 15	2022-10-18 10:22:40 +09:00
Tom Lane	24c4c26171	Rename parser token REF to REF_P to avoid a symbol conflict. In the latest version of Apple's macOS SDK, <sys/socket.h> fails to compile if "REF" is #define'd as something. Apple may or may not agree that this is a bug, and even if they do accept the bug report I filed, they probably won't fix it very quickly. In the meantime, our back branches will all fail to compile gram.y. v15 and HEAD currently escape the problem thanks to the refactoring done in `98e93a1fc`, but that's purely accidental. Moreover, since that patch removed a widely-visible inclusion of <netdb.h>, back-patching it seems too likely to break third-party code. Instead, change the token's code name to REF_P, following our usual convention for naming parser tokens that are likely to have symbol conflicts. The effects of that should be localized to the grammar and immediately surrounding files, so it seems like a safer answer. Per project policy that we want to keep recently-out-of-support branches buildable on modern systems, back-patch all the way to 9.2. Discussion: https://postgr.es/m/1803927.1665938411@sss.pgh.pa.us	2022-10-16 15:27:04 -04:00
Andres Freund	c3b5992b91	pgstat: Prevent stats reset from corrupting slotname by removing slotname Previously PgStat_StatReplSlotEntry contained the slotname, which was mainly used when writing out the stats during shutdown, to identify the slot in the serialized data (at runtime the index in ReplicationSlotCtl->replication_slots is used, but that can change during a restart). Unfortunately the slotname was overwritten when the slot's stats were reset. That turned out to only cause "real" problems if the slot was active during the reset, triggering an assertion failure at the next pgstat_report_replslot(). In other paths the stats were re-initialized during pgstat_acquire_replslot(). Fix this by removing slotname from PgStat_StatReplSlotEntry. Instead we can get the slot's name from the slot itself. Besides fixing a bug, this also is architecturally cleaner (a name is not really statistics). This is safe because stats, for a slot removed while shut down, will not be restored at startup. In 15 the slotname is not removed, but renamed, to avoid changing the stats format. In master, bump PGSTAT_FILE_FORMAT_ID. This commit does not contain a test for the fix. I think this can only be tested by a tap test starting pg_recvlogical in the background and checking pg_recvlogical's output. That type of test is notoriously hard to be reliable, so committing it shortly before the release is wrapped seems like a bad idea. Reported-by: Jaime Casanova <jcasanov@systemguards.com.ec> Author: Andres Freund <andres@anarazel.de> Reviewed-by: Masahiko Sawada <sawada.mshk@gmail.com> Reviewed-by: Kyotaro Horiguchi <horikyota.ntt@gmail.com> Discussion: https://postgr.es/m/YxfagaTXUNa9ggLb@ahch-to Backpatch: 15-, where the bug was introduced in `5891c7a8ed`	2022-10-08 09:39:34 -07:00
Tom Lane	443df6e2db	Revert "Optimize order of GROUP BY keys". This reverts commit `db0d67db24` and several follow-on fixes. The idea of making a cost-based choice of the order of the sorting columns is not fundamentally unsound, but it requires cost information and data statistics that we don't really have. For example, relying on procost to distinguish the relative costs of different sort comparators is pretty pointless so long as most such comparator functions are labeled with cost 1.0. Moreover, estimating the number of comparisons done by Quicksort requires more than just an estimate of the number of distinct values in the input: you also need some idea of the sizes of the larger groups, if you want an estimate that's good to better than a factor of three or so. That's data that's often unknown or not very reliable. Worse, to arrive at estimates of the number of calls made to the lower-order-column comparison functions, the code needs to make estimates of the numbers of distinct values of multiple columns, which are necessarily even less trustworthy than per-column stats. Even if all the inputs are perfectly reliable, the cost algorithm as-implemented cannot offer useful information about how to order sorting columns beyond the point at which the average group size is estimated to drop to 1. Close inspection of the code added by `db0d67db2` shows that there are also multiple small bugs. These could have been fixed, but there's not much point if we don't trust the estimates to be accurate in-principle. Finally, the changes in cost_sort's behavior made for very large changes (often a factor of 2 or so) in the cost estimates for all sorting operations, not only those for multi-column GROUP BY. That naturally changes plan choices in many situations, and there's precious little evidence to show that the changes are for the better. Given the above doubts about whether the new estimates are really trustworthy, it's hard to summon much confidence that these changes are better on the average. Since we're hard up against the release deadline for v15, let's revert these changes for now. We can always try again later. Note: in v15, I left T_PathKeyInfo in place in nodes.h even though it's unreferenced. Removing it would be an ABI break, and it seems a bit late in the release cycle for that. Discussion: https://postgr.es/m/TYAPR01MB586665EB5FB2C3807E893941F5579@TYAPR01MB5866.jpnprd01.prod.outlook.com	2022-10-03 10:56:16 -04:00
Peter Geoghegan	1ef00afe2e	Make check_usermap() parameter names consistent. The function has a bool argument named "case_insensitive", but that was spelled "case_sensitive" in the declaration. Make them consistent now to avoid confusion in the future. Author: Peter Geoghegan <pg@bowt.ie> Reviewed-By: Michael Paquiër <michael@paquier.xyz> Discussion: https://postgr.es/m/CAH2-WznJt9CMM9KJTMjJh_zbL5hD9oX44qdJ4aqZtjFi-zA3Tg@mail.gmail.com Backpatch: 10-	2022-09-17 16:54:16 -07:00
Peter Eisentraut	7fe55d5e12	Expand palloc/pg_malloc API for more type safety This adds additional variants of palloc, pg_malloc, etc. that encapsulate common usage patterns and provide more type safety. Specifically, this adds palloc_object(), palloc_array(), and repalloc_array(), which take the type name of the object to be allocated as its first argument and cast the return as a pointer to that type. There are also palloc0_object() and palloc0_array() variants for initializing with zero, and pg_malloc_*() variants of all of the above. Inspired by the talloc library. This is backpatched from master so that future backpatchable code can make use of these APIs. This patch by itself does not contain any users of these APIs. Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us> Discussion: https://www.postgresql.org/message-id/flat/bb755632-2a43-d523-36f8-a1e7a389a907@enterprisedb.com	2022-09-14 06:04:24 +02:00
Michael Paquier	53332eacaf	Simplify handling of compression level with compression specifications PG_COMPRESSION_OPTION_LEVEL is removed from the compression specification logic, and instead the compression level is always assigned with each library's default if nothing is directly given. This centralizes the checks on the compression methods supported by a given build, and always assigns a default compression level when parsing a compression specification. This results in complaining at an earlier stage than previously if a build supports a compression method or not, aka when parsing a specification in the backend or the frontend, and not when processing it. zstd, lz4 and zlib are able to handle in their respective routines setting up the compression level the case of a default value, hence the backend or frontend code (pg_receivewal or pg_basebackup) has now no need to know what the default compression level should be if nothing is specified: the logic is now done so as the specification parsing assigns it. It can also be enforced by passing down a "level" set to the default value, that the backend will accept (the replication protocol is for example able to handle a command like BASE_BACKUP (COMPRESSION_DETAIL 'gzip:level=-1')). This code simplification fixes an issue with pg_basebackup --gzip introduced by `ffd5365`, where the tarball of the streamed WAL segments would be created as of pg_wal.tar.gz with uncompressed contents, while the intention is to compress the segments with gzip at a default level. The origin of the confusion comes from the handling of the default compression level of gzip (-1 or Z_DEFAULT_COMPRESSION) and the value of 0 was getting assigned, which is what walmethods.c would consider as equivalent to no compression when streaming WAL segments with its tar methods. Assigning always the compression level removes the confusion of some code paths considering a value of 0 set in a specification as either no compression or a default compression level. Note that 010_pg_basebackup.pl has to be adjusted to skip a few tests where the shape of the compression detail string for client and server-side compression was checked using gzip. This is a result of the code simplification, as gzip specifications cannot be used if a build does not support it. Reported-by: Tom Lane Reviewed-by: Tom Lane Discussion: https://postgr.es/m/1400032.1662217889@sss.pgh.pa.us Backpatch-through: 15	2022-09-14 12:17:03 +09:00
Thomas Munro	dd38ff28ad	Fix recovery_prefetch with low maintenance_io_concurrency. We should process completed IOs before trying to start more, so that it is always possible to decode one more record when the decoded record queue is empty, even if maintenance_io_concurrency is set so low that a single earlier WAL record might have saturated the IO queue. That bug was hidden because the effect of maintenance_io_concurrency was arbitrarily clamped to be at least 2. Fix the ordering, and also remove that clamp. We need a special case for 0, which is now treated the same as recovery_prefetch=off, but otherwise the number is used directly. This allows for testing with 1, which would have made the problem obvious in simple test scenarios. Also add an explicit error message for missing contrecords. It was a bit strange that we didn't report an error already, and became a latent bug with prefetching, since the internal state that tracks aborted contrecords would not survive retrying, as revealed by 026_overwrite_contrecord.pl with this adjustment. Reporting an error prevents that. Back-patch to 15. Reported-by: Justin Pryzby <pryzby@telsasoft.com> Reviewed-by: Kyotaro Horiguchi <horikyota.ntt@gmail.com> Discussion: https://postgr.es/m/20220831140128.GS31833%40telsasoft.com	2022-09-08 20:36:44 +12:00
Tom Lane	e25631265a	Fix new pg_publication_tables query. The addition of published column names forgot to filter on attisdropped, leading to cases where you could see "........pg.dropped.1........" or the like as a reportedly-published column. While we're here, rewrite the new subquery to get a more efficient plan for it. Hou Zhijie, per report from Jaime Casanova. Back-patch to v15 where the bug was introduced. (Sadly, this means we need a post-beta4 catversion bump before beta4 has even hit the streets. I see no good alternative though.) Discussion: https://postgr.es/m/Yxa1SU4nH2HfN3/i@ahch-to	2022-09-06 18:00:32 -04:00
Thomas Munro	d0d9344900	Fix cache invalidation bug in recovery_prefetch. XLogPageRead() can retry internally after a pread() system call has succeeded, in the case of short reads, and page validation failures while in standby mode (see commit `0668719801`). Due to an oversight in commit `3f1ce973`, these cases could leave stale data in the internal cache of xlogreader.c without marking it invalid. The main defense against stale cached data on failure to read a page was in the error handling path of the calling function ReadPageInternal(), but that wasn't quite enough for errors handled internally by XLogPageRead()'s retry loop if we then exited with XLREAD_WOULDBLOCK. 1. ReadPageInternal() now marks the cache invalid before calling the page_read callback, by setting state->readLen to 0. It'll be set to a non-zero value only after a successful read. It'll stay valid as long as the caller requests data in the cached range. 2. XLogPageRead() no long performs internal retries while reading ahead. While such retries should work, the general philosophy is that we should give up prefetching if anything unusual happens so we can handle it when recovery catches up, to reduce the complexity of the system. Let's do that here too. 3. While here, a new function XLogReaderResetError() improves the separation between xlogrecovery.c and xlogreader.c, where the former previously clobbered the latter's internal error buffer directly. The new function makes this more explicit, and also clears a related flag, without which a standby would needlessly retry in the outer function. Thanks to Noah Misch for tracking down the conditions required for a rare build farm failure in src/bin/pg_ctl/t/003_promote.pl, and providing a reproducer. Back-patch to 15. Reported-by: Noah Misch <noah@leadboat.com> Discussion: https://postgr.es/m/20220807003627.GA4168930%40rfd.leadboat.com	2022-09-03 13:29:36 +12:00
Andrew Dunstan	96ef3237bf	Revert SQL/JSON features The reverts the following and makes some associated cleanups: commit `f79b803dc`: Common SQL/JSON clauses commit `f4fb45d15`: SQL/JSON constructors commit `5f0adec25`: Make STRING an unreserved_keyword. commit `33a377608`: IS JSON predicate commit `1a36bc9db`: SQL/JSON query functions commit `606948b05`: SQL JSON functions commit `49082c2cc`: RETURNING clause for JSON() and JSON_SCALAR() commit `4e34747c8`: JSON_TABLE commit `fadb48b00`: PLAN clauses for JSON_TABLE commit `2ef6f11b0`: Reduce running time of jsonb_sqljson test commit `14d3f24fa`: Further improve jsonb_sqljson parallel test commit `a6baa4bad`: Documentation for SQL/JSON features commit `b46bcf7a4`: Improve readability of SQL/JSON documentation. commit `112fdb352`: Fix finalization for json_objectagg and friends commit `fcdb35c32`: Fix transformJsonBehavior commit `4cd8717af`: Improve a couple of sql/json error messages commit `f7a605f63`: Small cleanups in SQL/JSON code commit `9c3d25e17`: Fix JSON_OBJECTAGG uniquefying bug commit `a79153b7a`: Claim SQL standard compliance for SQL/JSON features commit `a1e7616d6`: Rework SQL/JSON documentation commit `8d9f9634e`: Fix errors in copyfuncs/equalfuncs support for JSON node types. commit `3c633f32b`: Only allow returning string types or bytea from json_serialize commit `67b26703b`: expression eval: Fix EEOP_JSON_CONSTRUCTOR and EEOP_JSONEXPR size. The release notes are also adjusted. Backpatch to release 15. Discussion: https://postgr.es/m/40d2c882-bcac-19a9-754d-4299e1d87ac7@postgresql.org	2022-09-01 17:10:42 -04:00
Andres Freund	045ec34785	pgstat: Acquire lock when reading variable-numbered stats Somewhere during the development of the patch acquiring a lock during read access to variable-numbered stats got lost. The missing lock acquisition won't cause corruption, but can lead to reading torn values when accessing stats. Add the missing lock acquisitions. Reported-by: Greg Stark <stark@mit.edu> Reviewed-by: "Drouvot, Bertrand" <bdrouvot@amazon.com> Reviewed-by: Andres Freund <andres@anarazel.de> Author: Kyotaro Horiguchi <horikyota.ntt@gmail.com> Discussion: https://postgr.es/m/CAM-w4HMYkM_DkYhWtUGV+qE_rrBxKOzOF0+5faozxO3vXrc9wA@mail.gmail.com Backpatch: 15-	2022-08-22 20:17:45 -07:00
Michael Paquier	522ae011fa	Allow event trigger table_rewrite for ALTER MATERIALIZED VIEW This event can happen when using SET ACCESS METHOD, as the data files of the materialized need a full refresh but this command tag was not updated to reflect that. The documentation is updated to track this behavior. Author: Onder Kalaci Discussion: https://postgr.es/m/CACawEhXwHN3X34FiwoYG8vXR-oyUdrp7qcfRWSzS+NPahS5gSw@mail.gmail.com Backpatch-through: 15	2022-08-17 14:55:24 +09:00
Amit Kapila	272248a0c1	Fix catalog lookup with the wrong snapshot during logical decoding. Previously, we relied on HEAP2_NEW_CID records and XACT_INVALIDATION records to know if the transaction has modified the catalog, and that information is not serialized to snapshot. Therefore, after the restart, if the logical decoding decodes only the commit record of the transaction that has actually modified a catalog, we will miss adding its XID to the snapshot. Thus, we will end up looking at catalogs with the wrong snapshot. To fix this problem, this changes the snapshot builder so that it remembers the last-running-xacts list of the decoded RUNNING_XACTS record after restoring the previously serialized snapshot. Then, we mark the transaction as containing catalog changes if it's in the list of initial running transactions and its commit record has XACT_XINFO_HAS_INVALS. To avoid ABI breakage, we store the array of the initial running transactions in the static variables InitialRunningXacts and NInitialRunningXacts, instead of storing those in SnapBuild or ReorderBuffer. This approach has a false positive; we could end up adding the transaction that didn't change catalog to the snapshot since we cannot distinguish whether the transaction has catalog changes only by checking the COMMIT record. It doesn't have the information on which (sub) transaction has catalog changes, and XACT_XINFO_HAS_INVALS doesn't necessarily indicate that the transaction has catalog change. But that won't be a problem since we use snapshot built during decoding only to read system catalogs. On the master branch, we took a more future-proof approach by writing catalog modifying transactions to the serialized snapshot which avoids the above false positive. But we cannot backpatch it because of a change in the SnapBuild. Reported-by: Mike Oh Author: Masahiko Sawada Reviewed-by: Amit Kapila, Shi yu, Takamichi Osumi, Kyotaro Horiguchi, Bertrand Drouvot, Ahsan Hadi Backpatch-through: 10 Discussion: https://postgr.es/m/81D0D8B0-E7C4-4999-B616-1E5004DBDCD2%40amazon.com	2022-08-11 09:56:49 +05:30
Robert Haas	f88798c098	Move basebackup code to new directory src/backend/backup Reviewed by David Steele and Justin Pryzby Discussion: http://postgr.es/m/CA+TgmoafqboATDSoXHz8VLrSwK_MDhjthK4hEpYjqf9_1Fmczw%40mail.gmail.com	2022-08-10 14:03:09 -04:00
Tom Lane	cc7e0feba5	In extensions, don't replace objects not belonging to the extension. Previously, if an extension script did CREATE OR REPLACE and there was an existing object not belonging to the extension, it would overwrite the object and adopt it into the extension. This is problematic, first because the overwrite is probably unintentional, and second because we didn't change the object's ownership. Thus a hostile user could create an object in advance of an expected CREATE EXTENSION command, and would then have ownership rights on an extension object, which could be modified for trojan-horse-type attacks. Hence, forbid CREATE OR REPLACE of an existing object unless it already belongs to the extension. (Note that we've always forbidden replacing an object that belongs to some other extension; only the behavior for previously-free-standing objects changes here.) For the same reason, also fail CREATE IF NOT EXISTS when there is an existing object that doesn't belong to the extension. Our thanks to Sven Klemm for reporting this problem. Security: CVE-2022-2625	2022-08-08 11:12:31 -04:00
Alvaro Herrera	e78fd90845	Fix ENABLE/DISABLE TRIGGER to handle recursion correctly Using ATSimpleRecursion() in ATPrepCmd() to do so as `bbb927b4db` did is not correct, because ATPrepCmd() can't distinguish between triggers that may be cloned and those that may not, so would wrongly try to recurse for the latter category of triggers. So this commit restores the code in EnableDisableTrigger() that `86f575948c` had added to do the recursion, which would do it only for triggers that may be cloned, that is, row-level triggers. This also changes tablecmds.c such that ATExecCmd() is able to pass the value of ONLY flag down to EnableDisableTrigger() using its new 'recurse' parameter. This also fixes what seems like an oversight of `86f575948c` that the recursion to partition triggers would only occur if EnableDisableTrigger() had actually changed the trigger. It is more apt to recurse to inspect partition triggers even if the parent's trigger didn't need to be changed: only then can we be certain that all descendants share the same state afterwards. Backpatch all the way back to 11, like `bbb927b4db`. Care is taken not to break ABI compatibility (and that no catversion bump is needed.) Co-authored-by: Amit Langote <amitlangote09@gmail.com> Reviewed-by: Dmitry Koval <d.koval@postgrespro.ru> Discussion: https://postgr.es/m/CA+HiwqG-cZT3XzGAnEgZQLoQbyfJApVwOTQaCaas1mhpf+4V5A@mail.gmail.com	2022-08-05 09:47:15 +02:00
Tom Lane	14168d57bb	Fix incorrect tests for SRFs in relation_can_be_sorted_early(). Commit `fac1b470a` thought we could check for set-returning functions by testing only the top-level node in an expression tree. This is wrong in itself, and to make matters worse it encouraged others to make the same mistake, by exporting tlist.c's special-purpose IS_SRF_CALL() as a widely-visible macro. I can't find any evidence that anyone's taken the bait, but it was only a matter of time. Use expression_returns_set() instead, and stuff the IS_SRF_CALL() genie back in its bottle, this time with a warning label. I also added a couple of cross-reference comments. After a fair amount of fooling around, I've despaired of making a robust test case that exposes the bug reliably, so no test case here. (Note that the test case added by `fac1b470a` is itself broken, in that it doesn't notice if you remove the code change. The repro given by the bug submitter currently doesn't fail either in v15 or HEAD, though I suspect that may indicate an unrelated bug.) Per bug #17564 from Martijn van Oosterhout. Back-patch to v13, as the faulty patch was. Discussion: https://postgr.es/m/17564-c7472c2f90ef2da3@postgresql.org	2022-08-03 17:33:42 -04:00
Tom Lane	a0c632c1de	Force immediate commit after CREATE DATABASE etc in extended protocol. We have a few commands that "can't run in a transaction block", meaning that if they complete their processing but then we fail to COMMIT, we'll be left with inconsistent on-disk state. However, the existing defenses for this are only watertight for simple query protocol. In extended protocol, we didn't commit until receiving a Sync message. Since the client is allowed to issue another command instead of Sync, we're in trouble if that command fails or is an explicit ROLLBACK. In any case, sitting in an inconsistent state while waiting for a client message that might not come seems pretty risky. This case wasn't reachable via libpq before we introduced pipeline mode, but it's always been an intended aspect of extended query protocol, and likely there are other clients that could reach it before. To fix, set a flag in PreventInTransactionBlock that tells exec_execute_message to force an immediate commit. This seems to be the approach that does least damage to existing working cases while still preventing the undesirable outcomes. While here, add some documentation to protocol.sgml that explicitly says how to use pipelining. That's latent in the existing docs if you know what to look for, but it's better to spell it out; and it provides a place to document this new behavior. Per bug #17434 from Yugo Nagata. It's been wrong for ages, so back-patch to all supported branches. Discussion: https://postgr.es/m/17434-d9f7a064ce2a88a3@postgresql.org	2022-07-26 13:07:03 -04:00
Tom Lane	31ed3cf746	Process session_preload_libraries within InitPostgres's transaction. Previously we did this after InitPostgres, at a somewhat randomly chosen place within PostgresMain. However, since commit `a0ffa885e` doing this outside a transaction can cause a crash, if we need to check permissions while replacing a placeholder GUC. (Besides which, a preloaded library could itself want to do database access within _PG_init.) To avoid needing an additional transaction start/end in every session, move the process_session_preload_libraries call to within InitPostgres's transaction. That requires teaching the code not to call it when InitPostgres is called from somewhere other than PostgresMain, since we don't want session_preload_libraries to affect background workers. The most future-proof solution here seems to be to add an additional flag parameter to InitPostgres; fortunately, we're not yet very worried about API stability for v15. Doing this also exposed the fact that we're currently honoring session_preload_libraries in walsenders, even those not connected to any database. This seems, at minimum, a POLA violation: walsenders are not interactive sessions. Let's stop doing that. (All these comments also apply to local_preload_libraries, of course.) Per report from Gurjeet Singh (thanks also to Nathan Bossart and Kyotaro Horiguchi for review). Backpatch to v15 where `a0ffa885e` came in. Discussion: https://postgr.es/m/CABwTF4VEpwTHhRQ+q5MiC5ucngN-whN-PdcKeufX7eLSoAfbZA@mail.gmail.com	2022-07-25 10:27:43 -04:00
Tom Lane	a2944d8724	Fix missed corner cases for grantable permissions on GUCs. We allow users to set the values of not-yet-loaded extension GUCs, remembering those values in "placeholder" GUC entries. When/if the extension is loaded later in the session, we need to verify that the user had permissions to set the GUC. That was done correctly before commit `a0ffa885e`, but as of that commit, we'd check the permissions of the active role when the LOAD happens, not the role that had set the value. (This'd be a security bug if it had made it into a released version.) In principle this is simple enough to fix: we just need to remember the exact role OID that set each GUC value, and use that not GetUserID() when verifying permissions. Maintaining that data in the guc.c data structures is slightly tedious, but fortunately it's all basically just copy-n-paste of the logic for tracking the GucSource of each setting, as we were already doing. Another oversight is that validate_option_array_item() hadn't been taught to check for granted GUC privileges. This appears to manifest only in that ALTER ROLE/DATABASE RESET ALL will fail to reset settings that the user should be allowed to reset. Patch by myself and Nathan Bossart, per report from Nathan Bossart. Back-patch to v15 where the faulty code came in. Discussion: https://postgr.es/m/20220706224727.GA2158260@nathanxps13	2022-07-19 17:22:31 -04:00
Andres Freund	92f478657c	windows: msvc: Define STDIN/OUT/ERR_FILENO. Because they are not available we've used _fileno(stdin) in some places, but that doesn't reliably work, because stdin might be closed. This is the prerequisite of the subsequent commit, fixing a failure introduced in `76e38b37a5`. Author: Andres Freund <andres@anarazel.de> Reported-By: Sandeep Thakkar <sandeep.thakkar@enterprisedb.com> Message-Id: 20220520164558.ozb7lm6unakqzezi@alap3.anarazel.de (on pgsql-packagers) Backpatch: 15-, where `76e38b37a5` came in	2022-07-18 17:22:35 -07:00
David Rowley	8c297dd5f4	Fix inconsistent parameter names between prototype and declaration Noticed while working in this area. This code was introduced in PG15, which is still in beta, so backpatch to there for consistency. Backpatch-through: 15	2022-07-15 15:27:14 +12:00

1 2 3 4 5 ...

10489 commits