postgresql

mirror of https://github.com/postgres/postgres.git synced 2026-06-23 15:39:12 -04:00

Author	SHA1	Message	Date
Alvaro Herrera	ff0d8f27f4	Remove redundant declaration for XidInMVCCSnapshot This was added for no good reason by `c91560defc`, after `b7eda3e0e3` had just moved the prototype from utils/tqual.h to utils/snapmgr.h. Author: Japin Li <japinli@hotmail.com> Discussion: https://postgr.es/m/MEYP282MB16693A409F3282A9DB287BADB63E9@MEYP282MB1669.AUSP282.PROD.OUTLOOK.COM	2022-11-09 18:30:09 +01:00
Alvaro Herrera	9c0de04242	Reduce xlog.h inclusion footprint This file needs xlogreader.h only for the XLogReaderState typedef; but we can dodge that by forward-declaring it. Many files use xlog.h for reasons other than reading WAL, and it's not good to force all those files to include xlogreader.h, so take it out. Surprisingly, there is no fallout in core code from making this change. Perhaps external code will have to start including xlogreader.h.	2022-10-12 09:47:11 +02:00
Thomas Munro	b6d8a60aba	Restore pg_pread and friends. Commits `cf112c12` and `a0dc8271` were a little too hasty in getting rid of the pg_ prefixes where we use pread(), pwrite() and vectored variants. We dropped support for ancient Unixes where we needed to use lseek() to implement replacements for those, but it turns out that Windows also changes the current position even when you pass in an offset to ReadFile() and WriteFile() if the file handle is synchronous, despite its documentation saying otherwise. Switching to asynchronous file handles would fix that, but have other complications. For now let's just put back the pg_ prefix and add some comments to highlight the non-standard side-effect, which we can now describe as Windows-only. Reported-by: Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com> Reviewed-by: Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com> Discussion: https://postgr.es/m/20220923202439.GA1156054%40nathanxps13	2022-09-29 13:12:11 +13:00
Robert Haas	a448e49bcb	Revert 56-bit relfilenode change and follow-up commits. There are still some alignment-related failures in the buildfarm, which might or might not be able to be fixed quickly, but I've also just realized that it increased the size of many WAL records by 4 bytes because a block reference contains a RelFileLocator. The effect of that hasn't been studied or discussed, so revert for now.	2022-09-28 09:55:28 -04:00
Peter Eisentraut	c8b2ef05f4	Convert GetDatum() and DatumGet() macros to inline functions The previous macro implementations just cast the argument to a target type but did not check whether the input type was appropriate. The function implementation can do better type checking of the input type. For the *GetDatumFast() macros, converting to an inline function doesn't work in the !USE_FLOAT8_BYVAL case, but we can use AssertVariableIsOfTypeMacro() to get a similar level of type checking. Reviewed-by: Aleksander Alekseev <aleksander@timescale.com> Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us> Discussion: https://www.postgresql.org/message-id/flat/8528fb7e-0aa2-6b54-85fb-0c0886dbd6ed%40enterprisedb.com	2022-09-27 20:50:21 +02:00
Robert Haas	05d4cbf9b6	Increase width of RelFileNumbers from 32 bits to 56 bits. RelFileNumbers are now assigned using a separate counter, instead of being assigned from the OID counter. This counter never wraps around: if all 2^56 possible RelFileNumbers are used, an internal error occurs. As the cluster is limited to 2^64 total bytes of WAL, this limitation should not cause a problem in practice. If the counter were 64 bits wide rather than 56 bits wide, we would need to increase the width of the BufferTag, which might adversely impact buffer lookup performance. Also, this lets us use bigint for pg_class.relfilenode and other places where these values are exposed at the SQL level without worrying about overflow. This should remove the need to keep "tombstone" files around until the next checkpoint when relations are removed. We do that to keep RelFileNumbers from being recycled, but now that won't happen anyway. However, this patch doesn't actually change anything in this area; it just makes it possible for a future patch to do so. Dilip Kumar, based on an idea from Andres Freund, who also reviewed some earlier versions of the patch. Further review and some wordsmithing by me. Also reviewed at various points by Ashutosh Sharma, Vignesh C, Amul Sul, Álvaro Herrera, and Tom Lane. Discussion: http://postgr.es/m/CA+Tgmobp7+7kmi4gkq7Y+4AM9fTvL+O1oQ4-5gFTT+6Ng-dQ=g@mail.gmail.com	2022-09-27 13:25:21 -04:00
Michael Paquier	78fdb1e50f	Mark ParallelMessagePending as sig_atomic_t ParallelMessagePending was previously marked as a boolean which should be fine on modern platforms, but the C standard recommends the use of sig_atomic_t for variables manipulated in signal handlers. Author: Hayato Kuroda Discussion: https://postgr.es/m/TYAPR01MB58667C15A95A234720F4F876F5529@TYAPR01MB5866.jpnprd01.prod.outlook.com	2022-09-27 09:29:56 +09:00
Michael Paquier	e1e6f8f3df	Remove dependency to StringInfo in xlogbackup.{c.h} This was used as the returned result type of the generated contents for the backup_label and backup history files. This is replaced by a simple string, reducing the cleanup burden of all the callers of build_backup_content(). Reviewed-by: Bharath Rupireddy Discussion: https://postgr.es/m/YzERvNPaZivHEKZJ@paquier.xyz	2022-09-27 09:15:07 +09:00
Michael Paquier	7d708093b7	Refactor creation of backup_label and backup history files This change simplifies some of the logic related to the generation and creation of the backup_label and backup history files, which has become unnecessarily complicated since the removal of the exclusive backup mode in commit `39969e2`. The code was previously generating the contents of these files as a string (start phase for the backup_label and stop phase for the backup history file), one problem being that the contents of the backup_label string were scanned to grab some of its internal contents at the stop phase. This commit changes the logic so as we store the data required to build these files in an intermediate structure named BackupState. The backup_label file and backup history file strings are generated when they are ready to be sent back to the client. Both files are now generated with the same code path. While on it, this commit renames some variables for clarity. Two new files named xlogbackup.{c,h} are introduced in this commit, to remove from xlog.c some of the logic around base backups. Note that more could be moved to this new set of files. Author: Bharath Rupireddy, Michael Paquier Reviewed-by: Fujii Masao Discussion: https://postgr.es/m/CALj2ACXWwTDgJqCjdaPyfR7djwm6SrybGcrZyrvojzcsmt4FFw@mail.gmail.com	2022-09-26 11:15:47 +09:00
Peter Geoghegan	bfcf1b3480	Harmonize parameter names in storage and AM code. Make sure that function declarations use names that exactly match the corresponding names from function definitions in storage, catalog, access method, executor, and logical replication code, as well as in miscellaneous utility/library code. Like other recent commits that cleaned up function parameter names, this commit was written with help from clang-tidy. Later commits will do the same for other parts of the codebase. Author: Peter Geoghegan <pg@bowt.ie> Reviewed-By: David Rowley <dgrowleyml@gmail.com> Discussion: https://postgr.es/m/CAH2-WznJt9CMM9KJTMjJh_zbL5hD9oX44qdJ4aqZtjFi-zA3Tg@mail.gmail.com	2022-09-19 19:18:36 -07:00
Peter Geoghegan	4bac9600f0	Harmonize heapam and tableam parameter names. Make sure that function declarations use names that exactly match the corresponding names from function definitions. Having parameter names that are reliably consistent in this way will make it easier to reason about groups of related C functions from the same translation unit as a module. It will also make certain refactoring tasks easier. Like other recent commits that cleaned up function parameter names, this commit was written with help from clang-tidy. Later commits will do the same for other parts of the codebase. Author: Peter Geoghegan <pg@bowt.ie> Reviewed-By: David Rowley <dgrowleyml@gmail.com> Discussion: https://postgr.es/m/CAH2-WznJt9CMM9KJTMjJh_zbL5hD9oX44qdJ4aqZtjFi-zA3Tg@mail.gmail.com	2022-09-19 16:46:23 -07:00
Tom Lane	0a20ff54f5	Split up guc.c for better build speed and ease of maintenance. guc.c has grown to be one of our largest .c files, making it a bottleneck for compilation. It's also acquired a bunch of knowledge that'd be better kept elsewhere, because of our not very good habit of putting variable-specific check hooks here. Hence, split it up along these lines: * guc.c itself retains just the core GUC housekeeping mechanisms. * New file guc_funcs.c contains the SET/SHOW interfaces and some SQL-accessible functions for GUC manipulation. * New file guc_tables.c contains the data arrays that define the built-in GUC variables, along with some already-exported constant tables. * GUC check/assign/show hook functions are moved to the variable's home module, whenever that's clearly identifiable. A few hard- to-classify hooks ended up in commands/variable.c, which was already a home for miscellaneous GUC hook functions. To avoid cluttering a lot more header files with #include "guc.h", I also invented a new header file utils/guc_hooks.h and put all the GUC hook functions' declarations there, regardless of their originating module. That allowed removal of #include "guc.h" from some existing headers. The fallout from that (hopefully all caught here) demonstrates clearly why such inclusions are best minimized: there are a lot of files that, for example, were getting array.h at two or more levels of remove, despite not having any connection at all to GUCs in themselves. There is some very minor code beautification here, such as renaming a couple of inconsistently-named hook functions and improving some comments. But mostly this just moves code from point A to point B and deals with the ensuing needs for #include adjustments and exporting a few functions that previously weren't exported. Patch by me, per a suggestion from Andres Freund; thanks also to Michael Paquier for the idea to invent guc_funcs.c. Discussion: https://postgr.es/m/587607.1662836699@sss.pgh.pa.us	2022-09-13 11:11:45 -04:00
Peter Eisentraut	e8d78581bb	Revert "Convert GetDatum() and DatumGet() macros to inline functions" This reverts commit `595836e99b`. It has problems when USE_FLOAT8_BYVAL is off.	2022-09-12 19:57:07 +02:00
Peter Eisentraut	595836e99b	Convert GetDatum() and DatumGet() macros to inline functions The previous macro implementations just cast the argument to a target type but did not check whether the input type was appropriate. The function implementation can do better type checking of the input type. Reviewed-by: Aleksander Alekseev <aleksander@timescale.com> Discussion: https://www.postgresql.org/message-id/flat/8528fb7e-0aa2-6b54-85fb-0c0886dbd6ed%40enterprisedb.com	2022-09-12 17:36:26 +02:00
Thomas Munro	adb466150b	Fix recovery_prefetch with low maintenance_io_concurrency. We should process completed IOs before trying to start more, so that it is always possible to decode one more record when the decoded record queue is empty, even if maintenance_io_concurrency is set so low that a single earlier WAL record might have saturated the IO queue. That bug was hidden because the effect of maintenance_io_concurrency was arbitrarily clamped to be at least 2. Fix the ordering, and also remove that clamp. We need a special case for 0, which is now treated the same as recovery_prefetch=off, but otherwise the number is used directly. This allows for testing with 1, which would have made the problem obvious in simple test scenarios. Also add an explicit error message for missing contrecords. It was a bit strange that we didn't report an error already, and became a latent bug with prefetching, since the internal state that tracks aborted contrecords would not survive retrying, as revealed by 026_overwrite_contrecord.pl with this adjustment. Reporting an error prevents that. Back-patch to 15. Reported-by: Justin Pryzby <pryzby@telsasoft.com> Reviewed-by: Kyotaro Horiguchi <horikyota.ntt@gmail.com> Discussion: https://postgr.es/m/20220831140128.GS31833%40telsasoft.com	2022-09-08 21:44:55 +12:00
Thomas Munro	932b016300	Fix cache invalidation bug in recovery_prefetch. XLogPageRead() can retry internally after a pread() system call has succeeded, in the case of short reads, and page validation failures while in standby mode (see commit `0668719801`). Due to an oversight in commit `3f1ce973`, these cases could leave stale data in the internal cache of xlogreader.c without marking it invalid. The main defense against stale cached data on failure to read a page was in the error handling path of the calling function ReadPageInternal(), but that wasn't quite enough for errors handled internally by XLogPageRead()'s retry loop if we then exited with XLREAD_WOULDBLOCK. 1. ReadPageInternal() now marks the cache invalid before calling the page_read callback, by setting state->readLen to 0. It'll be set to a non-zero value only after a successful read. It'll stay valid as long as the caller requests data in the cached range. 2. XLogPageRead() no long performs internal retries while reading ahead. While such retries should work, the general philosophy is that we should give up prefetching if anything unusual happens so we can handle it when recovery catches up, to reduce the complexity of the system. Let's do that here too. 3. While here, a new function XLogReaderResetError() improves the separation between xlogrecovery.c and xlogreader.c, where the former previously clobbered the latter's internal error buffer directly. The new function makes this more explicit, and also clears a related flag, without which a standby would needlessly retry in the outer function. Thanks to Noah Misch for tracking down the conditions required for a rare build farm failure in src/bin/pg_ctl/t/003_promote.pl, and providing a reproducer. Back-patch to 15. Reported-by: Noah Misch <noah@leadboat.com> Discussion: https://postgr.es/m/20220807003627.GA4168930%40rfd.leadboat.com	2022-09-03 13:28:43 +12:00
Amit Kapila	c98b6acdb2	Update the comment in rmgrlist.h to match it to the code. Author: Hayato Kuroda Reviwed-by: Amit Kapila Discussion: https://postgr.es/m/TYAPR01MB58665F20F412EDF27B0759CFF5769@TYAPR01MB5866.jpnprd01.prod.outlook.com	2022-08-30 09:16:41 +05:30
Peter Geoghegan	b2fe783aec	Add missing parenthesis to max item size macro. Oversight in commit `92f37505`, per buildfarm.	2022-08-05 13:06:19 -07:00
Thomas Munro	d2e150831a	Remove configure probe for fdatasync. fdatasync() is in SUSv2, and all targeted Unix systems have it. We have a replacement function for Windows. We retain the probe for the function declaration, which allows us to supply the mysteriously missing declaration for macOS, and also for Windows. No need to keep a HAVE_FDATASYNC macro around. Also rename src/port/fdatasync.c to win32fdatasync.c since it's only for Windows. Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us> Reviewed-by: Andres Freund <andres@anarazel.de> Discussion: https://postgr.es/m/CA+hUKGJ3LHeP9w5Fgzdr4G8AnEtJ=z=p6hGDEm4qYGEUX5B6fQ@mail.gmail.com Discussion: https://postgr.es/m/CA%2BhUKGJZJVO%3DiX%2Beb-PXi2_XS9ZRqnn_4URh0NUQOwt6-_51xQ%40mail.gmail.com	2022-08-05 16:37:38 +12:00
Peter Geoghegan	92f375056c	Fix nbtree maximum item size macro. Commit `dd299df818`, which made heap TID a tiebreaker nbtree index column, introduced new rules on page space management to make suffix truncation safe for v4+ indexes. New pivot tuples (generated by suffix truncation during leaf page splits) sometimes require dedicated extra space at the end of a new leaf page high key/pivot to store a heap TID using a special representation (a representation only used in pivot tuples). The definition of "1/3 of a page" was reduced by a single MAXALIGN() quantum for v4 indexes to make sure that the final enlarged pivot tuple always fit, even with a split point whose firstright tuple happened to already be at the "1/3 of a page" limit (limit for non-pivot tuples). Internal pages (which only contain pivot tuples) stuck with the original "1/3 of a page" definition. This scheme made it impossible for any page split to fail to free enough space for its newitem, which is never okay. The macro that determines whether non-pivot tuples exceed their "1/3 of a leaf page" restriction was structured as if space was needed for all three tuples during a leaf page split (the new pivot plus two very large adjoining non-pivots that are separated by the split). This was subtly wrong, in that it accidentally relied on implementation details that could (at least in theory) change in the future. To fix, make the macro subtract a single MAXALIGN() quantum, once. The macro evaluates to exactly the same value as before in practice. But it no longer depends on the current layout of nbtree's special area struct. No backpatch, since this isn't a live bug. Author: Peter Geoghegan <pg@bowt.ie> Reported-By: Robert Haas <robertmhaas@gmail.com> Diagnosed-By: Robert Haas <robertmhaas@gmail.com> Discussion: https://postgr.es/m/CA+Tgmoa7UBxivM7f6Ocx_qbq4=ky3uXc+WZNOBcVX+kvJvWOEA@mail.gmail.com	2022-08-04 20:55:02 -07:00
Thomas Munro	cf112c1220	Remove dead pread and pwrite replacement code. pread() and pwrite() are in SUSv2, and all targeted Unix systems have them. Previously, we defined pg_pread and pg_pwrite to emulate these function with lseek() on old Unixen. The names with a pg_ prefix were a reminder of a portability hazard: they might change the current file position. That hazard is gone, so we can drop the prefixes. Since the remaining replacement code is Windows-only, move it into src/port/win32p{read,write}.c, and move the declarations into src/include/port/win32_port.h. No need for vestigial HAVE_PREAD, HAVE_PWRITE macros as they were only used for declarations in port.h which have now moved into win32_port.h. Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us> Reviewed-by: Greg Stark <stark@mit.edu> Reviewed-by: Robert Haas <robertmhaas@gmail.com> Reviewed-by: Andres Freund <andres@anarazel.de> Discussion: https://postgr.es/m/CA+hUKGJ3LHeP9w5Fgzdr4G8AnEtJ=z=p6hGDEm4qYGEUX5B6fQ@mail.gmail.com	2022-08-05 09:49:21 +12:00
Michael Paquier	245e14e28b	Fix inconsistent comments for some function declarations in headers Some of the headers list a couple of function prototypes located in a different file than what is referred to. This fixes a couple of places where this inconsistency exists. Author: Richard Guo Discussion: https://postgr.es/m/CAMbWs4__RdcSNXPa7L62Ozvo_Q4LvT60o3Bnp8yrQ_m9y5CKvg@mail.gmail.com	2022-08-04 17:36:21 +09:00
Michael Paquier	ffd1b6bb6f	Add overflow protection for block-related data in WAL records XLogRecordBlockHeader, the header holding the information for the data related to a block, tracks the length of the data appended to the WAL record with data_length (uint16). This limitation in size was not enforced by the public routine in charge of registering the data assembled later to form the WAL record inserted, XLogRegisterBufData(). Incorrectly used, it could lead to the generation of records with some of its data overflowed. This commit adds some safeguards to prevent that for the block data, complaining immediately if attempting to add to a record block information with a size larger than UINT16_MAX, which is the limit implied by the internal logic. Note that this also adjusts XLogRegisterData() and XLogRegisterBufData() so as the length of the WAL record data given by the caller is unsigned, matching with what gets stored in XLogRecData->len. Extracted from a larger patch by the same author. The original patch includes more protections when assembling a record in full that will be looked at separately later. Author: Matthias van de Meent Reviewed-by: Andres Freund, Heikki Linnakangas, Michael Paquier, David Zhang Discussion: https://postgr.es/m/CAEze2WgGiw+LZt+vHf8tWqB_6VxeLsMeoAuod0N=ij1q17n5pw@mail.gmail.com	2022-07-27 13:35:40 +09:00
Tom Lane	f92944137c	Force immediate commit after CREATE DATABASE etc in extended protocol. We have a few commands that "can't run in a transaction block", meaning that if they complete their processing but then we fail to COMMIT, we'll be left with inconsistent on-disk state. However, the existing defenses for this are only watertight for simple query protocol. In extended protocol, we didn't commit until receiving a Sync message. Since the client is allowed to issue another command instead of Sync, we're in trouble if that command fails or is an explicit ROLLBACK. In any case, sitting in an inconsistent state while waiting for a client message that might not come seems pretty risky. This case wasn't reachable via libpq before we introduced pipeline mode, but it's always been an intended aspect of extended query protocol, and likely there are other clients that could reach it before. To fix, set a flag in PreventInTransactionBlock that tells exec_execute_message to force an immediate commit. This seems to be the approach that does least damage to existing working cases while still preventing the undesirable outcomes. While here, add some documentation to protocol.sgml that explicitly says how to use pipelining. That's latent in the existing docs if you know what to look for, but it's better to spell it out; and it provides a place to document this new behavior. Per bug #17434 from Yugo Nagata. It's been wrong for ages, so back-patch to all supported branches. Discussion: https://postgr.es/m/17434-d9f7a064ce2a88a3@postgresql.org	2022-07-26 13:07:03 -04:00
Thomas Munro	a1b56090eb	Remove O_FSYNC and associated macros. O_FSYNC was a pre-POSIX way of spelling O_SYNC, supported since commit `9d645fd84c` for non-conforming operating systems of the time. It's not needed on any modern system. We can just use standard O_SYNC directly if it exists (= all targeted systems except Windows), and get rid of our OPEN_SYNC_FLAG macro. Similarly for standard O_DSYNC, we can just use that directly if it exists (= all targeted systems except DragonFlyBSD), and get rid of our OPEN_DATASYNC_FLAG macro. We still avoid choosing open_datasync as a default value for wal_sync_method if O_DSYNC has the same value as O_SYNC (= only OpenBSD), so there is no change in default behavior. Discussion: https://postgr.es/m/CA%2BhUKGJE7y92NY7FG2ftUbZUaqohBU65_Ys_7xF5mUHo4wirTQ%40mail.gmail.com	2022-07-22 12:41:17 +12:00
Peter Eisentraut	14a8bd9827	Convert macros to static inline functions (itup.h) Reviewed-by: Amul Sul <sulamul@gmail.com> Discussion: https://www.postgresql.org/message-id/flat/5b558da8-99fb-0a99-83dd-f72f05388517%40enterprisedb.com	2022-07-19 07:24:55 +02:00
Peter Eisentraut	f58d7073b7	Convert macros to static inline functions (tupmacs.h) Reviewed-by: Amul Sul <sulamul@gmail.com> Discussion: https://www.postgresql.org/message-id/flat/5b558da8-99fb-0a99-83dd-f72f05388517%40enterprisedb.com	2022-07-18 08:01:27 +02:00
Peter Eisentraut	507ba16b28	Convert macros to static inline functions (xlog_internal.h) Reviewed-by: Amul Sul <sulamul@gmail.com> Discussion: https://www.postgresql.org/message-id/flat/5b558da8-99fb-0a99-83dd-f72f05388517%40enterprisedb.com	2022-07-15 14:56:00 +02:00
David Rowley	0229106afa	Overload index_form_tuple to allow the memory context to be supplied `40af10b57` changed things so we make use of a generation memory context for storing tuples to be sorted by tuplesort.c. That change does not play nicely with the changes made in `9f03ca915` (back in 2014). That commit changed things so that index_form_tuple() is called while switched into the tuplestore's tuplecontext. In order to fetch the tuple from the index, index_form_tuple() must do various memory allocations which are unrelated to the storage of the final returned tuple. Although all of these allocations are pfree'd, the fact that we now use a generation context means that the memory for these pfree'd allocations won't be used again by any other allocation due to generation.c's lack of freelists. This could result in sorts used for building indexes exceeding maintenance_work_mem by a very large amount. Here we fix it so we no longer allocate anything apart from the tuple itself into the generation context by adding a new version of index_form_tuple named index_form_tuple_context, which can be called to specify the MemoryContext to allocate the tuple into. Discussion: https://postgr.es/m/CAApHDvrHQkiFRHiGiAS-LMOvJN-eK-s762=tVzBz8ZqUea-a_A@mail.gmail.com Backpatch-through: 15, where `40af10b57` was added.	2022-07-07 08:14:00 +12:00
Robert Haas	b0a55e4329	Change internal RelFileNode references to RelFileNumber or RelFileLocator. We have been using the term RelFileNode to refer to either (1) the integer that is used to name the sequence of files for a certain relation within the directory set aside for that tablespace/database combination; or (2) that value plus the OIDs of the tablespace and database; or occasionally (3) the whole series of files created for a relation based on those values. Using the same name for more than one thing is confusing. Replace RelFileNode with RelFileNumber when we're talking about just the single number, i.e. (1) from above, and with RelFileLocator when we're talking about all the things that are needed to locate a relation's files on disk, i.e. (2) from above. In the places where we refer to (3) as a relfilenode, instead refer to "relation storage". Since there is a ton of SQL code in the world that knows about pg_class.relfilenode, don't change the name of that column, or of other SQL-facing things that derive their name from it. On the other hand, do adjust closely-related internal terminology. For example, the structure member names dbNode and spcNode appear to be derived from the fact that the structure itself was called RelFileNode, so change those to dbOid and spcOid. Likewise, various variables with names like rnode and relnode get renamed appropriately, according to how they're being used in context. Hopefully, this is clearer than before. It is also preparation for future patches that intend to widen the relfilenumber fields from its current width of 32 bits. Variables that store a relfilenumber are now declared as type RelFileNumber rather than type Oid; right now, these are the same, but that can now more easily be changed. Dilip Kumar, per an idea from me. Reviewed also by Andres Freund. I fixed some whitespace issues, changed a couple of words in a comment, and made one other minor correction. Discussion: http://postgr.es/m/CA+TgmoamOtXbVAQf9hWFzonUo6bhhjS6toZQd7HZ-pmojtAmag@mail.gmail.com Discussion: http://postgr.es/m/CA+Tgmobp7+7kmi4gkq7Y+4AM9fTvL+O1oQ4-5gFTT+6Ng-dQ=g@mail.gmail.com Discussion: http://postgr.es/m/CAFiTN-vTe79M8uDH1yprOU64MNFE+R3ODRuA+JWf27JbhY4hJw@mail.gmail.com	2022-07-06 11:39:09 -04:00
Heikki Linnakangas	adf6d5dfb2	Fix visibility check when XID is committed in CLOG but not in procarray. TransactionIdIsInProgress had a fast path to return 'false' if the single-item CLOG cache said that the transaction was known to be committed. However, that was wrong, because a transaction is first marked as committed in the CLOG but doesn't become visible to others until it has removed its XID from the proc array. That could lead to an error: ERROR: t_xmin is uncommitted in tuple to be updated or for an UPDATE to go ahead without blocking, before the previous UPDATE on the same row was made visible. The window is usually very short, but synchronous replication makes it much wider, because the wait for synchronous replica happens in that window. Another thing that makes it hard to hit is that it's hard to get such a commit-in-progress transaction into the single item CLOG cache. Normally, if you call TransactionIdIsInProgress on such a transaction, it determines that the XID is in progress without checking the CLOG and without populating the cache. One way to prime the cache is to explicitly call pg_xact_status() on the XID. Another way is to use a lot of subtransactions, so that the subxid cache in the proc array is overflown, making TransactionIdIsInProgress rely on pg_subtrans and CLOG checks. This has been broken ever since it was introduced in 2008, but the race condition is very hard to hit, especially without synchronous replication. There were a couple of reports of the error starting from summer 2021, but no one was able to find the root cause then. TransactionIdIsKnownCompleted() is now unused. In 'master', remove it, but I left it in place in backbranches in case it's used by extensions. Also change pg_xact_status() to check TransactionIdIsInProgress(). Previously, it only checked the CLOG, and returned "committed" before the transaction was actually made visible to other queries. Note that this also means that you cannot use pg_xact_status() to reproduce the bug anymore, even if the code wasn't fixed. Report and analysis by Konstantin Knizhnik. Patch by Simon Riggs, with the pg_xact_status() change added by me. Author: Simon Riggs Reviewed-by: Andres Freund Discussion: https://www.postgresql.org/message-id/flat/4da7913d-398c-e2ad-d777-f752cf7f0bbb%40garret.ru	2022-06-27 08:21:08 +03:00
Tomas Vondra	e3fcca0d0d	Revert changes in HOT handling of BRIN indexes This reverts commits `5753d4ee32` and `fe60b67250` that modified HOT to ignore BRIN indexes. The commit message for `5753d4ee32` claims that: When determining whether an index update may be skipped by using HOT, we can ignore attributes indexed only by BRIN indexes. There are no index pointers to individual tuples in BRIN, and the page range summary will be updated anyway as it relies on visibility info. This is partially incorrect - it's true BRIN indexes don't point to individual tuples, so HOT chains are not an issue, but the visibitlity info is not sufficient to keep the index up to date. This can easily result in corrupted indexes, as demonstrated in the hackers thread. This does not mean relaxing the HOT restrictions for BRIN is a lost cause, but it needs to handle the two aspects (allowing HOT chains and updating the page range summaries) as separate. But that requires a major changes, and it's too late for that in the current dev cycle. Reported-by: Tomas Vondra Discussion: https://postgr.es/m/05ebcb44-f383-86e3-4f31-0a97a55634cf@enterprisedb.com	2022-06-16 15:02:49 +02:00
Tom Lane	e19272ef60	Remove unused-and-misspelled function extern declaration. Commit `c65507763` added "extern XLogRecPtr CalculateMaxmumSafeLSN(void)", which bears no trace of connection to anything else in that patch or anywhere else. Remove it again. Sergei Kornilov (also spotted by Bharath Rupireddy) Discussion: https://postgr.es/m/706501646056870@vla3-6a5326aeb4ee.qloud-c.yandex.net Discussion: https://postgr.es/m/CALj2ACVoQ7NEf43Xz0rfxsGOKYTN5r4VZp2DO2_5p+CMzsRPFw@mail.gmail.com	2022-05-21 13:26:08 -04:00
Andres Freund	905c020bef	Add missing 'extern' to function prototypes. Postgres style is to spell out extern. Noticed while scripting adding PGDLLIMPORT markers to functions. Discussion: https://postgr.es/m/20220512164513.vaheofqp2q24l65r@alap3.anarazel.de	2022-05-12 12:39:33 -07:00
Tom Lane	23e7b38bfe	Pre-beta mechanical code beautification. Run pgindent, pgperltidy, and reformat-dat-files. I manually fixed a couple of comments that pgindent uglified.	2022-05-12 15:17:30 -04:00
Jeff Davis	ed57cac84d	pg_walinspect: fix case where flush LSN is in the middle of a record. Instability in the test for pg_walinspect revealed that pg_get_wal_records_info_till_end_of_wal(x) would try to decode all the records with a start LSN earlier than the flush LSN, even though that might include a partial record at the end of the range. In that case, read_local_xlog_page_no_wait() would return NULL when it tried to read past the flush LSN, which would be interpreted as an error by the caller. That caused a test failure only on a BF animal that had been restarted recently, but could be expected to happen in the wild quite easily depending on the alignment of various parameters. Fix by using private data in read_local_xlog_page_no_wait() to signal end-of-wal to the caller, so that it can be properly distinguished from a real error. Discussion: https://postgr.es/m/Ymd/e5eeZMNAkrXo%40paquier.xyz Discussion: https://postgr.es/m/111657.1650910309@sss.pgh.pa.us Authors: Thomas Munro, Bharath Rupireddy.	2022-04-30 09:05:32 -07:00
Tom Lane	2cb1272445	Rethink method for assigning OIDs to the template0 and postgres DBs. Commit `aa0105141` assigned fixed OIDs to template0 and postgres in a very ad-hoc way. Notably, instead of teaching Catalog.pm about these OIDs, the unused_oids script was just hacked to not show them as unused. That's problematic since, for example, duplicate_oids wouldn't report any future conflict. Hence, invent a macro DECLARE_OID_DEFINING_MACRO() that can be used to define an OID that is known to Catalog.pm and will participate in duplicate-detection as well as renumbering by renumber_oids.pl. (We don't anticipate renumbering these particular OIDs, but we might as well build out all the Catalog.pm infrastructure while we're here.) Another issue is that `aa0105141` neglected to touch IsPinnedObject, with the result that it now claimed template0 and postgres are pinned. The right thing to do there seems to be to teach it that no database is pinned, since in fact DROP DATABASE doesn't check for pinned-ness (and at least for these cases, that is an intentional choice). It's not clear whether this wrong answer had any visible effect, but perhaps it could have resulted in erroneous management of dependency entries. In passing, rename the TemplateDbOid macro to Template1DbOid to reduce confusion (likely we should have done that way back when we invented template0, but we didn't), and rename the OID macros for template0 and postgres to have a similar style. There are no changes to postgres.bki here, so no need for a catversion bump. Discussion: https://postgr.es/m/2935358.1650479692@sss.pgh.pa.us	2022-04-21 16:23:15 -04:00
Tom Lane	7b7ed046cb	Prevent access to no-longer-pinned buffer in heapam_tuple_lock(). heap_fetch() used to have a "keep_buf" parameter that told it to return ownership of the buffer pin to the caller after finding that the requested tuple TID exists but is invisible to the specified snapshot. This was thoughtlessly removed in commit `5db6df0c0`, which broke heapam_tuple_lock() (formerly EvalPlanQualFetch) because that function needs to do more accesses to the tuple even if it's invisible. The net effect is that we would continue to touch the page for a microsecond or two after releasing pin on the buffer. Usually no harm would result; but if a different session decided to defragment the page concurrently, we could see garbage data and mistakenly conclude that there's no newer tuple version to chain up to. (It's hard to say whether this has happened in the field. The bug was actually found thanks to a later change that allowed valgrind to detect accesses to non-pinned buffers.) The most reasonable way to fix this is to reintroduce keep_buf, although I made it behave slightly differently: buffer ownership is passed back only if there is a valid tuple at the requested TID. In HEAD, we can just add the parameter back to heap_fetch(). To avoid an API break in the back branches, introduce an additional function heap_fetch_extended() in those branches. In HEAD there is an additional, less obvious API change: tuple->t_data will be set to NULL in all cases where buffer ownership is not returned, in particular when the tuple exists but fails the time qual (and !keep_buf). This is to defend against any other callers attempting to access non-pinned buffers. We concluded that making that change in back branches would be more likely to introduce problems than cure any. In passing, remove a comment about heap_fetch that was obsoleted by `9a8ee1dc6`. Per bug #17462 from Daniil Anisimov. Back-patch to v12 where the bug was introduced. Discussion: https://postgr.es/m/17462-9c98a0f00df9bd36@postgresql.org	2022-04-13 13:35:07 -04:00
Tom Lane	bd037dc928	Make XLogRecGetBlockTag() throw error if there's no such block. All but a few existing callers assume without checking that this function succeeds. While it probably will, that's a poor excuse for not checking. Let's make it return void and instead throw an error if it doesn't find the block reference. Callers that actually need to handle the no-such-block case must now use the underlying function XLogRecGetBlockTagExtended. In addition to being a bit less error-prone, this should also serve to suppress some Coverity complaints about XLogRecGetBlockRefInfo. While at it, clean up some inconsistency about use of the XLogRecHasBlockRef macro: make XLogRecGetBlockTagExtended use that instead of open-coding the same condition, and avoid calling XLogRecHasBlockRef twice in relevant code paths. (That is, calling XLogRecHasBlockRef followed by XLogRecGetBlockTag is now deprecated: use XLogRecGetBlockTagExtended instead.) Patch HEAD only; this doesn't seem to have enough value to consider a back-branch API break. Discussion: https://postgr.es/m/425039.1649701221@sss.pgh.pa.us	2022-04-11 17:43:53 -04:00
Robert Haas	8ec569479f	Apply PGDLLIMPORT markings broadly. Up until now, we've had a policy of only marking certain variables in the PostgreSQL header files with PGDLLIMPORT, but now we've decided to mark them all. This means that extensions running on Windows should no longer operate at a disadvantage as compared to extensions running on Linux: if the variable is present in a header file, it should be accessible. Discussion: http://postgr.es/m/CA+TgmoYanc1_FSfimhgiWSqVyP5KKmh5NP2BWNwDhO8Pg2vGYQ@mail.gmail.com	2022-04-08 08:16:38 -04:00
Jeff Davis	1562e92c62	Fix buildfarm failure from commit `2258e76f90`.	2022-04-08 01:33:58 -07:00
Jeff Davis	2258e76f90	Add contrib/pg_walinspect. Provides similar functionality to pg_waldump, but from a SQL interface rather than a separate utility. Author: Bharath Rupireddy Reviewed-by: Greg Stark, Kyotaro Horiguchi, Andres Freund, Ashutosh Sharma, Nitin Jadhav, RKN Sai Krishna Discussion: https://postgr.es/m/CALj2ACUGUYXsEQdKhEdsBzhGEyF3xggvLdD8C0VT72TNEfOiog%40mail.gmail.com	2022-04-08 00:26:44 -07:00
Tomas Vondra	2c7ea57e56	Revert "Logical decoding of sequences" This reverts a sequence of commits, implementing features related to logical decoding and replication of sequences: - `0da92dc530` - `80901b3291` - `b779d7d8fd` - `d5ed9da41d` - `a180c2b34d` - `75b1521dae` - `2d2232933b` - `002c9dd97a` - `05843b1aa4` The implementation has issues, mostly due to combining transactional and non-transactional behavior of sequences. It's not clear how this could be fixed, but it'll require reworking significant part of the patch. Discussion: https://postgr.es/m/95345a19-d508-63d1-860a-f5c2f41e8d40@enterprisedb.com	2022-04-07 20:06:36 +02:00
Jeff Davis	957aa4d87a	Fix another buildfarm issue from commit `5c279a6d35`. Discussion: https://postgr.es/m/3280724.1649340315@sss.pgh.pa.us	2022-04-07 08:40:54 -07:00
Thomas Munro	5b186308fb	Include some missing headers. Per headerscheck on BF animal crake, and Andres. Discussion: https://postgr.es/m/20220407083630.n62vgwqfy2v6wsrd%40alap3.anarazel.de	2022-04-07 20:55:16 +12:00
Thomas Munro	5dc0418fab	Prefetch data referenced by the WAL, take II. Introduce a new GUC recovery_prefetch. When enabled, look ahead in the WAL and try to initiate asynchronous reading of referenced data blocks that are not yet cached in our buffer pool. For now, this is done with posix_fadvise(), which has several caveats. Since not all OSes have that system call, "try" is provided so that it can be enabled where available. Better mechanisms for asynchronous I/O are possible in later work. Set to "try" for now for test coverage. Default setting to be finalized before release. The GUC wal_decode_buffer_size limits the distance we can look ahead in bytes of decoded data. The existing GUC maintenance_io_concurrency is used to limit the number of concurrent I/Os allowed, based on pessimistic heuristics used to infer that I/Os have begun and completed. We'll also not look more than maintenance_io_concurrency * 4 block references ahead. Reviewed-by: Julien Rouhaud <rjuju123@gmail.com> Reviewed-by: Tomas Vondra <tomas.vondra@2ndquadrant.com> Reviewed-by: Alvaro Herrera <alvherre@2ndquadrant.com> (earlier version) Reviewed-by: Andres Freund <andres@anarazel.de> (earlier version) Reviewed-by: Justin Pryzby <pryzby@telsasoft.com> (earlier version) Tested-by: Tomas Vondra <tomas.vondra@2ndquadrant.com> (earlier version) Tested-by: Jakub Wartak <Jakub.Wartak@tomtom.com> (earlier version) Tested-by: Dmitry Dolgov <9erthalion6@gmail.com> (earlier version) Tested-by: Sait Talha Nisanci <Sait.Nisanci@microsoft.com> (earlier version) Discussion: https://postgr.es/m/CA%2BhUKGJ4VJN8ttxScUFM8dOKX0BrBiboo5uz1cq%3DAovOddfHpA%40mail.gmail.com	2022-04-07 19:42:14 +12:00
Jeff Davis	9553b4115f	Fix warning introduced in `5c279a6d35`. Change two macros to be static inline functions instead to keep the data type consistent. This avoids a "comparison is always true" warning that was occurring with -Wtype-limits. In the process, change the names to look less like macros. Discussion: https://postgr.es/m/20220407063505.njnnrmbn4sxqfsts@alap3.anarazel.de	2022-04-07 00:39:30 -07:00
Jeff Davis	5c279a6d35	Custom WAL Resource Managers. Allow extensions to specify a new custom resource manager (rmgr), which allows specialized WAL. This is meant to be used by a Table Access Method or Index Access Method. Prior to this commit, only Generic WAL was available, which offers support for recovery and physical replication but not logical replication. Reviewed-by: Julien Rouhaud, Bharath Rupireddy, Andres Freund Discussion: https://postgr.es/m/ed1fb2e22d15d3563ae0eb610f7b61bb15999c0a.camel%40j-davis.com	2022-04-06 23:06:46 -07:00
Andres Freund	8b1dccd37c	pgstat: scaffolding for transactional stats creation / drop. One problematic part of the current statistics collector design is that there is no reliable way of getting rid of statistics entries. Because of that pgstat_vacuum_stat() (called by [auto-]vacuum) matches all stats for the current database with the catalog contents and tries to drop now-superfluous entries. That's quite expensive. What's worse, it doesn't work on physical replicas, despite physical replicas collection statistics entries. This commit introduces infrastructure to create / drop statistics entries transactionally, together with the underlying catalog objects (functions, relations, subscriptions). pgstat_xact.c maintains a list of stats entries created / dropped transactionally in the current transaction. To ensure the removal of statistics entries is durable dropped statistics entries are included in commit / abort (and prepare) records, which also ensures that stats entries are dropped on standbys. Statistics entries created separately from creating the underlying catalog object (e.g. when stats were previously lost due to an immediate restart) are not WAL logged. However that can only happen outside of the transaction creating the catalog object, so it does not lead to "leaked" statistics entries. For this to work, functions creating / dropping functions / relations / subscriptions need to call into pgstat. For subscriptions this was already done when dropping subscriptions, via pgstat_report_subscription_drop() (now renamed to pgstat_drop_subscription()). This commit does not actually drop stats yet, it just provides the infrastructure. It is however a largely independent piece of infrastructure, so committing it separately makes sense. Bumps XLOG_PAGE_MAGIC. Author: Andres Freund <andres@anarazel.de> Reviewed-By: Thomas Munro <thomas.munro@gmail.com> Reviewed-By: Kyotaro Horiguchi <horikyota.ntt@gmail.com> Discussion: https://postgr.es/m/20220303021600.hs34ghqcw6zcokdh@alap3.anarazel.de	2022-04-06 18:27:52 -07:00
Stephen Frost	39969e2a1e	Remove exclusive backup mode Exclusive-mode backups have been deprecated since 9.6 (when non-exclusive backups were introduced) due to the issues they can cause should the system crash while one is running and generally because non-exclusive provides a much better interface. Further, exclusive backup mode wasn't really being tested (nor was most of the related code- like being able to log in just to stop an exclusive backup and the bits of the state machine related to that) and having to possibly deal with an exclusive backup and the backup_label file existing during pg_basebackup, pg_rewind, etc, added other complexities that we are better off without. This patch removes the exclusive backup mode, the various special cases for dealing with it, and greatly simplifies the online backup code and documentation. Authors: David Steele, Nathan Bossart Reviewed-by: Chapman Flack Discussion: https://postgr.es/m/ac7339ca-3718-3c93-929f-99e725d1172c@pgmasters.net https://postgr.es/m/CAHg+QDfiM+WU61tF6=nPZocMZvHDzCK47Kneyb0ZRULYzV5sKQ@mail.gmail.com	2022-04-06 14:41:03 -04:00

1 2 3 4 5 ...

1663 commits