postgresql

mirror of https://github.com/postgres/postgres.git synced 2026-02-27 03:40:28 -05:00

Author	SHA1	Message	Date
Noah Misch	083929372e	For PowerPC instruction "addi", use constraint "b". Without "b", a variant of the tas() code miscompiles on macOS 10.4. This may also fix a compilation failure involving macOS 10.1. Today's compilers have been allocating acceptable registers with or without this change, but this future-proofs the code by precisely conveying the acceptable registers. Back-patch to 9.4 (all supported versions). Reviewed by Tom Lane. Discussion: https://postgr.es/m/20191009063900.GA4066266@rfd.leadboat.com	2019-10-18 20:20:32 -07:00
Tom Lane	05399b1488	Fix spinlock assembly code for MIPS so it works on MIPS r6. Original MIPS-I processors didn't have the LL/SC instructions (nor any other userland synchronization primitive). If the build toolchain targets that ISA variant by default, as an astonishingly large fraction of MIPS platforms still do, the assembler won't take LL/SC without coercion in the form of a ".set mips2" instruction. But we issued that unconditionally, making it an ISA downgrade for chips later than MIPS2. That breaks things for the latest MIPS r6 ISA, which encodes these instructions differently. Adjust the code so we don't change ISA level if it's >= 2. Note that this patch doesn't change what happens on an actual MIPS-I processor: either the kernel will emulate these instructions transparently, or you'll get a SIGILL failure. That tradeoff seemed fine in 2002 when this code was added (cf `3cbe6b247`), and it's even more so today when MIPS-I is basically extinct. But let's add a comment about that. YunQiang Su (with cosmetic adjustments by me). Back-patch to all supported branches. Discussion: https://postgr.es/m/15844-8f62fe7e163939b3@postgresql.org	2019-06-22 20:31:50 -04:00
Noah Misch	6d81e3c652	Consistently test for in-use shared memory. postmaster startup scrutinizes any shared memory segment recorded in postmaster.pid, exiting if that segment matches the current data directory and has an attached process. When the postmaster.pid file was missing, a starting postmaster used weaker checks. Change to use the same checks in both scenarios. This increases the chance of a startup failure, in lieu of data corruption, if the DBA does "kill -9 `head -n1 postmaster.pid` && rm postmaster.pid && pg_ctl -w start". A postmaster will no longer stop if shmat() of an old segment fails with EACCES. A postmaster will no longer recycle segments pertaining to other data directories. That's good for production, but it's bad for integration tests that crash a postmaster and immediately delete its data directory. Such a test now leaks a segment indefinitely. No "make check-world" test does that. win32_shmem.c already avoided all these problems. In 9.6 and later, enhance PostgresNode to facilitate testing. Back-patch to 9.4 (all supported versions). Reviewed (in earlier versions) by Daniel Gustafsson and Kyotaro HORIGUCHI. Discussion: https://postgr.es/m/20190408064141.GA2016666@rfd.leadboat.com	2019-04-12 22:36:42 -07:00
Noah Misch	f5989b379c	Avoid "could not reattach" by providing space for concurrent allocation. We've long had reports of intermittent "could not reattach to shared memory" errors on Windows. Buildfarm member dory fails that way when PGSharedMemoryReAttach() execution overlaps with creation of a thread for the process's "default thread pool". Fix that by providing a second region to receive asynchronous allocations that would otherwise intrude into UsedShmemSegAddr. In pgwin32_ReserveSharedMemoryRegion(), stop trying to free reservations landing at incorrect addresses; the caller's next step has been to terminate the affected process. Back-patch to 9.4 (all supported versions). Reviewed by Tom Lane. He also did much of the prerequisite research; see commit `bcbf2346d6`. Discussion: https://postgr.es/m/20190402135442.GA1173872@rfd.leadboat.com	2019-04-08 21:39:04 -07:00
Noah Misch	7d18a55c90	Revert "Consistently test for in-use shared memory." This reverts commits `2f932f71d9`, `16ee6eaf80` and `6f0e190056`. The buildfarm has revealed several bugs. Back-patch like the original commits. Discussion: https://postgr.es/m/20190404145319.GA1720877@rfd.leadboat.com	2019-04-05 00:00:55 -07:00
Noah Misch	7c414cdc39	Consistently test for in-use shared memory. postmaster startup scrutinizes any shared memory segment recorded in postmaster.pid, exiting if that segment matches the current data directory and has an attached process. When the postmaster.pid file was missing, a starting postmaster used weaker checks. Change to use the same checks in both scenarios. This increases the chance of a startup failure, in lieu of data corruption, if the DBA does "kill -9 `head -n1 postmaster.pid` && rm postmaster.pid && pg_ctl -w start". A postmaster will no longer recycle segments pertaining to other data directories. That's good for production, but it's bad for integration tests that crash a postmaster and immediately delete its data directory. Such a test now leaks a segment indefinitely. No "make check-world" test does that. win32_shmem.c already avoided all these problems. In 9.6 and later, enhance PostgresNode to facilitate testing. Back-patch to 9.4 (all supported versions). Reviewed by Daniel Gustafsson and Kyotaro HORIGUCHI. Discussion: https://postgr.es/m/20130911033341.GD225735@tornado.leadboat.com	2019-04-03 17:03:50 -07:00
Tomas Vondra	5601f9fb97	Track unowned relations in doubly-linked list Relations dropped in a single transaction are tracked in a list of unowned relations. With large number of dropped relations this resulted in poor performance at the end of a transaction, when the relations are removed from the singly linked list one by one. Commit `b4166911` attempted to address this issue (particularly when it happens during recovery) by removing the relations in a reverse order, resulting in O(1) lookups in the list of unowned relations. This did not work reliably, though, and it was possible to trigger the O(N^2) behavior in various ways. Instead of trying to remove the relations in a specific order with respect to the linked list, which seems rather fragile, switch to a regular doubly linked. That allows us to remove relations cheaply no matter where in the list they are. As `b4166911` was a bugfix, backpatched to all supported versions, do the same thing here. Reviewed-by: Alvaro Herrera Discussion: https://www.postgresql.org/message-id/flat/80c27103-99e4-1d0c-642c-d9f3b94aaa0a%402ndquadrant.com Backpatch-through: 9.4	2019-03-27 03:19:33 +01:00
Thomas Munro	afbe03f654	PANIC on fsync() failure. On some operating systems, it doesn't make sense to retry fsync(), because dirty data cached by the kernel may have been dropped on write-back failure. In that case the only remaining copy of the data is in the WAL. A subsequent fsync() could appear to succeed, but not have flushed the data. That means that a future checkpoint could apparently complete successfully but have lost data. Therefore, violently prevent any future checkpoint attempts by panicking on the first fsync() failure. Note that we already did the same for WAL data; this change extends that behavior to non-temporary data files. Provide a GUC data_sync_retry to control this new behavior, for users of operating systems that don't eject dirty data, and possibly forensic/testing uses. If it is set to on and the write-back error was transient, a later checkpoint might genuinely succeed (on a system that does not throw away buffers on failure); if the error is permanent, later checkpoints will continue to fail. The GUC defaults to off, meaning that we panic. Back-patch to all supported releases. There is still a narrow window for error-loss on some operating systems: if the file is closed and later reopened and a write-back error occurs in the intervening time, but the inode has the bad luck to be evicted due to memory pressure before we reopen, we could miss the error. A later patch will address that with a scheme for keeping files with dirty data open at all times, but we judge that to be too complicated to back-patch. Author: Craig Ringer, with some adjustments by Thomas Munro Reported-by: Craig Ringer Reviewed-by: Robert Haas, Thomas Munro, Andres Freund Discussion: https://postgr.es/m/20180427222842.in2e4mibx45zdth5%40alap3.anarazel.de	2018-11-19 13:40:57 +13:00
Tom Lane	9e6f4fbdd0	Fix longstanding recursion hazard in sinval message processing. LockRelationOid and sibling routines supposed that, if our session already holds the lock they were asked to acquire, they could skip calling AcceptInvalidationMessages on the grounds that we must have already read any remote sinval messages issued against the relation being locked. This is normally true, but there's a critical special case where it's not: processing inside AcceptInvalidationMessages might attempt to access system relations, resulting in a recursive call to acquire a relation lock. Hence, if the outer call had acquired that same system catalog lock, we'd fall through, despite the possibility that there's an as-yet-unread sinval message for that system catalog. This could, for example, result in failure to access a system catalog or index that had just been processed by VACUUM FULL. This is the explanation for buildfarm failures we've been seeing intermittently for the past three months. The bug is far older than that, but commits `a54e1f158` et al added a new recursion case within AcceptInvalidationMessages that is apparently easier to hit than any previous case. To fix this, we must not skip calling AcceptInvalidationMessages until we have finished a call to it since acquiring a relation lock, not merely acquired the lock. (There's already adequate logic inside AcceptInvalidationMessages to deal with being called recursively.) Fortunately, we can implement that at trivial cost, by adding a flag to LOCALLOCK hashtable entries that tracks whether we know we have completed such a call. There is an API hazard added by this patch for external callers of LockAcquire: if anything is testing for LOCKACQUIRE_ALREADY_HELD, it might be fooled by the new return code LOCKACQUIRE_ALREADY_CLEAR into thinking the lock wasn't already held. This should be a fail-soft condition, though, unless something very bizarre is being done in response to the test. Also, I added an additional output argument to LockAcquireExtended, assuming that that probably isn't called by any outside code given the very limited usefulness of its additional functionality. Back-patch to all supported branches. Discussion: https://postgr.es/m/12259.1532117714@sss.pgh.pa.us	2018-09-07 18:04:56 -04:00
Tom Lane	c2dfbd18ce	Make checksum_impl.h safe to compile with -fstrict-aliasing. In general, Postgres requires -fno-strict-aliasing with compilers that implement C99 strict aliasing rules. There's little hope of getting rid of that overall. But it seems like it would be a good idea if storage/checksum_impl.h in particular didn't depend on it, because that header is explicitly intended to be included by external programs. We don't have a lot of control over the compiler switches that an external program might use, as shown by Michael Banck's report of failure in a privately-modified version of pg_verify_checksums. Hence, switch to using a union in place of willy-nilly pointer casting inside this file. I think this makes the code a bit more readable anyway. checksum_impl.h hasn't changed since it was introduced in 9.3, so back-patch all the way. Discussion: https://postgr.es/m/1535618100.1286.3.camel@credativ.de	2018-08-31 12:26:47 -04:00
Heikki Linnakangas	ed529faf7e	Fix misc typos, mostly in comments. A collection of typos I happened to spot while reading code, as well as grepping for common mistakes. Backpatch to all supported versions, as applicable, to avoid conflicts when backporting other commits in the future.	2018-07-18 16:18:27 +03:00
Fujii Masao	8463be060d	Improve the performance of relation deletes during recovery. When multiple relations are deleted at the same transaction, the files of those relations are deleted by one call to smgrdounlinkall(), which leads to scan whole shared_buffers only one time. OTOH, previously, during recovery, smgrdounlink() (not smgrdounlinkall()) was called for each file to delete, which led to scan shared_buffers multiple times. Obviously this could cause to increase the WAL replay time very much especially when shared_buffers was huge. To alleviate this situation, this commit changes the recovery so that it also calls smgrdounlinkall() only one time to delete multiple relation files. This is just fix for oversight of commit `279628a0a7`, not new feature. So, per discussion on pgsql-hackers, we concluded to backpatch this to all supported versions. Author: Fujii Masao Reviewed-by: Michael Paquier, Andres Freund, Thomas Munro, Kyotaro Horiguchi, Takayuki Tsunakawa Discussion: https://postgr.es/m/CAHGQGwHVQkdfDqtvGVkty+19cQakAydXn1etGND3X0PHbZ3+6w@mail.gmail.com	2018-07-05 02:26:22 +09:00
Andres Freund	2ce64caaf7	Fix bugs in vacuum of shared rels, by keeping their relcache entries current. When vacuum processes a relation it uses the corresponding relcache entry's relfrozenxid / relminmxid as a cutoff for when to remove tuples etc. Unfortunately for nailed relations (i.e. critical system catalogs) bugs could frequently lead to the corresponding relcache entry being stale. This set of bugs could cause actual data corruption as vacuum would potentially not remove the correct row versions, potentially reviving them at a later point. After `699bf7d05c` some corruptions in this vein were prevented, but the additional error checks could also trigger spuriously. Examples of such errors are: ERROR: found xmin ... from before relfrozenxid ... and ERROR: found multixact ... from before relminmxid ... To be caused by this bug the errors have to occur on system catalog tables. The two bugs are: 1) Invalidations for nailed relations were ignored, based on the theory that the relcache entry for such tables doesn't change. Which is largely true, except for fields like relfrozenxid etc. This means that changes to relations vacuumed in other sessions weren't picked up by already existing sessions. Luckily autovacuum doesn't have particularly longrunning sessions. 2) For shared and nailed relations, the shared relcache init file was never invalidated while running. That means that for such tables (e.g. pg_authid, pg_database) it's not just already existing sessions that are affected, but even new connections are as well. That explains why the reports usually were about pg_authid et. al. To fix 1), revalidate the rd_rel portion of a relcache entry when invalid. This implies a bit of extra complexity to deal with bootstrapping, but it's not too bad. The fix for 2) is simpler, simply always remove both the shared and local init files. Author: Andres Freund Reviewed-By: Alvaro Herrera Discussion: https://postgr.es/m/20180525203736.crkbg36muzxrjj5e@alap3.anarazel.de https://postgr.es/m/CAMa1XUhKSJd98JW4o9StWPrfS=11bPgG+_GDMxe25TvUY4Sugg@mail.gmail.com https://postgr.es/m/CAKMFJucqbuoDRfxPDX39WhA3vJyxweRg_zDVXzncr6+5wOguWA@mail.gmail.com https://postgr.es/m/CAGewt-ujGpMLQ09gXcUFMZaZsGJC98VXHEFbF-tpPB0fB13K+A@mail.gmail.com Backpatch: 9.3-	2018-06-12 11:13:21 -07:00
Robert Haas	ff1473078d	Mark assorted variables PGDLLIMPORT. This makes life easier for extension authors who wish to support Windows. Brian Cloutier, slightly amended by me. Discussion: http://postgr.es/m/CAJCy68fscdNhmzFPS4kyO00CADkvXvEa-28H-OtENk-pa2OTWw@mail.gmail.com	2017-12-05 09:24:05 -05:00
Tom Lane	2a11b188e8	Clean up assorted messiness around AllocateDir() usage. This patch fixes a couple of low-probability bugs that could lead to reporting an irrelevant errno value (and hence possibly a wrong SQLSTATE) concerning directory-open or file-open failures. It also fixes places where we took shortcuts in reporting such errors, either by using elog instead of ereport or by using ereport but forgetting to specify an errcode. And it eliminates a lot of just plain redundant error-handling code. In service of all this, export fd.c's formerly-static function ReadDirExtended, so that external callers can make use of the coding pattern dir = AllocateDir(path); while ((de = ReadDirExtended(dir, path, LOG)) != NULL) if they'd like to treat directory-open failures as mere LOG conditions rather than errors. Also fix FreeDir to be a no-op if we reach it with dir == NULL, as such a coding pattern would cause. Then, remove code at many call sites that was throwing an error or log message for AllocateDir failure, as ReadDir or ReadDirExtended can handle that job just fine. Aside from being a net code savings, this gets rid of a lot of not-quite-up-to-snuff reports, as mentioned above. (In some places these changes result in replacing a custom error message such as "could not open tablespace directory" with more generic wording "could not open directory", but it was agreed that the custom wording buys little as long as we report the directory name.) In some other call sites where we can't just remove code, change the error reports to be fully project-style-compliant. Also reorder code in restoreTwoPhaseData that was acquiring a lock between AllocateDir and ReadDir; in the unlikely but surely not impossible case that LWLockAcquire changes errno, AllocateDir failures would be misreported. There is no great value in opening the directory before acquiring TwoPhaseStateLock, so just do it in the other order. Also fix CheckXLogRemoved to guarantee that it preserves errno, as quite a number of call sites are implicitly assuming. (Again, it's unlikely but I think not impossible that errno could change during a SpinLockAcquire. If so, this function was broken for its own purposes as well as breaking callers.) And change a few places that were using not-per-project-style messages, such as "could not read directory" when "could not open directory" is more correct. Back-patch the exporting of ReadDirExtended, in case we have occasion to back-patch some fix that makes use of it; it's not needed right now but surely making it global is pretty harmless. Also back-patch the restoreTwoPhaseData and CheckXLogRemoved fixes. The rest of this is essentially cosmetic and need not get back-patched. Michael Paquier, with a bit of additional work by me Discussion: https://postgr.es/m/CAB7nPqRpOCxjiirHmebEFhXVTK7V5Jvw4bz82p7Oimtsm3TyZA@mail.gmail.com	2017-12-04 17:02:52 -05:00
Tom Lane	1ecf7eeb87	Add support for Motorola 88K to s_lock.h. Apparently there are still people out there who care about this old architecture. They probably care about dusty versions of Postgres too, so back-patch to all supported branches. David Carlier (from a patch being carried by OpenBSD packagers) Discussion: https://postgr.es/m/CA+XhMqzwFSGVU7MEnfhCecc8YdP98tigXzzpd0AAdwaGwaVXEA@mail.gmail.com	2017-11-20 17:57:46 -05:00
Tom Lane	b4fa938e9f	Clean up shm_mq cleanup. The logic around shm_mq_detach was a few bricks shy of a load, because (contrary to the comments for shm_mq_attach) all it did was update the shared shm_mq state. That left us leaking a bit of process-local memory, but much worse, the on_dsm_detach callback for shm_mq_detach was still armed. That means that whenever we ultimately detach from the DSM segment, we'd run shm_mq_detach again for already-detached, possibly long-dead queues. This accidentally fails to fail today, because we only ever re-use a shm_mq's memory for another shm_mq, and multiple detach attempts on the last such shm_mq are fairly harmless. But it's gonna bite us someday, so let's clean it up. To do that, change shm_mq_detach's API so it takes a shm_mq_handle not the underlying shm_mq. This makes the callers simpler in most cases anyway. Also fix a few places in parallel.c that were just pfree'ing the handle structs rather than doing proper cleanup. Back-patch to v10 because of the risk that the revenant shm_mq_detach callbacks would cause a live bug sometime. Since this is an API change, it's too late to do it in 9.6. (We could make a variant patch that preserves API, but I'm not excited enough to do that.) Discussion: https://postgr.es/m/8670.1504192177@sss.pgh.pa.us	2017-08-31 15:10:24 -04:00
Tom Lane	d7ab908fba	Distinguish wait-for-connection from wait-for-write-ready on Windows. The API for WaitLatch and friends followed the Unix convention in which waiting for a socket connection to complete is identical to waiting for the socket to accept a write. While Windows provides a select(2) emulation that agrees with that, the native WaitForMultipleObjects API treats them as quite different --- and for some bizarre reason, it will report a not-yet-connected socket as write-ready. libpq itself has so far escaped dealing with this because it waits with select(), but in libpqwalreceiver.c we want to wait using WaitLatchOrSocket. The semantics mismatch resulted in replication connection failures on Windows, but only for remote connections (apparently, localhost connections complete immediately, or at least too fast for anyone to have noticed the problem in single-machine testing). To fix, introduce an additional WL_SOCKET_CONNECTED wait flag for WaitLatchOrSocket, which is identical to WL_SOCKET_WRITEABLE on non-Windows, but results in waiting for FD_CONNECT events on Windows. Ideally, we would also distinguish the two conditions in the API for PQconnectPoll(), but changing that API at this point seems infeasible. Instead, cheat by checking for PQstatus() == CONNECTION_STARTED to determine that we're still waiting for the connection to complete. (This is a cheat mainly because CONNECTION_STARTED is documented as an internal state rather than something callers should rely on. Perhaps we ought to change the documentation ... but this patch doesn't.) Per reports from Jobin Augustine and Igor Neyman. Back-patch to v10 where commit `1e8a85009` exposed this longstanding shortcoming. Andres Freund, minor fix and some code review/beautification by me Discussion: https://postgr.es/m/CAHBggj8g2T+ZDcACZ2FmzX9CTxkWjKBsHd6NkYB4i9Ojf6K1Fw@mail.gmail.com	2017-08-15 11:08:25 -04:00
Tom Lane	382ceffdf7	Phase 3 of pgindent updates. Don't move parenthesized lines to the left, even if that means they flow past the right margin. By default, BSD indent lines up statement continuation lines that are within parentheses so that they start just to the right of the preceding left parenthesis. However, traditionally, if that resulted in the continuation line extending to the right of the desired right margin, then indent would push it left just far enough to not overrun the margin, if it could do so without making the continuation line start to the left of the current statement indent. That makes for a weird mix of indentations unless one has been completely rigid about never violating the 80-column limit. This behavior has been pretty universally panned by Postgres developers. Hence, disable it with indent's new -lpl switch, so that parenthesized lines are always lined up with the preceding left paren. This patch is much less interesting than the first round of indent changes, but also bulkier, so I thought it best to separate the effects. Discussion: https://postgr.es/m/E1dAmxK-0006EE-1r@gemulon.postgresql.org Discussion: https://postgr.es/m/30527.1495162840@sss.pgh.pa.us	2017-06-21 15:35:54 -04:00
Tom Lane	c7b8998ebb	Phase 2 of pgindent updates. Change pg_bsd_indent to follow upstream rules for placement of comments to the right of code, and remove pgindent hack that caused comments following #endif to not obey the general rule. Commit `e3860ffa4d` wasn't actually using the published version of pg_bsd_indent, but a hacked-up version that tried to minimize the amount of movement of comments to the right of code. The situation of interest is where such a comment has to be moved to the right of its default placement at column 33 because there's code there. BSD indent has always moved right in units of tab stops in such cases --- but in the previous incarnation, indent was working in 8-space tab stops, while now it knows we use 4-space tabs. So the net result is that in about half the cases, such comments are placed one tab stop left of before. This is better all around: it leaves more room on the line for comment text, and it means that in such cases the comment uniformly starts at the next 4-space tab stop after the code, rather than sometimes one and sometimes two tabs after. Also, ensure that comments following #endif are indented the same as comments following other preprocessor commands such as #else. That inconsistency turns out to have been self-inflicted damage from a poorly-thought-through post-indent "fixup" in pgindent. This patch is much less interesting than the first round of indent changes, but also bulkier, so I thought it best to separate the effects. Discussion: https://postgr.es/m/E1dAmxK-0006EE-1r@gemulon.postgresql.org Discussion: https://postgr.es/m/30527.1495162840@sss.pgh.pa.us	2017-06-21 15:19:25 -04:00
Tom Lane	e3860ffa4d	Initial pgindent run with pg_bsd_indent version 2.0. The new indent version includes numerous fixes thanks to Piotr Stefaniak. The main changes visible in this commit are: * Nicer formatting of function-pointer declarations. * No longer unexpectedly removes spaces in expressions using casts, sizeof, or offsetof. * No longer wants to add a space in "struct structname varname", as well as some similar cases for const- or volatile-qualified pointers. Declarations using PG_USED_FOR_ASSERTS_ONLY are formatted more nicely. * Fixes bug where comments following declarations were sometimes placed with no space separating them from the code. * Fixes some odd decisions for comments following case labels. * Fixes some cases where comments following code were indented to less than the expected column 33. On the less good side, it now tends to put more whitespace around typedef names that are not listed in typedefs.list. This might encourage us to put more effort into typedef name collection; it's not really a bug in indent itself. There are more changes coming after this round, having to do with comment indentation and alignment of lines appearing within parentheses. I wanted to limit the size of the diffs to something that could be reviewed without one's eyes completely glazing over, so it seemed better to split up the changes as much as practical. Discussion: https://postgr.es/m/E1dAmxK-0006EE-1r@gemulon.postgresql.org Discussion: https://postgr.es/m/30527.1495162840@sss.pgh.pa.us	2017-06-21 14:39:04 -04:00
Andres Freund	6c2003f8a1	Don't force-assign transaction id when exporting a snapshot. Previously we required every exported transaction to have an xid assigned. That was used to check that the exporting transaction is still running, which in turn is needed to guarantee that that necessary rows haven't been removed in between exporting and importing the snapshot. The exported xid caused unnecessary problems with logical decoding, because slot creation has to wait for all concurrent xid to finish, which in turn serializes concurrent slot creation. It also prohibited snapshots to be exported on hot-standby replicas. Instead export the virtual transactionid, which avoids the unnecessary serialization and the inability to export snapshots on standbys. This changes the file name of the exported snapshot, but since we never documented what that one means, that seems ok. Author: Petr Jelinek, slightly editorialized by me Reviewed-By: Andres Freund Discussion: https://postgr.es/m/f598b4b8-8cd7-0d54-0939-adda763d8c34@2ndquadrant.com	2017-06-14 11:57:21 -07:00
Tom Lane	651902deb1	Re-run pgindent. This is just to have a clean base state for testing of Piotr Stefaniak's latest version of FreeBSD indent. I fixed up a couple of places where pgindent would have changed format not-nicely. perltidy not included. Discussion: https://postgr.es/m/VI1PR03MB119959F4B65F000CA7CD9F6BF2CC0@VI1PR03MB1199.eurprd03.prod.outlook.com	2017-06-13 13:05:59 -04:00
Andres Freund	c6c3334364	Prevent possibility of panics during shutdown checkpoint. When the checkpointer writes the shutdown checkpoint, it checks afterwards whether any WAL has been written since it started and throws a PANIC if so. At that point, only walsenders are still active, so one might think this could not happen, but walsenders can also generate WAL, for instance in BASE_BACKUP and logical decoding related commands (e.g. via hint bits). So they can trigger this panic if such a command is run while the shutdown checkpoint is being written. To fix this, divide the walsender shutdown into two phases. First, checkpointer, itself triggered by postmaster, sends a PROCSIG_WALSND_INIT_STOPPING signal to all walsenders. If the backend is idle or runs an SQL query this causes the backend to shutdown, if logical replication is in progress all existing WAL records are processed followed by a shutdown. Otherwise this causes the walsender to switch to the "stopping" state. In this state, the walsender will reject any further replication commands. The checkpointer begins the shutdown checkpoint once all walsenders are confirmed as stopping. When the shutdown checkpoint finishes, the postmaster sends us SIGUSR2. This instructs walsender to send any outstanding WAL, including the shutdown checkpoint record, wait for it to be replicated to the standby, and then exit. Author: Andres Freund, based on an earlier patch by Michael Paquier Reported-By: Fujii Masao, Andres Freund Reviewed-By: Michael Paquier Discussion: https://postgr.es/m/20170602002912.tqlwn4gymzlxpvs2@alap3.anarazel.de Backpatch: 9.4, where logical decoding was introduced	2017-06-05 19:18:15 -07:00
Tom Lane	3e60c6f723	Code review for shm_toc.h/.c. Declare the toc_nentry field as uint32 not Size. Since shm_toc_lookup() reads the field without any lock, it has to be atomically readable, and we do not assume that for fields wider than 32 bits. Performance would be impossibly bad for entry counts approaching 2^32 anyway, so there is no need to try to preserve maximum width here. This is probably an academic issue, because even if reading int64 isn't atomic, the high order half would never change in practice. Still, it's a coding rule violation, so let's fix it. Adjust some other not-terribly-well-chosen data types too, and copy-edit some comments. Make shm_toc_attach's Asserts consistent with shm_toc_create's. None of this looks to be a live bug, so no need for back-patch. Discussion: https://postgr.es/m/16984.1496679541@sss.pgh.pa.us	2017-06-05 14:50:59 -04:00
Tom Lane	d466335064	Don't be so trusting that shm_toc_lookup() will always succeed. Given the possibility of race conditions and so on, it seems entirely unsafe to just assume that shm_toc_lookup() always finds the key it's looking for --- but that was exactly what all but one call site were doing. To fix, add a "bool noError" argument, similarly to what we have in many other functions, and throw an error on an unexpected lookup failure. Remove now-redundant Asserts that a rather random subset of call sites had. I doubt this will throw any light on buildfarm member lorikeet's recent failures, because if an unnoticed lookup failure were involved, you'd kind of expect a null-pointer-dereference crash rather than the observed symptom. But you never know ... and this is better coding practice even if it never catches anything. Discussion: https://postgr.es/m/9697.1496675981@sss.pgh.pa.us	2017-06-05 12:05:42 -04:00
Bruce Momjian	a6fd7b7a5f	Post-PG 10 beta1 pgindent run perltidy run not included.	2017-05-17 16:31:56 -04:00
Andres Freund	2bef06d516	Preserve required !catalog tuples while computing initial decoding snapshot. The logical decoding machinery already preserved all the required catalog tuples, which is sufficient in the course of normal logical decoding, but did not guarantee that non-catalog tuples were preserved during computation of the initial snapshot when creating a slot over the replication protocol. This could cause a corrupted initial snapshot being exported. The time window for issues is usually not terribly large, but on a busy server it's perfectly possible to it hit it. Ongoing decoding is not affected by this bug. To avoid increased overhead for the SQL API, only retain additional tuples when a logical slot is being created over the replication protocol. To do so this commit changes the signature of CreateInitDecodingContext(), but it seems unlikely that it's being used in an extension, so that's probably ok. In a drive-by fix, fix handling of ReplicationSlotsComputeRequiredXmin's already_locked argument, which should only apply to ProcArrayLock, not ReplicationSlotControlLock. Reported-By: Erik Rijkers Analyzed-By: Petr Jelinek Author: Petr Jelinek, heavily editorialized by Andres Freund Reviewed-By: Andres Freund Discussion: https://postgr.es/m/9a897b86-46e1-9915-ee4c-da02e4ff6a95@2ndquadrant.com Backport: 9.4, where logical decoding was introduced.	2017-04-27 13:13:36 -07:00
Magnus Hagander	a4777f3556	Remove symbol WIN32_ONLY_COMPILER This used to mean "Visual C++ except in those parts where Borland C++ was supported where it meant one of those". Now that we don't support Borland C++ anymore, simplify by using _MSC_VER which is the normal way to detect Visual C++.	2017-04-11 15:22:21 +02:00
Tom Lane	511540dadf	Move isolationtester's is-blocked query into C code for speed. Commit `4deb41381` modified isolationtester's query to see whether a session is blocked to also check for waits occurring in GetSafeSnapshot. However, it did that in a way that enormously increased the query's runtime under CLOBBER_CACHE_ALWAYS, causing the buildfarm members that use that to run about four times slower than before, and in some cases fail entirely. To fix, push the entire logic into a dedicated backend function. This should actually reduce the CLOBBER_CACHE_ALWAYS runtime from what it was previously, though I've not checked that. In passing, expose a SQL function to check for safe-snapshot blockage, comparable to pg_blocking_pids. This is more or less free given the infrastructure built to solve the other problem, so we might as well. Thomas Munro Discussion: https://postgr.es/m/20170407165749.pstcakbc637opkax@alap3.anarazel.de	2017-04-10 10:26:54 -04:00
Kevin Grittner	c63172d60f	Add GUCs for predicate lock promotion thresholds. Defaults match the fixed behavior of prior releases, but now DBAs have better options to tune serializable workloads. It might be nice to be able to set this per relation, but that part will need to wait for another release. Author: Dagfinn Ilmari Mannsåker	2017-04-07 21:38:05 -05:00
Alvaro Herrera	ce96ce60ca	Remove direct uses of ItemPointer.{ip_blkid,ip_posid} There are no functional changes here; this simply encapsulates knowledge of the ItemPointerData struct so that a future patch can change things without more breakage. All direct users of ip_blkid and ip_posid are changed to use existing macros ItemPointerGetBlockNumber and ItemPointerGetOffsetNumber respectively. For callers where that's inappropriate (because they Assert that the itempointer is is valid-looking), add ItemPointerGetBlockNumberNoCheck and ItemPointerGetOffsetNumberNoCheck, which lack the assertion but are otherwise identical. Author: Pavan Deolasee Discussion: https://postgr.es/m/CABOikdNnFon4cJiL=h1mZH3bgUeU+sWHuU4Yr8AB=j3A2p1GiA@mail.gmail.com	2017-03-28 19:02:23 -03:00
Teodor Sigaev	1b02be21f2	Fsync directory after creating or unlinking file. If file was created/deleted just before powerloss it's possible that file system will miss that. To prevent it, call fsync() where creating/ unlinkg file is critical. Author: Michael Paquier Reviewed-by: Ashutosh Bapat, Takayuki Tsunakawa, me	2017-03-27 19:33:01 +03:00
Robert Haas	fc70a4b0df	Show more processes in pg_stat_activity. Previously, auxiliary processes and background workers not connected to a database (such as the logical replication launcher) weren't shown. Include them, so that we can see the associated wait state information. Add a new column to identify the processes type, so that people can filter them out easily using SQL if they wish. Before this patch was written, there was discussion about whether we should expose this information in a separate view, so as to avoid contaminating pg_stat_activity with things people might not want to see. But putting everything in pg_stat_activity was a more popular choice, so that's what the patch does. Kuntal Ghosh, reviewed by Amit Langote and Michael Paquier. Some revisions and bug fixes by me. Discussion: http://postgr.es/m/CA+TgmoYES5nhkEGw9nZXU8_FhA8XEm8NTm3-SO+3ML1B81Hkww@mail.gmail.com	2017-03-26 22:02:22 -04:00
Simon Riggs	5737c12df0	Report catalog_xmin separately in hot_standby_feedback If the upstream walsender is using a physical replication slot, store the catalog_xmin in the slot's catalog_xmin field. If the upstream doesn't use a slot and has only a PGPROC entry behaviour doesn't change, as we store the combined xmin and catalog_xmin in the PGPROC entry. Author: Craig Ringer	2017-03-25 14:07:27 +00:00
Simon Riggs	af4b1a0869	Refactor GetOldestXmin() to use flags Replace ignoreVacuum parameter with more flexible flags. Author: Eiji Seki Review: Haribabu Kommi	2017-03-22 16:51:01 +00:00
Robert Haas	249cf070e3	Create and use wait events for read, write, and fsync operations. Previous commits, notably `53be0b1add` and `6f3bd98ebf`, made it possible to see from pg_stat_activity when a backend was stuck waiting for another backend, but it's also fairly common for a backend to be stuck waiting for an I/O. Add wait events for those operations, too. Rushabh Lathia, with further hacking by me. Reviewed and tested by Michael Paquier, Amit Kapila, Rajkumar Raghuwanshi, and Rahila Syed. Discussion: http://postgr.es/m/CAGPqQf0LsYHXREPAZqYGVkDqHSyjf=KsD=k0GTVPAuzyThh-VQ@mail.gmail.com	2017-03-18 07:43:01 -04:00
Robert Haas	87f9982034	Fix failure to mark init buffers as BM_PERMANENT. This could result in corruption of the init fork of an unlogged index if the ambuildempty routine for that index used shared buffers to create the init fork, which was true for brin, gin, gist, and hash indexes. Patch by me, based on an earlier patch by Michael Paquier, who also reviewed this one. This also incorporates an idea from Artur Zakirov. Discussion: http://postgr.es/m/CACYUyc8yccE4xfxhqxfh_Mh38j7dRFuxfaK1p6dSNAEUakxUyQ@mail.gmail.com	2017-03-14 11:51:11 -04:00
Robert Haas	390811750d	Revert "Use group updates when setting transaction status in clog." This reverts commit `ccce90b398`. This optimization is unsafe, at least, of rollbacks and rollbacks to savepoints, but I'm concerned there may be other problematic cases as well. Therefore, I've decided to revert this pending further investigation.	2017-03-10 14:49:56 -05:00
Robert Haas	ccce90b398	Use group updates when setting transaction status in clog. Commit `0e141c0fbb` introduced a mechanism to reduce contention on ProcArrayLock by having a single process clear XIDs in the procArray on behalf of multiple processes, reducing the need to hand the lock around. Use a similar mechanism to reduce contention on CLogControlLock. Testing shows that this very significantly reduces the amount of time waiting for CLogControlLock on high-concurrency pgbench tests run on a large multi-socket machines; whether that translates into a TPS improvement depends on how much of that contention is simply shifted to some other lock, particularly WALWriteLock. Amit Kapila, with some cosmetic changes by me. Extensively reviewed, tested, and benchmarked over a period of about 15 months by Simon Riggs, Robert Haas, Andres Freund, Jesper Pedersen, and especially by Tomas Vondra and Dilip Kumar. Discussion: http://postgr.es/m/CAA4eK1L_snxM_JcrzEstNq9P66++F4kKFce=1r5+D1vzPofdtg@mail.gmail.com Discussion: http://postgr.es/m/CAA4eK1LyR2A+m=RBSZ6rcPEwJ=rVi1ADPSndXHZdjn56yqO6Vg@mail.gmail.com Discussion: http://postgr.es/m/91d57161-d3ea-0cc2-6066-80713e4f90d7@2ndquadrant.com	2017-03-09 17:49:01 -05:00
Robert Haas	98e6e89040	tidbitmap: Support shared iteration. When a shared iterator is used, each call to tbm_shared_iterate() returns a result that has not yet been returned to any process attached to the shared iterator. In other words, each cooperating processes gets a disjoint subset of the full result set, but all results are returned exactly once. This is infrastructure for parallel bitmap heap scan. Dilip Kumar. The larger patch set of which this is a part has been reviewed and tested by (at least) Andres Freund, Amit Khandekar, Tushar Ahuja, Rafia Sabih, Haribabu Kommi, and Thomas Munro. Discussion: http://postgr.es/m/CAFiTN-uc4=0WxRGfCzs-xfkMYcSEWUC-Fon6thkJGjkh9i=13A@mail.gmail.com	2017-03-08 08:09:38 -05:00
Robert Haas	b0f18cb77f	hash: Refactor bucket squeeze code. In preparation for adding write-ahead logging to hash indexes, refactor _hash_freeovflpage and _hash_squeezebucket so that all related page modifications happen in a single section of code. The previous coding assumed that it would be fine to move tuples one at a time, and also that the various operations involved in freeing an overflow page didn't necessarily all need to be done together, all of which is true if you don't care about write-ahead logging. Amit Kapila, with slight changes by me.	2017-02-27 22:34:21 +05:30
Robert Haas	85c11324ca	Rename user-facing tools with "xlog" in the name to say "wal". This means pg_receivexlog because pg_receivewal, pg_resetxlog becomes pg_resetwal, and pg_xlogdump becomes pg_waldump.	2017-02-09 16:23:46 -05:00
Heikki Linnakangas	181bdb90ba	Fix typos in comments. Backpatch to all supported versions, where applicable, to make backpatching of future fixes go more smoothly. Josh Soref Discussion: https://www.postgresql.org/message-id/CACZqfqCf+5qRztLPgmmosr-B0Ye4srWzzw_mo4c_8_B_mtjmJQ@mail.gmail.com	2017-02-06 11:33:58 +02:00
Andrew Dunstan	f1169ab501	Don't count background workers against a user's connection limit. Doing so doesn't seem to be within the purpose of the per user connection limits, and has particularly unfortunate effects in conjunction with parallel queries. Backpatch to 9.6 where parallel queries were introduced. David Rowley, reviewed by Robert Haas and Albe Laurenz.	2017-02-01 18:02:43 -05:00
Peter Eisentraut	665d1fad99	Logical replication - Add PUBLICATION catalogs and DDL - Add SUBSCRIPTION catalog and DDL - Define logical replication protocol and output plugin - Add logical replication workers From: Petr Jelinek <petr@2ndquadrant.com> Reviewed-by: Steve Singer <steve@ssinger.info> Reviewed-by: Andres Freund <andres@anarazel.de> Reviewed-by: Erik Rijkers <er@xs4all.nl> Reviewed-by: Peter Eisentraut <peter.eisentraut@2ndquadrant.com>	2017-01-20 09:04:49 -05:00
Peter Eisentraut	352a24a1f9	Generate fmgr prototypes automatically Gen_fmgrtab.pl creates a new file fmgrprotos.h, which contains prototypes for all functions registered in pg_proc.h. This avoids having to manually maintain these prototypes across a random variety of header files. It also automatically enforces a correct function signature, and since there are warnings about missing prototypes, it will detect functions that are defined but not registered in pg_proc.h (or otherwise used). Reviewed-by: Pavel Stehule <pavel.stehule@gmail.com>	2017-01-17 14:06:07 -05:00
Robert Haas	0fad355bec	Update obsolete comments in lwlock.h. The typical size of an LWLock is now 16 bytes even on 64-bit platforms, and the size of slock_t is now irrelevant. But pg_atomic_uint32 can (perhaps surprisingly) still be larger than 4 bytes, so there's still some marginal point to allowing LWLOCK_MINIMAL_SIZE == 64. Commit `008608b9d5` made the changes that led to the need for these updates.	2017-01-04 12:03:40 -05:00
Bruce Momjian	1d25779284	Update copyright via script for 2017	2017-01-03 13:48:53 -05:00
Robert Haas	3b790d256f	Fix corner-case bug in WaitEventSetWaitBlock on Windows. If we do not reset the FD_READ event, WaitForMultipleObjects won't return it again again unless we've meanwhile read from the socket, which is generally true but not guaranteed. WaitEventSetWaitBlock itself may fail to return the event to the caller if the latch is also set, and even if we changed that, the caller isn't obliged to handle all returned events at once. On non-Windows systems, the socket-read event is purely level-triggered, so this issue does not exist. To fix, make Windows reset the event when needed. This bug was introduced by `98a64d0bd7`, and causes hangs when trying to use the pldebugger extension. Patch by Amit Kapial. Reported and tested by Ashutosh Sharma, who also provided some analysis. Further analysis by Michael Paquier.	2016-12-21 11:01:48 -05:00

1 2 3 4 5 ...

1074 commits