postgresql

mirror of https://github.com/postgres/postgres.git synced 2026-04-23 07:07:22 -04:00

Author	SHA1	Message	Date
Heikki Linnakangas	cf3fff6326	Initialize the new bgwriterLatch field properly. Peter Geoghegan	2012-01-27 18:25:32 +02:00
Heikki Linnakangas	6d90eaaa89	Make bgwriter sleep longer when it has no work to do, to save electricity. To make it wake up promptly when activity starts again, backends nudge it by setting a latch in MarkBufferDirty(). The latch is kept set while bgwriter is active, so there is very little overhead from that when the system is busy. It is only armed before going into longer sleep. Peter Geoghegan, with some changes by me.	2012-01-26 18:39:13 +02:00
Robert Haas	467ff207f5	Add missing #include, to suppress compiler warning.	2012-01-26 10:16:26 -05:00
Magnus Hagander	61cb8c5abb	Add deadlock counter to pg_stat_database Adds a counter that tracks number of deadlocks that occurred in each database to pg_stat_database. Magnus Hagander, reviewed by Jaime Casanova	2012-01-26 15:58:19 +01:00
Robert Haas	0e549697d1	Classify DROP operations by whether or not they are user-initiated. This doesn't do anything useful just yet, but is intended as supporting infrastructure for allowing sepgsql to sensibly check DROP permissions. KaiGai Kohei and Robert Haas	2012-01-26 09:30:27 -05:00
Magnus Hagander	bc3347484a	Track temporary file count and size in pg_stat_database Add counters for number and size of temporary files used for spill-to-disk queries for each database to the pg_stat_database view. Tomas Vondra, review by Magnus Hagander	2012-01-26 14:41:19 +01:00
Simon Riggs	c172b7b02e	Resolve timing issue with logging locks for Hot Standby. We log AccessExclusiveLocks for replay onto standby nodes, but because of timing issues on ProcArray it is possible to log a lock that is still held by a just committed transaction that is very soon to be removed. To avoid any timing issue we avoid applying locks made by transactions with InvalidXid. Simon Riggs, bug report Tom Lane, diagnosis Pavan Deolasee	2012-01-23 23:37:32 +00:00
Heikki Linnakangas	326b922e8b	Fix corner case in cleanup of transactions using SSI. When the only remaining active transactions are READ ONLY, we do a "partial cleanup" of committed transactions because certain types of conflicts aren't possible anymore. For committed r/w transactions, we release the SIREAD locks but keep the SERIALIZABLEXACT. However, for committed r/o transactions, we can go further and release the SERIALIZABLEXACT too. The problem was with the latter case: we were returning the SERIALIZABLEXACT to the free list without removing it from the finished list. The only real change in the patch is the SHMQueueDelete line, but I also reworked some of the surrounding code to make it obvious that r/o and r/w transactions are handled differently -- the existing code felt a bit too clever. Dan Ports	2012-01-18 17:57:33 +02:00
Robert Haas	33aaa139e6	Make the number of CLOG buffers adaptive, based on shared_buffers. Previously, this was hardcoded: we always had 8. Performance testing shows that isn't enough, especially on big SMP systems, so we allow it to scale up as high as 32 when there's adequate memory. On the flip side, when shared_buffers is very small, drop the number of CLOG buffers down to as little as 4, so that we can start the postmaster even when very little shared memory is available. Per extensive discussion with Simon Riggs, Tom Lane, and others on pgsql-hackers.	2012-01-06 14:32:18 -05:00
Robert Haas	7e4911b2ae	Fix variable confusion in BufferSync(). As noted by Heikki Linnakangas, the previous coding confused the "flags" variable with the "mask" variable. The effect of this appears to be that unlogged buffers would get written out at every checkpoint rather than only at shutdown time. Although that's arguably an acceptable failure mode, I'm back-patching this change, since it seems like a poor idea to rely on this happening to work.	2012-01-06 08:35:48 -05:00
Bruce Momjian	e126958c2e	Update copyright notices for year 2012.	2012-01-01 18:01:58 -05:00
Peter Eisentraut	d383c23f6f	Remove support for on_exit() All supported platforms support the C89 standard function atexit() (SunOS 4 probably being the last one not to), and supporting both makes the code clumsy.	2011-12-27 20:57:59 +02:00
Tom Lane	d0024cd188	Avoid crashing when we have problems unlinking files post-commit. smgrdounlink takes care to not throw an ERROR if it fails to unlink something, but that caution was rendered useless by commit `3396000684`, which put an smgrexists call in front of it; smgrexists does throw error if anything looks funny, such as getting a permissions error from trying to open the file. If that happens post-commit, you get a PANIC, and what's worse the same logic appears in the WAL replay code, so the database even fails to restart. Restore the intended behavior by removing the smgrexists call --- it isn't accomplishing anything that we can't do better by adjusting mdunlink's ideas of whether it ought to warn about ENOENT or not. Per report from Joseph Shraibman of unrecoverable crash after trying to drop a table whose FSM fork had somehow gotten chmod'd to 000 permissions. Backpatch to 8.4, where the bogus coding was introduced.	2011-12-20 15:00:36 -05:00
Robert Haas	0d76b60db4	Various micro-optimizations for GetSnapshotData(). Heikki Linnakangas had the idea of rearranging GetSnapshotData to avoid checking for sub-XIDs when no top-level XID is present. This patch does that plus a bit of further, related rearrangement. Benchmarking shows a significant improvement on unlogged tables at higher concurrency levels, and mostly indifferent results on permanent tables (which are presumably bottlenecked elsewhere). Most of the benefit seems to come from using the new NormalTransactionIdPrecedes() macro rather than the function call TransactionIdPrecedes().	2011-12-16 21:48:47 -05:00
Alvaro Herrera	9d3b502443	Improve logging of autovacuum I/O activity This adds some I/O stats to the logging of autovacuum (when the operation takes long enough that log_autovacuum_min_duration causes it to be logged), so that it is easier to tune. Notably, it adds buffer I/O counts (hits, misses, dirtied) and read and write rate. Authors: Greg Smith and Noah Misch	2011-11-25 16:34:32 -03:00
Robert Haas	ed0b409d22	Move "hot" members of PGPROC into a separate PGXACT array. This speeds up snapshot-taking and reduces ProcArrayLock contention. Also, the PGPROC (and PGXACT) structures used by two-phase commit are now allocated as part of the main array, rather than in a separate array, and we keep ProcArray sorted in pointer order. These changes are intended to minimize the number of cache lines that must be pulled in to take a snapshot, and testing shows a substantial increase in performance on both read and write workloads at high concurrencies. Pavan Deolasee, Heikki Linnakangas, Robert Haas	2011-11-25 08:02:10 -05:00
Tom Lane	40d35036bb	Avoid floating-point underflow while tracking buffer allocation rate. When the system is idle for awhile after activity, the "smoothed_alloc" state variable in BgBufferSync converges slowly to zero. With standard IEEE float arithmetic this results in several iterations with denormalized values, which causes kernel traps and annoying log messages on some poorly-designed platforms. There's no real need to track such small values of smoothed_alloc, so we can prevent the kernel traps by forcing it to zero as soon as it's too small to be interesting for our purposes. This issue is purely cosmetic, since the iterations don't happen fast enough for the kernel traps to pose any meaningful performance problem, but still it seems worth shutting up the log messages. The kernel log messages were previously reported by a number of people, but kudos to Greg Matthews for tracking down exactly where they were coming from.	2011-11-19 00:35:29 -05:00
Robert Haas	71b2b657c0	Revert removal of trace_userlocks, because userlocks aren't gone. This reverts commit `0180bd6180`. contrib/userlock is gone, but user-level locking still exists, and is exposed via the pg_advisory* family of functions.	2011-11-10 17:54:27 -05:00
Simon Riggs	86e3364899	Derive oldestActiveXid at correct time for Hot Standby. There was a timing window between when oldestActiveXid was derived and when it should have been derived that only shows itself under heavy load. Move code around to ensure correct timing of derivation. No change to StartupSUBTRANS() code, which is where this failed. Bug report by Chris Redekop	2011-11-02 08:54:56 +00:00
Simon Riggs	10b7c686e5	Start Hot Standby faster when initial snapshot is incomplete. If the initial snapshot had overflowed then we can start whenever the latest snapshot is empty, not overflowed or as we did already, start when the xmin on primary was higher than xmax of our starting snapshot, which proves we have full snapshot data. Bug report by Chris Redekop	2011-11-02 08:47:43 +00:00
Robert Haas	c2891b46a4	Initialize myProcLocks queues just once, at postmaster startup. In assert-enabled builds, we assert during the shutdown sequence that the queues have been properly emptied, and during process startup that we are inheriting empty queues. In non-assert enabled builds, we just save a few cycles.	2011-11-01 22:44:54 -04:00
Simon Riggs	806a2aee37	Split work of bgwriter between 2 processes: bgwriter and checkpointer. bgwriter is now a much less important process, responsible for page cleaning duties only. checkpointer is now responsible for checkpoints and so has a key role in shutdown. Later patches will correct doc references to the now old idea that bgwriter performs checkpoints. Has beneficial effect on performance at high write rates, but mainly refactoring to more easily allow changes for power reduction by simplifying the previously tortuous code required to allow page cleaning and checkpointing to time slice in the same process. Patch by me, Review by Dickson Guedes	2011-11-01 17:14:47 +00:00
Robert Haas	53f1ca59b5	Allow hint bits to be set sooner for temporary and unlogged tables. We need not wait until the commit record is durably on disk, because in the event of a crash the page we're updating with hint bits will be gone anyway. Per off-list report from Heikki Linnakangas, this can significantly degrade the performance of unlogged tables; I was able to show a 2x speedup from this patch on a pgbench run with scale factor 15. In practice, this will mostly help small, heavily updated tables, because on larger tables you're unlikely to run into the same row again before the commit record makes it out to disk.	2011-10-28 17:08:09 -04:00
Heikki Linnakangas	cbf65509bb	Fix the number of lwlocks needed by the "fast path" lock patch. It needs one lock per backend or auxiliary process - the need for a lock for each aux process was not accounted for in NumLWLocks(). No-one noticed, because the three locks needed for the three aux processes fit into the few extra lwlocks we allocate for 3rd party modules that don't call RequestAddinLWLocks() (NUM_USER_DEFINED_LWLOCKS, 4 by default).	2011-10-27 22:39:58 +03:00
Tom Lane	bb446b689b	Support synchronization of snapshots through an export/import procedure. A transaction can export a snapshot with pg_export_snapshot(), and then others can import it with SET TRANSACTION SNAPSHOT. The data does not leave the server so there are no security issues. A snapshot can only be imported while the exporting transaction is still running, and there are some other restrictions. I'm not totally convinced that we've covered all the bases for SSI (true serializable) mode, but it works fine for lesser isolation modes. Joachim Wieland, reviewed by Marko Tiikkaja, and rather heavily modified by Tom Lane	2011-10-22 18:23:30 -04:00
Tom Lane	b4a0223d00	Simplify and improve ProcessStandbyHSFeedbackMessage logic. There's no need to clamp the standby's xmin to be greater than GetOldestXmin's result; if there were any such need this logic would be hopelessly inadequate anyway, because it fails to account for within-database versus cluster-wide values of GetOldestXmin. So get rid of that, and just rely on sanity-checking that the xmin is not wrapped around relative to the nextXid counter. Also, don't reset the walsender's xmin if the current feedback xmin is indeed out of range; that just creates more problems than we already had. Lastly, don't bother to take the ProcArrayLock; there's no need to do that to set xmin. Also improve the comments about this in GetOldestXmin itself.	2011-10-20 19:43:31 -04:00
Bruce Momjian	0180bd6180	Remove all "traces" of trace_userlocks, because userlocks were removed in PG 8.2.	2011-10-13 19:59:57 -04:00
Robert Haas	e76bcaba9c	Repair breakage in VirtualXactLock. I broke this in commit `84e3712677`. Report and fix by Fujii Masao.	2011-10-11 07:39:09 -04:00
Tom Lane	57eb009092	Allow snapshot references to still work during transaction abort. In REPEATABLE READ (nee SERIALIZABLE) mode, an attempt to do GetTransactionSnapshot() between AbortTransaction and CleanupTransaction failed, because GetTransactionSnapshot would recompute the transaction snapshot (which is already wrong, given the isolation mode) and then re-register it in the TopTransactionResourceOwner, leading to an Assert because the TopTransactionResourceOwner should be empty of resources after AbortTransaction. This is the root cause of bug #6218 from Yamamoto Takashi. While changing plancache.c to avoid requesting a snapshot when handling a ROLLBACK masks the problem, I think this is really a snapmgr.c bug: it's lower-level than the resource manager mechanism and should not be shutting itself down before we unwind resource manager resources. However, just postponing the release of the transaction snapshot until cleanup time didn't work because of the circular dependency with TopTransactionResourceOwner. Fix by managing the internal reference to that snapshot manually instead of depending on TopTransactionResourceOwner. This saves a few cycles as well as making the module layering more straightforward. predicate.c's dependencies on TopTransactionResourceOwner go away too. I think this is a longstanding bug, but there's no evidence that it's more than a latent bug, so it doesn't seem worth any risk of back-patching.	2011-09-26 22:25:28 -04:00
Robert Haas	0c8eda6258	Memory barrier support for PostgreSQL. This is not actually used anywhere yet, but it gets the basic infrastructure in place. It is fairly likely that there are bugs, and support for some important platforms may be missing, so we'll need to refine this as we go along.	2011-09-23 17:52:43 -04:00
Peter Eisentraut	1b81c2fe6e	Remove many -Wcast-qual warnings This addresses only those cases that are easy to fix by adding or moving a const qualifier or removing an unnecessary cast. There are many more complicated cases remaining.	2011-09-11 21:54:32 +03:00
Tom Lane	a7801b62f2	Move Timestamp/Interval typedefs and basic macros into datatype/timestamp.h. As per my recent proposal, this refactors things so that these typedefs and macros are available in a header that can be included in frontend-ish code. I also changed various headers that were undesirably including utils/timestamp.h to include datatype/timestamp.h instead. Unsurprisingly, this showed that half the system was getting utils/timestamp.h by way of xlog.h. No actual code changes here, just header refactoring.	2011-09-09 13:23:41 -04:00
Tom Lane	1609797c25	Clean up the #include mess a little. walsender.h should depend on xlog.h, not vice versa. (Actually, the inclusion was circular until a couple hours ago, which was even sillier; but Bruce broke it in the expedient rather than logically correct direction.) Because of that poor decision, plus blind application of pgrminclude, we had a situation where half the system was depending on xlog.h to include such unrelated stuff as array.h and guc.h. Clean up the header inclusion, and manually revert a lot of what pgrminclude had done so things build again. This episode reinforces my feeling that pgrminclude should not be run without adult supervision. Inclusion changes in header files in particular need to be reviewed with great care. More generally, it'd be good if we had a clearer notion of module layering to dictate which headers can sanely include which others ... but that's a big task for another day.	2011-09-04 01:13:16 -04:00
Bruce Momjian	6416a82a62	Remove unnecessary #include references, per pgrminclude script.	2011-09-01 10:04:27 -04:00
Robert Haas	c01c25fbe5	Improve spinlock performance for HP-UX, ia64, non-gcc. At least on this architecture, it's very important to spin on a non-atomic instruction and only retry the atomic once it appears that it will succeed. To fix this, split TAS() into two macros: TAS(), for trying to grab the lock the first time, and TAS_SPIN(), for spinning until we get it. TAS_SPIN() defaults to same as TAS(), but we can override it when we know there's a better way. It's likely that some of the other cases in s_lock.h require similar treatment, but this is the only one we've got conclusive evidence for at present.	2011-08-29 10:05:48 -04:00
Bruce Momjian	f261deb4b4	Add missing includes after pgrminclude run.	2011-08-26 18:15:14 -04:00
Robert Haas	7488936478	Typo fix.	2011-08-22 12:16:27 -04:00
Robert Haas	24bf1552f6	Remove obsolete README file. Perhaps we ought to add some other kind of documentation here instead, but for now let's get rid of this woefully obsolete description of the sinval machinery.	2011-08-18 09:49:41 -04:00
Peter Eisentraut	e5475a80d2	Add "Reason code" prefix to internal SSI error messages This makes it clearer that the error message is perhaps not supposed to be understood by users, and it also makes it somewhat clearer that it was not accidentally omitted from translation. Idea from Heikki Linnakangas, except that we don't mark "Reason code" for translation at this point, because that would make the implementation too cumbersome.	2011-08-15 15:20:16 +03:00
Tom Lane	4dab3d5ae1	Change the autovacuum launcher to use WaitLatch instead of a poll loop. In pursuit of this (and with the expectation that WaitLatch will be needed in more places), convert the latch field that was already added to PGPROC for sync rep into a generic latch that is activated for all PGPROC-owning processes, and change many of the standard backend signal handlers to set that latch when a signal happens. This will allow WaitLatch callers to be wakened properly by these signals. In passing, fix a whole bunch of signal handlers that had been hacked to do things that might change errno, without adding the necessary save/restore logic for errno. Also make some minor fixes in unix_latch.c, and clean up bizarre and unsafe scheme for disowning the process's latch. Much of this has to be back-patched into 9.1. Peter Geoghegan, with additional work by Tom	2011-08-10 12:22:21 -04:00
Tom Lane	4e15a4db5e	Documentation improvement and minor code cleanups for the latch facility. Improve the documentation around weak-memory-ordering risks, and do a pass of general editorialization on the comments in the latch code. Make the Windows latch code more like the Unix latch code where feasible; in particular provide the same Assert checks in both implementations. Fix poorly-placed WaitLatch call in syncrep.c. This patch resolves, for the moment, concerns around weak-memory-ordering bugs in latch-related code: we have documented the restrictions and checked that existing calls meet them. In 9.2 I hope that we will install suitable memory barrier instructions in SetLatch/ResetLatch, so that their callers don't need to be quite so careful.	2011-08-09 15:30:45 -04:00
Robert Haas	84e3712677	Create VXID locks "lazily" in the main lock table. Instead of entering them on transaction startup, we materialize them only when someone wants to wait, which will occur only during CREATE INDEX CONCURRENTLY. In Hot Standby mode, the startup process must also be able to probe for conflicting VXID locks, but the lock need never be fully materialized, because the startup process does not use the normal lock wait mechanism. Since most VXID locks never need to touch the lock manager partition locks, this can significantly reduce blocking contention on read-heavy workloads. Patch by me. Review by Jeff Davis.	2011-08-04 12:38:33 -04:00
Tom Lane	ac36e6f71f	Move CheckRecoveryConflictDeadlock() call to a safer place. This kluge was inserted in a spot apparently chosen at random: the lock manager's state is not yet fully set up for the wait, and in particular LockWaitCancel hasn't been armed by setting lockAwaited, so the ProcLock will not get cleaned up if the ereport is thrown. This seems to not cause any observable problem in trivial test cases, because LockReleaseAll will silently clean up the debris; but I was able to cause failures with tests involving subtransactions. Fixes breakage induced by commit `c85c941470`. Back-patch to all affected branches.	2011-08-02 15:16:29 -04:00
Tom Lane	2e53bd5517	Fix incorrect initialization of ProcGlobal->startupBufferPinWaitBufId. It was initialized in the wrong place and to the wrong value. With bad luck this could result in incorrect query-cancellation failures in hot standby sessions, should a HS backend be holding pin on buffer number 1 while trying to acquire a lock.	2011-08-02 13:23:52 -04:00
Robert Haas	85b436f7b1	Minor stylistic corrections.	2011-08-01 08:24:45 -04:00
Robert Haas	b4fbe392f8	Reduce sinval synchronization overhead. Testing shows that the overhead of acquiring and releasing SInvalReadLock and msgNumLock on high-core count boxes can waste a lot of CPU time and hurt performance. This patch adds a per-backend flag that allows us to skip all that locking in most cases. Further testing shows that this improves performance even when sinval traffic is very high. Patch by me. Review and testing by Noah Misch.	2011-07-29 16:46:13 -04:00
Peter Eisentraut	0fe8150827	Minor message style adjustment	2011-07-27 23:54:46 +03:00
Robert Haas	8e5ac74c12	Some refinement for the "fast path" lock patch. 1. In GetLockStatusData, avoid initializing instance before we've ensured that the array is large enough. Otherwise, if repalloc moves the block around, we're hosed. 2. Add the word "Relation" to the name of some identifiers, to avoid assuming that the fast-path mechanism will only ever apply to relations (though these particular parts certainly will). Some of the macros could possibly use similar treatment, but the names are getting awfully long already. 3. Add a missing word to comment in AtPrepare_Locks().	2011-07-19 12:10:15 -04:00
Peter Eisentraut	30f854537d	Change debug message from ereport to elog	2011-07-19 07:50:10 +03:00
Robert Haas	3cba8999b3	Create a "fast path" for acquiring weak relation locks. When an AccessShareLock, RowShareLock, or RowExclusiveLock is requested on an unshared database relation, and we can verify that no conflicting locks can possibly be present, record the lock in a per-backend queue, stored within the PGPROC, rather than in the primary lock table. This eliminates a great deal of contention on the lock manager LWLocks. This patch also refactors the interface between GetLockStatusData() and pg_lock_status() to be a bit more abstract, so that we don't rely so heavily on the lock manager's internal representation details. The new fast path lock structures don't have a LOCK or PROCLOCK structure to return, so we mustn't depend on that for purposes of listing outstanding locks. Review by Jeff Davis.	2011-07-18 00:49:28 -04:00

1336 commits