postgresql

mirror of https://github.com/postgres/postgres.git synced 2026-04-23 23:28:01 -04:00

Author	SHA1	Message	Date
Robert Haas	5cea1b0a8c	Backport fsync queue compaction logic to all supported branches. This backports commit `7f242d880b`, except for the counter in pg_stat_bgwriter. The underlying problem (namely, that a full fsync request queue causes terrible checkpoint behavior) continues to be reported in the wild, and this code seems to be safe and robust enough to risk back-porting the fix.	2012-06-26 06:47:25 -04:00
Tom Lane	248f1ac813	Fix memory leak in ARRAY(SELECT ...) subqueries. Repeated execution of an uncorrelated ARRAY_SUBLINK sub-select (which I think can only happen if the sub-select is embedded in a larger, correlated subquery) would leak memory for the duration of the query, due to not reclaiming the array generated in the previous execution. Per bug #6698 from Armando Miraglia. Diagnosis and fix idea by Heikki, patch itself by me. This has been like this all along, so back-patch to all supported versions.	2012-06-21 17:26:31 -04:00
Simon Riggs	46ceb398f0	Avoid early reuse of btree pages, causing incorrect query results. When we allowed read-only transactions to skip assigning XIDs we introduced the possibility that a fully deleted btree page could be reused. This broke the index link sequence which could then lead to indexscans silently returning fewer rows than would have been correct. The actual incidence of silent errors from this is thought to be very low because of the exact workload required and locking pre-conditions. Fix is to remove pages only if index page opaque->btpo.xact precedes RecentGlobalXmin. Noah Misch, reviewed and backpatched by Simon Riggs	2012-06-01 12:47:01 +01:00
Peter Eisentraut	d871821e47	Translation updates	2012-05-31 23:22:50 +03:00
Tom Lane	af6ee5e8d4	Ignore SECURITY DEFINER and SET attributes for a PL's call handler. It's not very sensible to set such attributes on a handler function; but if one were to do so, fmgr.c went into infinite recursion because it would call fmgr_security_definer instead of the handler function proper. There is no way for fmgr_security_definer to know that it ought to call the handler and not the original function referenced by the FmgrInfo's fn_oid, so it tries to do the latter, causing the whole process to start over again. Ordinarily such misconfiguration of a procedural language's handler could be written off as superuser error. However, because we allow non-superuser database owners to create procedural languages and the handler for such a language becomes owned by the database owner, it is possible for a database owner to crash the backend, which ideally shouldn't be possible without superuser privileges. In 9.2 and up we will adjust things so that the handler functions are always owned by superusers, but in existing branches this is a minor security fix. Problem noted by Noah Misch (after several of us had failed to detect it :-(). This is CVE-2012-2655.	2012-05-30 23:28:21 -04:00
Tom Lane	f5c5e7497b	Expand the allowed range of timezone offsets to +/-15:59:59 from Greenwich. We used to only allow offsets less than +/-13 hours, then it was +/14, then it was +/-15. That's still not good enough though, as per today's bug report from Patric Bechtel. This time I actually looked through the Olson timezone database to find the largest offsets used anywhere. The winners are Asia/Manila, at -15:56:00 until 1844, and America/Metlakatla, at +15:13:42 until 1867. So we'd better allow offsets less than +/-16 hours. Given the history, we are way overdue to have some greppable #define symbols controlling this, so make some ... and also remove an obsolete comment that didn't get fixed the last time. Back-patch to all supported branches.	2012-05-30 19:58:54 -04:00
Tom Lane	66cad2d7b4	Teach AbortOutOfAnyTransaction to clean up partially-started transactions. AbortOutOfAnyTransaction failed to do anything if the state it saw on entry corresponded to failing partway through StartTransaction. I fixed AbortCurrentTransaction to cope with that case way back in commit `60b2444cc3`, but evidently overlooked that AbortOutOfAnyTransaction should do likewise. Back-patch to all supported branches. It's not clear that this omission has any more-than-cosmetic consequences, but it's also not clear that it doesn't, so back-patching seems the least risky choice.	2012-05-28 23:57:27 -04:00
Tom Lane	35cc2be6f4	Prevent synchronized scanning when systable_beginscan chooses a heapscan. The only interesting-for-performance case wherein we force heapscan here is when we're rebuilding the relcache init file, and the only such case that is likely to be examining a catalog big enough to be syncscanned is RelationBuildTupleDesc. But the early-exit optimization in that code gets broken if we start the scan at a random place within the catalog, so that allowing syncscan is actually a big deoptimization if pg_attribute is large (at least for the normal case where the rows for core system catalogs have never been changed since initdb). Hence, prevent syncscan here. Per my testing pursuant to complaints from Jeff Frost and Greg Sabino Mullane, though neither of them seem to have actually hit this specific problem. Back-patch to 8.3, where syncscan was introduced.	2012-05-26 19:10:13 -04:00
Tom Lane	2a7c90c83b	Fix string truncation to be multibyte-aware in text_name and bpchar_name. Previously, casts to name could generate invalidly-encoded results. Also, make these functions match namein() more exactly, by consistently using palloc0() instead of ad-hoc zeroing code. Back-patch to all supported branches. Karl Schnaitter and Tom Lane	2012-05-25 17:35:14 -04:00
Tom Lane	3ce6fa5568	Ensure that seqscans check for interrupts at least once per page. If a seqscan encounters many consecutive pages containing only dead tuples, it can remain in the loop in heapgettup for a long time, and there was no CHECK_FOR_INTERRUPTS anywhere in that loop. This meant there were real-world situations where a query would be effectively uncancelable for long stretches. Add a check placed to occur once per page, which should be enough to provide reasonable response time without adding any measurable overhead. Report and patch by Merlin Moncure (though I tweaked it a bit). Back-patch to all supported branches.	2012-05-22 19:42:23 -04:00
Heikki Linnakangas	6e088424e2	Fix bug in to_tsquery(). We were using memcpy() to copy to a possibly overlapping memory region, which is a no-no. Use memmove() instead.	2012-05-16 14:39:52 +03:00
Tom Lane	de48dbade3	Fix Windows implementation of PGSemaphoreLock. The original coding failed to reset ImmediateInterruptOK before returning, which would potentially allow a subsequent query-cancel interrupt to be accepted at an unsafe point. This is a really nasty bug since it's so hard to predict the consequences, but they could be unpleasant. Also, ensure that signal handlers are serviced before this function returns, even if the semaphore is already set. This should make the behavior more like Unix. Back-patch to all supported versions.	2012-05-10 13:36:28 -04:00
Tom Lane	a45c63ea0e	Fix printing of whole-row Vars at top level of a SELECT targetlist. Normally whole-row Vars are printed as "tabname.". However, that does not work at top level of a targetlist, because per SQL standard the parser will think that the "" should result in column-by-column expansion; which is not at all what a whole-row Var implies. We used to just print the table name in such cases, which works most of the time; but it fails if the table name matches a column name available anywhere in the FROM clause. This could lead for instance to a view being interpreted differently after dump and reload. Adding parentheses doesn't fix it, but there is a reasonably simple kluge we can use instead: attach a no-op cast, so that the "*" isn't syntactically at top level anymore. This makes the printing of such whole-row Vars a lot more consistent with other Vars, and may indeed fix more cases than just the reported one; I'm suspicious that cases involving schema qualification probably didn't work properly before, either. Per bug report and fix proposal from Abbas Butt, though this patch is quite different in detail from his. Back-patch to all supported versions.	2012-04-27 19:49:40 -04:00
Tom Lane	aa0c797c90	Fix syslogger's rotation disable/re-enable logic. If it fails to open a new log file, the syslogger assumes there's something wrong with its parameters (such as log_directory), and stops attempting automatic time-based or size-based log file rotations. Sending it SIGHUP is supposed to start that up again. However, the original coding for that was really bogus, involving clobbering a couple of GUC variables and hoping that SIGHUP processing would restore them. Get rid of that technique in favor of maintaining a separate flag showing we've turned rotation off. Per report from Mark Kirkwood. Also, the syslogger will automatically attempt to create the log_directory directory if it doesn't exist, but that was only happening at startup. For consistency and ease of use, it should do the same whenever the value of log_directory is changed by SIGHUP. Back-patch to all supported branches.	2012-04-27 00:13:00 -04:00
Robert Haas	8d2ef107e6	Fix copyfuncs/equalfuncs support for ReassignOwnedStmt. Noah Misch	2012-04-18 10:46:43 -04:00
Tom Lane	8d444c90d8	Clamp indexscan filter condition cost estimate to be not less than zero. cost_index tries to estimate the per-tuple costs of evaluating filter conditions (a/k/a qpquals) by subtracting the estimated cost of the indexqual conditions from that of the baserestrictinfo conditions. This is correct so long as the indexquals list is a subset of the baserestrictinfo list. However, in the presence of derived indexable conditions it's completely wrong, leading to bogus or even negative scan cost estimates, as seen for example in bug #6579 from Istvan Endredy. In practice the problem isn't severe except in the specific case of a LIKE optimization on a functional index containing a very expensive function. A proper fix for this might change cost estimates by more than people would like for stable branches, so in the back branches let's just clamp the cost difference to be not less than zero. That will at least prevent completely insane behavior, while not changing the results normally.	2012-04-11 20:24:39 -04:00
Tom Lane	8555586707	Fix an Assert that turns out to be reachable after all. estimate_num_groups() gets unhappy with create table empty(); select * from empty except select * from empty e2; I can't see any actual use-case for such a query (and the table is illegal per SQL spec), but it seems like a good idea that it not cause an assert failure.	2012-04-09 11:59:11 -04:00
Heikki Linnakangas	cd8b1c6d48	set_stack_base() no longer needs to be called in PostgresMain. This was a thinko in previous commit. Now that stack base pointer is now set in PostmasterMain and SubPostmasterMain, it doesn't need to be set in PostgresMain anymore.	2012-04-08 19:41:59 +03:00
Heikki Linnakangas	89da5dc6db	Do stack-depth checking in all postmaster children. We used to only initialize the stack base pointer when starting up a regular backend, not in other processes. In particular, autovacuum workers can run arbitrary user code, and without stack-depth checking, infinite recursion in e.g an index expression will bring down the whole cluster. The comment about PL/Java using set_stack_base() is not yet true. As the code stands, PL/java still modifies the stack_base_ptr variable directly. However, it's been discussed in the PL/Java mailing list that it should be changed to use the function, because PL/Java is currently oblivious to the register stack used on Itanium. There's another issues with PL/Java, namely that the stack base pointer it sets is not really the base of the stack, it could be something close to the bottom of the stack. That's a separate issue that might need some further changes to this code, but that's a different story. Backpatch to all supported releases.	2012-04-08 19:09:26 +03:00
Tom Lane	3a7fdb7068	Fix misleading output from gin_desc(). XLOG_GIN_UPDATE_META_PAGE and XLOG_GIN_DELETE_LISTPAGE records were printed with a list link field labeled as "blkno", which was confusing, especially when the link was empty (InvalidBlockNumber). Print the metapage block number instead, since that's what's actually being updated. We could include the link values too as a separate field, but not clear it's worth the trouble. Back-patch to 8.4 where the dubious code was added.	2012-04-06 18:10:41 -04:00
Tom Lane	260b119c58	Fix syslogger to not lose log coherency under high load. The original coding of the syslogger had an arbitrary limit of 20 large messages concurrently in progress, after which it would just punt and dump message fragments to the output file separately. Our ambitions are a bit higher than that now, so allow the data structure to expand as necessary. Reported and patched by Andrew Dunstan; some editing by Tom	2012-04-04 15:05:30 -04:00
Tom Lane	0e74f1f6e0	Fix COPY FROM for null marker strings that correspond to invalid encoding. The COPY documentation says "COPY FROM matches the input against the null string before removing backslashes". It is therefore reasonable to presume that null markers like E'\\0' will work ... and they did, until someone put the tests in the wrong order during microoptimization-driven rewrites. Since then, we've been failing if the null marker is something that would de-escape to an invalidly-encoded string. Since null markers generally need to be something that can't appear in the data, this represents a nontrivial loss of functionality; surprising nobody noticed it earlier. Per report from Jeff Davis. Backpatch to 8.4 where this got broken.	2012-03-25 23:17:38 -04:00
Tom Lane	0a0ca1cb18	Fix planner's handling of outer PlaceHolderVars within subqueries. For some reason, in the original coding of the PlaceHolderVar mechanism I had supposed that PlaceHolderVars couldn't propagate into subqueries. That is of course entirely possible. When it happens, we need to treat an outer-level PlaceHolderVar much like an outer Var or Aggref, that is SS_replace_correlation_vars() needs to replace the PlaceHolderVar with a Param, and then when building the finished SubPlan we have to provide the PlaceHolderVar expression as an actual parameter for the SubPlan. The handling of the contained expression is a bit delicate but it can be treated exactly like an Aggref's expression. In addition to the missing logic in subselect.c, prepjointree.c was failing to search subqueries for PlaceHolderVars that need their relids adjusted during subquery pullup. It looks like everyplace else that touches PlaceHolderVars got it right, though. Per report from Mark Murawski. In 9.1 and HEAD, queries affected by this oversight would fail with "ERROR: Upper-level PlaceHolderVar found where not expected". But in 9.0 and 8.4, you'd silently get possibly-wrong answers, since the value transmitted into the subquery wouldn't go to null when it should.	2012-03-24 16:22:00 -04:00
Robert Haas	1d058a292c	Don't allow CREATE TABLE AS to put relations in pg_global. This was never intended to be allowed, and is blocked for an ordinary CREATE TABLE, but CREATE TABLE AS slipped through the cracks. This commit won't do anything to fix existing cases where this has loophole has been exploited, but it still seems prudent to lock it down going forward. Back-branch commit only, as this problem has been refactored away on the master branch. Andres Freund	2012-03-21 12:39:12 -04:00
Tom Lane	5ce0a842a6	Fix some more bugs in GIN's WAL replay logic. In commit `4016bdef8a` I fixed a bunch of ginxlog.c bugs having to do with not handling XLogReadBuffer failures correctly. However, in ginRedoUpdateMetapage and ginRedoDeleteListPages, I unaccountably thought that failure to read the metapage would be impossible and just put in an elog(PANIC) call. This is of course wrong: failure is exactly what will happen if the index got dropped (or rebuilt) between creation of the WAL record and the crash we're trying to recover from. I believe this explains Nicholas Wilson's recent report of these errors getting reached. Also, fix memory leak in forgetIncompleteSplit. This wasn't of much concern when the code was written, but in a long-running standby server page split records could be expected to accumulate indefinitely. Back-patch to 8.4 --- before that, GIN didn't have a metapage.	2012-02-26 15:12:34 -05:00
Tom Lane	4ee9db0cd9	Remove arbitrary limitation on length of common name in SSL certificates. Both libpq and the backend would truncate a common name extracted from a certificate at 32 bytes. Replace that fixed-size buffer with dynamically allocated string so that there is no hard limit. While at it, remove the code for extracting peer_dn, which we weren't using for anything; and don't bother to store peer_cn longer than we need it in libpq. This limit was not so terribly unreasonable when the code was written, because we weren't using the result for anything critical, just logging it. But now that there are options for checking the common name against the server host name (in libpq) or using it as the user's name (in the server), this could result in undesirable failures. In the worst case it even seems possible to spoof a server name or user name, if the correct name is exactly 32 bytes and the attacker can persuade a trusted CA to issue a certificate in which that string is a prefix of the certificate's common name. (To exploit this for a server name, he'd also have to send the connection astray via phony DNS data or some such.) The case that this is a realistic security threat is a bit thin, but nonetheless we'll treat it as one. Back-patch to 8.4. Older releases contain the faulty code, but it's not a security problem because the common name wasn't used for anything interesting. Reported and patched by Heikki Linnakangas Security: CVE-2012-0867	2012-02-23 15:48:21 -05:00
Tom Lane	993b3e5084	Require execute permission on the trigger function for CREATE TRIGGER. This check was overlooked when we added function execute permissions to the system years ago. For an ordinary trigger function it's not a big deal, since trigger functions execute with the permissions of the table owner, so they couldn't do anything the user issuing the CREATE TRIGGER couldn't have done anyway. However, if a trigger function is SECURITY DEFINER, that is not the case. The lack of checking would allow another user to install it on his own table and then invoke it with, essentially, forged input data; which the trigger function is unlikely to realize, so it might do something undesirable, for instance insert false entries in an audit log table. Reported by Dinesh Kumar, patch by Robert Haas Security: CVE-2012-0866	2012-02-23 15:39:14 -05:00
Peter Eisentraut	94e2495ded	Translation updates	2012-02-23 20:33:03 +02:00
Tom Lane	7b6a37f5ec	Don't clear btpo_cycleid during _bt_vacuum_one_page. When "vacuuming" a single btree page by removing LP_DEAD tuples, we are not actually within a vacuum operation, but rather in an ordinary insertion process that could well be running concurrently with a vacuum. So clearing the cycleid is incorrect, and could cause the concurrent vacuum to miss removing tuples that it needs to remove. This is a longstanding bug introduced by commit `e6284649b9` of 2006-07-25. I believe it explains Maxim Boguk's recent report of index corruption, and probably some other previously unexplained reports. In 9.0 and up this is a one-line fix; before that we need to introduce a flag to tell _bt_delitems what to do.	2012-02-21 15:03:56 -05:00
Magnus Hagander	a863963fcd	Avoid double close of file handle in syslogger on win32 This causes an exception when running under a debugger or in particular when running on a debug version of Windows. Patch from MauMau	2012-02-21 17:14:27 +01:00
Tom Lane	cbfb9ff790	Fix regex back-references that are directly quantified with . The syntax "\n", that is a backref with a * quantifier directly applied to it, has never worked correctly in Spencer's library. This has been an open bug in the Tcl bug tracker since 2005: https://sourceforge.net/tracker/index.php?func=detail&aid=1115587&group_id=10894&atid=110894 The core of the problem is in parseqatom(), which first changes "\n" to "\n+\|" and then applies repeat() to the NFA representing the backref atom. repeat() thinks that any arc leading into its "rp" argument is part of the sub-NFA to be repeated. Unfortunately, since parseqatom() already created the arc that was intended to represent the empty bypass around "\n+", this arc gets moved too, so that it now leads into the state loop created by repeat(). Thus, what was supposed to be an "empty" bypass gets turned into something that represents zero or more repetitions of the NFA representing the backref atom. In the original example, in place of ^([bc])\1$ we now have something that acts like ^([bc])(\1+\|[bc])$ At runtime, the branch involving the actual backref fails, as it's supposed to, but then the other branch succeeds anyway. We could no doubt fix this by some rearrangement of the operations in parseqatom(), but that code is plenty ugly already, and what's more the whole business of converting "x" to "x+\|" probably needs to go away to fix another problem I'll mention in a moment. Instead, this patch suppresses the *-conversion when the target is a simple backref atom, leaving the case of m == 0 to be handled at runtime. This makes the patch in regcomp.c a one-liner, at the cost of having to tweak cbrdissect() a little. In the event I went a bit further than that and rewrote cbrdissect() to check all the string-length-related conditions before it starts comparing characters. It seems a bit stupid to possibly iterate through many copies of an n-character backreference, only to fail at the end because the target string's length isn't a multiple of n --- we could have found that out before starting. The existing coding could only be a win if integer division is hugely expensive compared to character comparison, but I don't know of any modern machine where that might be true. This does not fix all the problems with quantified back-references. In particular, the code is still broken for back-references that appear within a larger expression that is quantified (so that direct insertion of the quantification limits into the BACKREF node doesn't apply). I think fixing that will take some major surgery on the NFA code, specifically introducing an explicit iteration node type instead of trying to transform iteration into concatenation of modified regexps. Back-patch to all supported branches. In HEAD, also add a regression test case for this. (It may seem a bit silly to create a regression test file for just one test case; but I'm expecting that we will soon import a whole bunch of regex regression tests from Tcl, so might as well create the infrastructure now.)	2012-02-20 00:52:54 -05:00
Tom Lane	7b4fcabb2f	Avoid problems with OID wraparound during WAL replay. Fix a longstanding thinko in replay of NEXTOID and checkpoint records: we tried to advance nextOid only if it was behind the value in the WAL record, but the comparison would draw the wrong conclusion if OID wraparound had occurred since the previous value. Better to just unconditionally assign the new value, since OID assignment shouldn't be happening during replay anyway. The consequences of a failure to update nextOid would be pretty minimal, since we have long had the code set up to obtain another OID and try again if the generated value is already in use. But in the worst case there could be significant performance glitches while such loops iterate through many already-used OIDs before finding a free one. The odds of a wraparound happening during WAL replay would be small in a crash-recovery scenario, and the length of any ensuing OID-assignment stall quite limited anyway. But neither of these statements hold true for a replication slave that follows a WAL stream for a long period; its behavior upon going live could be almost unboundedly bad. Hence it seems worth back-patching this fix into all supported branches. Already fixed in HEAD in commit `c6d76d7c82`.	2012-02-06 13:14:58 -05:00
Heikki Linnakangas	45bcc71980	Accept a non-existent value in "ALTER USER/DATABASE SET ..." command. When default_text_search_config, default_tablespace, or temp_tablespaces setting is set per-user or per-database, with an "ALTER USER/DATABASE SET ..." statement, don't throw an error if the text search configuration or tablespace does not exist. In case of text search configuration, even if it doesn't exist in the current database, it might exist in another database, where the setting is intended to have its effect. This behavior is now the same as search_path's. Tablespaces are cluster-wide, so the same argument doesn't hold for tablespaces, but there's a problem with pg_dumpall: it dumps "ALTER USER SET ..." statements before the "CREATE TABLESPACE" statements. Arguably that's pg_dumpall's fault - it should dump the statements in such an order that the tablespace is created first and then the "ALTER USER SET default_tablespace ..." statements after that - but it seems better to be consistent with search_path and default_text_search_config anyway. Besides, you could still create a dump that throws an error, by creating the tablespace, running "ALTER USER SET default_tablespace", then dropping the tablespace and running pg_dumpall on that. Backpatch to all supported versions.	2012-01-30 11:39:50 +02:00
Tom Lane	d23c38d46c	Make executor's SELECT INTO code save and restore original tuple receiver. As previously coded, the QueryDesc's dest pointer was left dangling (pointing at an already-freed receiver object) after ExecutorEnd. It's a bit astonishing that it took us this long to notice, and I'm not sure that the known problem case with SQL functions is the only one. Fix it by saving and restoring the original receiver pointer, which seems the most bulletproof way of ensuring any related bugs are also covered. Per bug #6379 from Paul Ramsey. Back-patch to 8.4 where the current handling of SELECT INTO was introduced.	2012-01-04 18:31:14 -05:00
Tom Lane	9e2b3db158	Update per-column ACLs, not only per-table ACL, when changing table owner. We forgot to modify column ACLs, so privileges were still shown as having been granted by the old owner. This meant that neither the new owner nor a superuser could revoke the now-untraceable-to-table-owner permissions. Per bug #6350 from Marc Balmer. This has been wrong since column ACLs were added, so back-patch to 8.4.	2011-12-21 18:23:30 -05:00
Tom Lane	ae25b3ba86	Avoid crashing when we have problems unlinking files post-commit. smgrdounlink takes care to not throw an ERROR if it fails to unlink something, but that caution was rendered useless by commit `3396000684`, which put an smgrexists call in front of it; smgrexists does throw error if anything looks funny, such as getting a permissions error from trying to open the file. If that happens post-commit, you get a PANIC, and what's worse the same logic appears in the WAL replay code, so the database even fails to restart. Restore the intended behavior by removing the smgrexists call --- it isn't accomplishing anything that we can't do better by adjusting mdunlink's ideas of whether it ought to warn about ENOENT or not. Per report from Joseph Shraibman of unrecoverable crash after trying to drop a table whose FSM fork had somehow gotten chmod'd to 000 permissions. Backpatch to 8.4, where the bogus coding was introduced.	2011-12-20 15:01:03 -05:00
Heikki Linnakangas	81f4e6cd27	Revert the behavior of inet/cidr functions to not unpack the arguments. I forgot to change the functions to use the PG_GETARG_INET_PP() macro, when I changed DatumGetInetP() to unpack the datum, like Datum*P macros usually do. Also, I screwed up the definition of the PG_GETARG_INET_PP() macro, and didn't notice because it wasn't used. This fixes the memory leak when sorting inet values, as reported by Jochen Erwied and debugged by Andres Freund. Backpatch to 8.3, like the previous patch that broke it.	2011-12-12 10:05:29 +02:00
Peter Eisentraut	3208c64591	Translation updates	2011-12-01 22:47:51 +02:00
Tom Lane	8a10fed415	Fix getTypeIOParam to support type record[]. Since record[] uses array_in, it needs to have its element type passed as typioparam. In HEAD and 9.1, this fix essentially reverts commit `9bc933b212`, which was a hack that is no longer needed since domains don't set their typelem anymore. Before that, adjust the logic so that only domains are excluded from being treated like arrays, rather than assuming that only base types should be included. Add a regression test to demonstrate the need for this. Per report from Maxim Boguk. Back-patch to 8.4, where type record[] was added.	2011-12-01 12:44:34 -05:00
Tom Lane	646a93fe76	Tweak previous patch to ensure edata->filename always gets initialized. On a platform that isn't supplying __FILE__, previous coding would either crash or give a stale result for the filename string. Not sure how likely that is, but the original code catered for it, so let's keep doing so.	2011-11-30 00:37:27 -05:00
Peter Eisentraut	fe0fa1f9b4	Strip file names reported in error messages in vpath builds In vpath builds, the __FILE__ macro that is used in verbose error reports contains the full absolute file name, which makes the error messages excessively verbose. So keep only the base name, thus matching the behavior of non-vpath builds.	2011-11-29 23:58:03 +02:00
Tom Lane	9b60e40cad	Fix erroneous replay of GIN_UPDATE_META_PAGE WAL records. A simple thinko in ginRedoUpdateMetapage, namely failing to increment a loop counter, led to inserting records into the last pending-list page in the wrong order (the opposite of that intended). So far as I can tell, this would not upset the code that eventually flushes pending items into the main part of the GIN index. But it did break the code that searched the pending list for matches, resulting in transient failure to find matching entries during index lookups, as illustrated in bug #6307 from Maksym Boguk. Back-patch to 8.4 where the incorrect code was introduced.	2011-11-25 13:59:34 -05:00
Tom Lane	af7b787aad	Avoid floating-point underflow while tracking buffer allocation rate. When the system is idle for awhile after activity, the "smoothed_alloc" state variable in BgBufferSync converges slowly to zero. With standard IEEE float arithmetic this results in several iterations with denormalized values, which causes kernel traps and annoying log messages on some poorly-designed platforms. There's no real need to track such small values of smoothed_alloc, so we can prevent the kernel traps by forcing it to zero as soon as it's too small to be interesting for our purposes. This issue is purely cosmetic, since the iterations don't happen fast enough for the kernel traps to pose any meaningful performance problem, but still it seems worth shutting up the log messages. The kernel log messages were previously reported by a number of people, but kudos to Greg Matthews for tracking down exactly where they were coming from.	2011-11-19 00:36:36 -05:00
Heikki Linnakangas	61f4750bf8	Make DatumGetInetP() unpack inet datums with a 1-byte header, and add a new macro, DatumGetInetPP(), that does not. This brings these macros in line with other DatumGet*P() macros. Backpatch to 8.3, where 1-byte header varlenas were introduced.	2011-11-08 22:45:13 +02:00
Tom Lane	23998fe99c	Don't assume that a tuple's header size is unchanged during toasting. This assumption can be wrong when the toaster is passed a raw on-disk tuple, because the tuple might pre-date an ALTER TABLE ADD COLUMN operation that added columns without rewriting the table. In such a case the tuple's natts value is smaller than what we expect from the tuple descriptor, and so its t_hoff value could be smaller too. In fact, the tuple might not have a null bitmap at all, and yet our current opinion of it is that it contains some trailing nulls. In such a situation, toast_insert_or_update did the wrong thing, because to save a few lines of code it would use the old t_hoff value as the offset where heap_fill_tuple should start filling data. This did not leave enough room for the new nulls bitmap, with the result that the first few bytes of data could be overwritten with null flag bits, as in a recent report from Hubert Depesz Lubaczewski. The particular case reported requires ALTER TABLE ADD COLUMN followed by CREATE TABLE AS SELECT * FROM ... or INSERT ... SELECT * FROM ..., and further requires that there be some out-of-line toasted fields in one of the tuples to be copied; else we'll not reach the troublesome code. The problem can only manifest in this form in 8.4 and later, because before commit `a77eaa6a95`, CREATE TABLE AS or INSERT/SELECT wouldn't result in raw disk tuples getting passed directly to heap_insert --- there would always have been at least a junkfilter in between, and that would reconstitute the tuple header with an up-to-date t_natts and hence t_hoff. But I'm backpatching the tuptoaster change all the way anyway, because I'm not convinced there are no older code paths that present a similar risk.	2011-11-04 23:23:24 -04:00
Tom Lane	9fc28fb5ea	Fix inline_set_returning_function() to allow multiple OUT parameters. inline_set_returning_function failed to distinguish functions returning generic RECORD (which require a column list in the RTE, as well as run-time type checking) from those with multiple OUT parameters (which do not). This prevented inlining from happening. Per complaint from Jay Levitt. Back-patch to 8.4 where this capability was introduced.	2011-11-03 17:53:33 -04:00
Tom Lane	9f19cc28a4	Revert "Stop btree indexscans upon reaching nulls in either direction." This reverts commit `d23165bbc3`. As pointed out by Naoya Anzai, we need to do more work to make that idea handle end-of-index cases, and it is looking like too much risk for a back-patch. So bug #6278 is only going to be fixed in HEAD.	2011-11-02 13:37:24 -04:00
Tom Lane	b05ce7550c	Fix race condition with toast table access from a stale syscache entry. If a tuple in a syscache contains an out-of-line toasted field, and we try to fetch that field shortly after some other transaction has committed an update or deletion of the tuple, there is a race condition: vacuum could come along and remove the toast tuples before we can fetch them. This leads to transient failures like "missing chunk number 0 for toast value NNNNN in pg_toast_2619", as seen in recent reports from Andrew Hammond and Tim Uckun. The design idea of syscache is that access to stale syscache entries should be prevented by relation-level locks, but that fails for at least two cases where toasted fields are possible: ANALYZE updates pg_statistic rows without locking out sessions that might want to plan queries on the same table, and CREATE OR REPLACE FUNCTION updates pg_proc rows without any meaningful lock at all. The least risky fix seems to be an idea that Heikki suggested when we were dealing with a related problem back in August: forcibly detoast any out-of-line fields before putting a tuple into syscache in the first place. This avoids the problem because at the time we fetch the parent tuple from the catalog, we should be holding an MVCC snapshot that will prevent removal of the toast tuples, even if the parent tuple is outdated immediately after we fetch it. (Note: I'm not convinced that this statement holds true at every instant where we could be fetching a syscache entry at all, but it does appear to hold true at the times where we could fetch an entry that could have a toasted field. We will need to be a bit wary of adding toast tables to low-level catalogs that don't have them already.) An additional benefit is that subsequent uses of the syscache entry should be faster, since they won't have to detoast the field. Back-patch to all supported versions. The problem is significantly harder to reproduce in pre-9.0 releases, because of their willingness to flush every entry in a syscache whenever the underlying catalog is vacuumed (cf CatalogCacheFlushRelation); but there is still a window for trouble.	2011-11-01 19:48:56 -04:00
Tom Lane	d23165bbc3	Stop btree indexscans upon reaching nulls in either direction. The existing scan-direction-sensitive tests were overly complex, and failed to stop the scan in cases where it's perfectly legitimate to do so. Per bug #6278 from Maksym Boguk. Back-patch to 8.3, which is as far back as the patch applies easily. Doesn't seem worth sweating over a relatively minor performance issue in 8.2 at this late date. (But note that this was a performance regression from 8.1 and before, so 8.2 is being left as an outlier.)	2011-10-31 16:40:22 -04:00
Tom Lane	04ec05eb06	Fix assorted bogosities in cash_in() and cash_out(). cash_out failed to handle multiple-byte thousands separators, as per bug #6277 from Alexander Law. In addition, cash_in didn't handle that either, nor could it handle multiple-byte positive_sign. Both routines failed to support multiple-byte mon_decimal_point, which I did not think was worth changing, but at least now they check for the possibility and fall back to using '.' rather than emitting invalid output. Also, make cash_in handle trailing negative signs, which formerly it would reject. Since cash_out generates trailing negative signs whenever the locale tells it to, this last omission represents a fail-to-reload-dumped-data bug. IMO that justifies patching this all the way back.	2011-10-29 14:31:07 -04:00

1 2 3 4 5 ...

10808 commits