postgresql

mirror of https://github.com/postgres/postgres.git synced 2026-02-25 02:44:39 -05:00

Author	SHA1	Message	Date
Tom Lane	1afc1fe9c7	Back-patch some minor bug fixes in GUC code. In 9.4, fix a 9.4.1 regression that allowed multiple entries for a PGC_POSTMASTER variable to cause bogus complaints in the postmaster log. (The issue here was that commit `bf007a27ac` unintentionally reverted `3e3f65973a`, which suppressed any duplicate entries within ParseConfigFp. Back-patch the reimplementation just made in HEAD, which makes use of an "ignore" field to prevent application of superseded items.) Add missed failure check in AlterSystemSetConfigFile(). We don't really expect ParseConfigFp() to fail, but that's not an excuse for not checking. In both 9.3 and 9.4, remove mistaken assignment to ConfigFileLineno that caused line counting after an include_dir directive to be completely wrong.	2015-06-28 18:38:06 -04:00
Kevin Grittner	524e1e4031	Add opaque declaration of HTAB to tqual.h. Commit `b89e151054` added the ResolveCminCmaxDuringDecoding declaration to tqual.h, which uses an HTAB parameter, without declaring HTAB. It accidentally fails to fail to build with current sources because a declaration happens to be included, directly or indirectly, in all source files that currently use tqual.h before tqual.h is first included, but we shouldn't count on that. Since an opaque declaration is enough here, just use that, as was done in snapmgr.h. Backpatch to 9.4, where the HTAB reference was added to tqual.h.	2015-06-27 09:55:08 -05:00
Tom Lane	e118555cf9	Fix the logic for putting relations into the relcache init file. Commit `f3b5565dd4` was a couple of bricks shy of a load; specifically, it missed putting pg_trigger_tgrelid_tgname_index into the relcache init file, because that index is not used by any syscache. However, we have historically nailed that index into cache for performance reasons. The upshot was that load_relcache_init_file always decided that the init file was busted and silently ignored it, resulting in a significant hit to backend startup speed. To fix, reinstantiate RelationIdIsInInitFile() as a wrapper around RelationSupportsSysCache(), which can know about additional relations that should be in the init file despite being unknown to syscache.c. Also install some guards against future mistakes of this type: make write_relcache_init_file Assert that all nailed relations get written to the init file, and make load_relcache_init_file emit a WARNING if it takes the "wrong number of nailed relations" exit path. Now that we remove the init files during postmaster startup, that case should never occur in the field, even if we are starting a minor-version update that added or removed rels from the nailed set. So the warning shouldn't ever be seen by end users, but it will show up in the regression tests if somebody breaks this logic. Back-patch to all supported branches, like the previous commit.	2015-06-25 14:39:05 -04:00
Kevin Grittner	e9eebbaebf	Fix typo in comment. Backpatch to 9.4 to minimize possible conflicts.	2015-06-10 17:04:06 -05:00
Tom Lane	7c055f3ec3	Stamp 9.4.4.	2015-06-09 15:29:38 -04:00
Tom Lane	be25a08a91	Use a safer method for determining whether relcache init file is stale. When we invalidate the relcache entry for a system catalog or index, we must also delete the relcache "init file" if the init file contains a copy of that rel's entry. The old way of doing this relied on a specially maintained list of the OIDs of relations present in the init file: we made the list either when reading the file in, or when writing the file out. The problem is that when writing the file out, we included only rels present in our local relcache, which might have already suffered some deletions due to relcache inval events. In such cases we correctly decided not to overwrite the real init file with incomplete data --- but we still used the incomplete initFileRelationIds list for the rest of the current session. This could result in wrong decisions about whether the session's own actions require deletion of the init file, potentially allowing an init file created by some other concurrent session to be left around even though it's been made stale. Since we don't support changing the schema of a system catalog at runtime, the only likely scenario in which this would cause a problem in the field involves a "vacuum full" on a catalog concurrently with other activity, and even then it's far from easy to provoke. Remarkably, this has been broken since 2002 (in commit `7863404417`), but we had never seen a reproducible test case until recently. If it did happen in the field, the symptoms would probably involve unexpected "cache lookup failed" errors to begin with, then "could not open file" failures after the next checkpoint, as all accesses to the affected catalog stopped working. Recovery would require manually removing the stale "pg_internal.init" file. To fix, get rid of the initFileRelationIds list, and instead consult syscache.c's list of relations used in catalog caches to decide whether a relation is included in the init file. This should be a tad more efficient anyway, since we're replacing linear search of a list with ~100 entries with a binary search. It's a bit ugly that the init file contents are now so directly tied to the catalog caches, but in practice that won't make much difference. Back-patch to all supported branches.	2015-06-07 15:32:09 -04:00
Tom Lane	f0a8515c44	Fix planner's cost estimation for SEMI/ANTI joins with inner indexscans. When the inner side of a nestloop SEMI or ANTI join is an indexscan that uses all the join clauses as indexquals, it can be presumed that both matched and unmatched outer rows will be processed very quickly: for matched rows, we'll stop after fetching one row from the indexscan, while for unmatched rows we'll have an indexscan that finds no matching index entries, which should also be quick. The planner already knew about this, but it was nonetheless charging for at least one full run of the inner indexscan, as a consequence of concerns about the behavior of materialized inner scans --- but those concerns don't apply in the fast case. If the inner side has low cardinality (many matching rows) this could make an indexscan plan look far more expensive than it actually is. To fix, rearrange the work in initial_cost_nestloop/final_cost_nestloop so that we don't add the inner scan cost until we've inspected the indexquals, and then we can add either the full-run cost or just the first tuple's cost as appropriate. Experimentation with this fix uncovered another problem: add_path and friends were coded to disregard cheap startup cost when considering parameterized paths. That's usually okay (and desirable, because it thins the path herd faster); but in this fast case for SEMI/ANTI joins, it could result in throwing away the desired plain indexscan path in favor of a bitmap scan path before we ever get to the join costing logic. In the many-matching-rows cases of interest here, a bitmap scan will do a lot more work than required, so this is a problem. To fix, add a per-relation flag consider_param_startup that works like the existing consider_startup flag, but applies to parameterized paths, and set it for relations that are the inside of a SEMI or ANTI join. To make this patch reasonably safe to back-patch, care has been taken to avoid changing the planner's behavior except in the very narrow case of SEMI/ANTI joins with inner indexscans. There are places in compare_path_costs_fuzzily and add_path_precheck that are not terribly consistent with the new approach, but changing them will affect planner decisions at the margins in other cases, so we'll leave that for a HEAD-only fix. Back-patch to 9.3; before that, the consider_startup flag didn't exist, meaning that the second aspect of the patch would be too invasive. Per a complaint from Peter Holzer and analysis by Tomas Vondra.	2015-06-03 11:58:47 -04:00
Tom Lane	de17fe43fa	Stamp 9.4.3.	2015-06-01 15:05:57 -04:00
Tom Lane	a3ae3db438	Fix fsync-at-startup code to not treat errors as fatal. Commit `2ce439f337` introduced a rather serious regression, namely that if its scan of the data directory came across any un-fsync-able files, it would fail and thereby prevent database startup. Worse yet, symlinks to such files also caused the problem, which meant that crash restart was guaranteed to fail on certain common installations such as older Debian. After discussion, we agreed that (1) failure to start is worse than any consequence of not fsync'ing is likely to be, therefore treat all errors in this code as nonfatal; (2) we should not chase symlinks other than those that are expected to exist, namely pg_xlog/ and tablespace links under pg_tblspc/. The latter restriction avoids possibly fsync'ing a much larger part of the filesystem than intended, if the user has left random symlinks hanging about in the data directory. This commit takes care of that and also does some code beautification, mainly moving the relevant code into fd.c, which seems a much better place for it than xlog.c, and making sure that the conditional compilation for the pre_sync_fname pass has something to do with whether pg_flush_data works. I also relocated the call site in xlog.c down a few lines; it seems a bit silly to be doing this before ValidateXLOGDirectoryStructure(). The similar logic in initdb.c ought to be made to match this, but that change is noncritical and will be dealt with separately. Back-patch to all active branches, like the prior commit. Abhijit Menon-Sen and Tom Lane	2015-05-28 17:33:03 -04:00
Andrew Dunstan	9b74f32cdb	Unpack jbvBinary objects passed to pushJsonbValue pushJsonbValue was accepting jbvBinary objects passed as WJB_ELEM or WJB_VALUE data. While this succeeded, when those objects were later encountered in attempting to convert the result to Jsonb, errors occurred. With this change we ghuarantee that a JSonbValue constructed from calls to pushJsonbValue does not contain any jbvBinary objects. This cures a problem observed with jsonb_delete. This means callers of pushJsonbValue no longer need to perform this unpacking themselves. A subsequent patch will perform some cleanup in that area. The error was not triggered by any 9.4 code, but this is a publicly visible routine, and so the error could be exercised by third party code, therefore backpatch to 9.4. Bug report from Peter Geoghegan, fix by me.	2015-05-22 10:31:29 -04:00
Tom Lane	2eb2fcd56b	Revert error-throwing wrappers for the printf family of functions. This reverts commit `16304a0134`, except for its changes in src/port/snprintf.c; as well as commit `cac18a76bb` which is no longer needed. Fujii Masao reported that the previous commit caused failures in psql on OS X, since if one exits the pager program early while viewing a query result, psql sees an EPIPE error from fprintf --- and the wrapper function thought that was reason to panic. (It's a bit surprising that the same does not happen on Linux.) Further discussion among the security list concluded that the risk of other such failures was far too great, and that the one-size-fits-all approach to error handling embodied in the previous patch is unlikely to be workable. This leaves us again exposed to the possibility of the type of failure envisioned in CVE-2015-3166. However, that failure mode is strictly hypothetical at this point: there is no concrete reason to believe that an attacker could trigger information disclosure through the supposed mechanism. In the first place, the attack surface is fairly limited, since so much of what the backend does with format strings goes through stringinfo.c or psprintf(), and those already had adequate defenses. In the second place, even granting that an unprivileged attacker could control the occurrence of ENOMEM with some precision, it's a stretch to believe that he could induce it just where the target buffer contains some valuable information. So we concluded that the risk of non-hypothetical problems induced by the patch greatly outweighs the security risks. We will therefore revert, and instead undertake closer analysis to identify specific calls that may need hardening, rather than attempt a universal solution. We have kept the portion of the previous patch that improved snprintf.c's handling of errors when it calls the platform's sprintf(). That seems to be an unalloyed improvement. Security: CVE-2015-3166	2015-05-19 18:16:19 -04:00
Tom Lane	7aeba23ee2	Stamp 9.4.2.	2015-05-18 14:29:04 -04:00
Noah Misch	2e3bd06654	Add error-throwing wrappers for the printf family of functions. All known standard library implementations of these functions can fail with ENOMEM. A caller neglecting to check for failure would experience missing output, information exposure, or a crash. Check return values within wrappers and code, currently just snprintf.c, that bypasses the wrappers. The wrappers do not return after an error, so their callers need not check. Back-patch to 9.0 (all supported versions). Popular free software standard library implementations do take pains to bypass malloc() in simple cases, but they risk ENOMEM for floating point numbers, positional arguments, large field widths, and large precisions. No specification demands such caution, so this commit regards every call to a printf family function as a potential threat. Injecting the wrappers implicitly is a compromise between patch scope and design goals. I would prefer to edit each call site to name a wrapper explicitly. libpq and the ECPG libraries would, ideally, convey errors to the caller rather than abort(). All that would be painfully invasive for a back-patched security fix, hence this compromise. Security: CVE-2015-3166	2015-05-18 10:02:35 -04:00
Noah Misch	f7c4fe7d95	Permit use of vsprintf() in PostgreSQL code. The next commit needs it. Back-patch to 9.0 (all supported versions).	2015-05-18 10:02:35 -04:00
Robert Haas	3ecab37d97	Teach autovacuum about multixact member wraparound. The logic introduced in commit `b69bf30b9b` and repaired in commits `669c7d20e6` and `7be47c56af` helps to ensure that we don't overwrite old multixact member information while it is still needed, but a user who creates many large multixacts can still exhaust the member space (and thus start getting errors) while autovacuum stands idly by. To fix this, progressively ramp down the effective value (but not the actual contents) of autovacuum_multixact_freeze_max_age as member space utilization increases. This makes autovacuum more aggressive and also reduces the threshold for a manual VACUUM to perform a full-table scan. This patch leaves unsolved the problem of ensuring that emergency autovacuums are triggered even when autovacuum=off. We'll need to fix that via a separate patch. Thomas Munro and Robert Haas	2015-05-08 12:53:30 -04:00
Robert Haas	d8ac77ab17	Recursively fsync() the data directory after a crash. Otherwise, if there's another crash, some writes from after the first crash might make it to disk while writes from before the crash fail to make it to disk. This could lead to data corruption. Back-patch to all supported versions. Abhijit Menon-Sen, reviewed by Andres Freund and slightly revised by me.	2015-05-04 14:19:32 -04:00
Heikki Linnakangas	d72792d021	Don't archive bogus recycled or preallocated files after timeline switch. After a timeline switch, we would leave behind recycled WAL segments that are in the future, but on the old timeline. After promotion, and after they become old enough to be recycled again, we would notice that they don't have a .ready or .done file, create a .ready file for them, and archive them. That's bogus, because the files contain garbage, recycled from an older timeline (or prealloced as zeros). We shouldn't archive such files. This could happen when we're following a timeline switch during replay, or when we switch to new timeline at end-of-recovery. To fix, whenever we switch to a new timeline, scan the data directory for WAL segments on the old timeline, but with a higher segment number, and remove them. Those don't belong to our timeline history, and are most likely bogus recycled or preallocated files. They could also be valid files that we streamed from the primary ahead of time, but in any case, they're not needed to recover to the new timeline.	2015-04-13 17:22:21 +03:00
Tatsuo Ishii	3740bb8347	Make SyncRepWakeQueue to a static function It is only used in src/backend/replication/syncrep.c. Back-patch to all supported branches except 9.1 which declares the function as static.	2015-03-26 10:38:11 +09:00
Tom Lane	c415c13b7e	Build src/port/dirmod.c only on Windows. Since commit `ba7c5975ad`, port/dirmod.c has contained only Windows-specific functions. Most platforms don't seem to mind uselessly building an empty file, but OS X for one issues warnings. Hence, treat dirmod.c as a Windows-specific file selected by configure rather than one that's always built. We can revert this change if dirmod.c ever gains any non-Windows functionality again. Back-patch to 9.4 where the mentioned commit appeared.	2015-03-14 14:08:45 -04:00
Tom Lane	a271c9260f	Remove code to match IPv4 pg_hba.conf entries to IPv4-in-IPv6 addresses. In investigating yesterday's crash report from Hugo Osvaldo Barrera, I only looked back as far as commit `f3aec2c7f5` where the breakage occurred (which is why I thought the IPv4-in-IPv6 business was undocumented). But actually the logic dates back to commit `3c9bb8886d` and was simply broken by erroneous refactoring in the later commit. A bit of archives excavation shows that we added the whole business in response to a report that some 2003-era Linux kernels would report IPv4 connections as having IPv4-in-IPv6 addresses. The fact that we've had no complaints since 9.0 seems to be sufficient confirmation that no modern kernels do that, so let's just rip it all out rather than trying to fix it. Do this in the back branches too, thus essentially deciding that our effective behavior since 9.0 is correct. If there are any platforms on which the kernel reports IPv4-in-IPv6 addresses as such, yesterday's fix would have made for a subtle and potentially security-sensitive change in the effective meaning of IPv4 pg_hba.conf entries, which does not seem like a good thing to do in minor releases. So let's let the post-9.0 behavior stand, and change the documentation to match it. In passing, I failed to resist the temptation to wordsmith the description of pg_hba.conf IPv4 and IPv6 address entries a bit. A lot of this text hasn't been touched since we were IPv4-only.	2015-02-17 12:49:18 -05:00
Heikki Linnakangas	56a23a83fd	Fix broken #ifdef for __sparcv8 Rob Rowan. Backpatch to all supported versions, like the patch that added the broken #ifdef.	2015-02-13 23:56:57 +02:00
Tom Lane	ccf655c446	Minor cleanup/code review for "indirect toast" stuff. Fix some issues I noticed while fooling with an extension to allow an additional kind of toast pointer. Much of this is just comment improvement, but there are a couple of actual bugs, which might or might not be reachable today depending on what can happen during logical decoding. An example is that toast_flatten_tuple() failed to cover the possibility of an indirection pointer in its input. Back-patch to 9.4 just in case that is reachable now. In HEAD, also correct some really minor issues with recent compression reorganization, such as dangerously underparenthesized macros.	2015-02-09 12:30:55 -05:00
Tom Lane	d0f83327d3	Stamp 9.4.1.	2015-02-02 15:42:55 -05:00
Heikki Linnakangas	57ec87c6b8	Be more careful to not lose sync in the FE/BE protocol. If any error occurred while we were in the middle of reading a protocol message from the client, we could lose sync, and incorrectly try to interpret a part of another message as a new protocol message. That will usually lead to an "invalid frontend message" error that terminates the connection. However, this is a security issue because an attacker might be able to deliberately cause an error, inject a Query message in what's supposed to be just user data, and have the server execute it. We were quite careful to not have CHECK_FOR_INTERRUPTS() calls or other operations that could ereport(ERROR) in the middle of processing a message, but a query cancel interrupt or statement timeout could nevertheless cause it to happen. Also, the V2 fastpath and COPY handling were not so careful. It's very difficult to recover in the V2 COPY protocol, so we will just terminate the connection on error. In practice, that's what happened previously anyway, as we lost protocol sync. To fix, add a new variable in pqcomm.c, PqCommReadingMsg, that is set whenever we're in the middle of reading a message. When it's set, we cannot safely ERROR out and continue running, because we might've read only part of a message. PqCommReadingMsg acts somewhat similarly to critical sections in that if an error occurs while it's set, the error handler will force the connection to be terminated, as if the error was FATAL. It's not implemented by promoting ERROR to FATAL in elog.c, like ERROR is promoted to PANIC in critical sections, because we want to be able to use PG_TRY/CATCH to recover and regain protocol sync. pq_getmessage() takes advantage of that to prevent an OOM error from terminating the connection. To prevent unnecessary connection terminations, add a holdoff mechanism similar to HOLD/RESUME_INTERRUPTS() that can be used hold off query cancel interrupts, but still allow die interrupts. The rules on which interrupts are processed when are now a bit more complicated, so refactor ProcessInterrupts() and the calls to it in signal handlers so that the signal handlers always call it if ImmediateInterruptOK is set, and ProcessInterrupts() can decide to not do anything if the other conditions are not met. Reported by Emil Lenngren. Patch reviewed by Noah Misch and Andres Freund. Backpatch to all supported versions. Security: CVE-2015-0244	2015-02-02 17:09:46 +02:00
Tom Lane	b6a164e5cb	Fix assorted oversights in range selectivity estimation. calc_rangesel() failed outright when comparing range variables to empty constant ranges with < or >=, as a result of missing cases in a switch. It also produced a bogus estimate for > comparison to an empty range. On top of that, the >= and > cases were mislabeled throughout. For nonempty constant ranges, they managed to produce the right answers anyway as a result of counterbalancing typos. Also, default_range_selectivity() omitted cases for elem <@ range, range &< range, and range &> range, so that rather dubious defaults were applied for these operators. In passing, rearrange the code in rangesel() so that the elem <@ range case is handled in a less opaque fashion. Report and patch by Emre Hasegeli, some additional work by me	2015-01-30 12:31:08 -05:00
Heikki Linnakangas	dc40ca6963	Fix query-duration memory leak with GIN rescans. The requiredEntries / additionalEntries arrays were not freed in freeScanKeys() like other per-key stuff. It's not obvious, but startScanKey() was only ever called after the keys have been initialized with ginNewScanKey(). That's why it doesn't need to worry about freeing existing arrays. The ginIsNewKey() test in gingetbitmap was never true, because ginrescan free's the existing keys, and it's not OK to call gingetbitmap twice in a row without calling ginrescan in between. To make that clear, remove the unnecessary ginIsNewKey(). And just to be extra sure that nothing funny happens if there is an existing key after all, call freeScanKeys() to free it if it exists. This makes the code more straightforward. (I'm seeing other similar leaks in testing a query that rescans an GIN index scan, but that's a different issue. This just fixes the obvious leak with those two arrays.) Backpatch to 9.4, where GIN fast scan was added.	2015-01-30 17:59:17 +01:00
Tom Lane	d25192892d	Improve performance of EXPLAIN with large range tables. As of 9.3, ruleutils.c goes to some lengths to ensure that table and column aliases used in its output are unique. Of course this takes more time than was required before, which in itself isn't fatal. However, EXPLAIN was set up so that recalculation of the unique aliases was repeated for each subexpression printed in a plan. That results in O(N^2) time and memory consumption for large plan trees, which did not happen in older branches. Fortunately, the expensive work is the same across a whole plan tree, so there is no need to repeat it; we can do most of the initialization just once per query and re-use it for each subexpression. This buys back most (not all) of the performance loss since 9.2. We need an extra ExplainState field to hold the precalculated deparse context. That's no problem in HEAD, but in the back branches, expanding sizeof(ExplainState) seems risky because third-party extensions might have local variables of that struct type. So, in 9.4 and 9.3, introduce an auxiliary struct to keep sizeof(ExplainState) the same. We should refactor the APIs to avoid such local variables in future, but that's material for a separate HEAD-only commit. Per gripe from Alexey Bashtanov. Back-patch to 9.3 where the issue was introduced.	2015-01-15 13:18:16 -05:00
Tom Lane	733728ff37	Fix libpq's behavior when /etc/passwd isn't readable. Some users run their applications in chroot environments that lack an /etc/passwd file. This means that the current UID's user name and home directory are not obtainable. libpq used to be all right with that, so long as the database role name to use was specified explicitly. But commit `a4c8f14364` broke such cases by causing any failure of pg_fe_getauthname() to be treated as a hard error. In any case it did little to advance its nominal goal of causing errors in pg_fe_getauthname() to be reported better. So revert that and instead put some real error-reporting code in place. This requires changes to the APIs of pg_fe_getauthname() and pqGetpwuid(), since the latter had departed from the POSIX-specified API of getpwuid_r() in a way that made it impossible to distinguish actual lookup errors from "no such user". To allow such failures to be reported, while not failing if the caller supplies a role name, add a second call of pg_fe_getauthname() in connectOptions2(). This is a tad ugly, and could perhaps be avoided with some refactoring of PQsetdbLogin(), but I'll leave that idea for later. (Note that the complained-of misbehavior only occurs in PQsetdbLogin, not when using the PQconnect functions, because in the latter we will never bother to call pg_fe_getauthname() if the user gives a role name.) In passing also clean up the Windows-side usage of GetUserName(): the recommended buffer size is 257 bytes, the passed buffer length should be the buffer size not buffer size less 1, and any error is reported by GetLastError() not errno. Per report from Christoph Berg. Back-patch to 9.4 where the chroot failure case was introduced. The generally poor reporting of errors here is of very long standing, of course, but given the lack of field complaints about it we won't risk changing these APIs further back (even though they're theoretically internal to libpq).	2015-01-11 12:35:47 -05:00
Noah Misch	83fb1ca5cf	On Darwin, detect and report a multithreaded postmaster. Darwin --enable-nls builds use a substitute setlocale() that may start a thread. Buildfarm member orangutan experienced BackendList corruption on account of different postmaster threads executing signal handlers simultaneously. Furthermore, a multithreaded postmaster risks undefined behavior from sigprocmask() and fork(). Emit LOG messages about the problem and its workaround. Back-patch to 9.0 (all supported versions).	2015-01-07 22:36:35 -05:00
Andres Freund	70e36adb0d	Add pg_string_endswith as the start of a string helper library in src/common. Backpatch to 9.3 where src/common was introduce, because a bugfix that needs to be backpatched, requires the function. Earlier branches will have to duplicate the code.	2015-01-03 20:54:13 +01:00
Heikki Linnakangas	4e241f7cdf	Revert the GinMaxItemSize calculation so that we fit 3 tuples per page. Commit `36a35c55` changed the divisor from 3 to 6, for no apparent reason. Reducing GinMaxItemSize like that created a dump/reload hazard: loading a 9.3 database to 9.4 might fail with "index row size XXX exceeds maximum 1352 for index ..." error. Revert the change. While we're at it, make the calculation slightly more accurate. It used to divide the available space on page by three, then subtract sizeof(ItemIdData), and finally round down. That's not totally accurate; the item pointers for the three items are packed tight right after the page header, but there is alignment padding after the item pointers. Change the calculation to reflect that, like BTMaxItemSize does. I tested this with different block sizes on systems with 4- and 8-byte alignment, and the value after the final MAXALIGN_DOWN was the same with both methods on all configurations. So this does not make any difference currently, but let's be tidy. Also add a comment explaining what the macro does. This fixes bug #12292 reported by Robert Thaler. Backpatch to 9.4, where the bug was introduced.	2014-12-30 14:53:03 +02:00
Tom Lane	383d224a02	Fix off-by-one loop count in MapArrayTypeName, and get rid of static array. MapArrayTypeName would copy up to NAMEDATALEN-1 bytes of the base type name, which of course is wrong: after prepending '_' there is only room for NAMEDATALEN-2 bytes. Aside from being the wrong result, this case would lead to overrunning the statically allocated work buffer. This would be a security bug if the function were ever used outside bootstrap mode, but it isn't, at least not in any currently supported branches. Aside from fixing the off-by-one loop logic, this patch gets rid of the static work buffer by having MapArrayTypeName pstrdup its result; the sole caller was already doing that, so this just requires moving the pstrdup call. This saves a few bytes but mainly it makes the API a lot cleaner. Back-patch on the off chance that there is some third-party code using MapArrayTypeName with less-secure input. Pushing pstrdup into the function should not cause any serious problems for such hypothetical code; at worst there might be a short term memory leak. Per Coverity scanning.	2014-12-16 15:35:36 -05:00
Tom Lane	8ca336f4ac	Stamp 9.4.0.	2014-12-15 20:07:34 -05:00
Peter Eisentraut	3e2dc9703a	Move PG_AUTOCONF_FILENAME definition Since this is not something that a user should change, pg_config_manual.h was an inappropriate place for it. In initdb.c, remove the use of the macro, because utils/guc.h can't be included by non-backend code. But we hardcode all the other configuration file names there, so this isn't a disaster.	2014-12-03 19:58:13 -05:00
Andres Freund	10f1f93d8a	Don't skip SQL backends in logical decoding for visibility computation. The logical decoding patchset introduced PROC_IN_LOGICAL_DECODING flag PGXACT flag, that allows such backends to be skipped when computing the xmin horizon/snapshots. That's fine and sensible for walsenders streaming out logical changes, but not at all fine for SQL backends doing logical decoding. If the latter set that flag any change they have performed outside of logical decoding will not be regarded as visible - which e.g. can lead to that change being vacuumed away. Note that not setting the flag for SQL backends isn't particularly bothersome - the SQL backend doesn't do streaming, so it only runs for a limited amount of time. Per buildfarm member 'tick' and Alvaro. Backpatch to 9.4, where logical decoding was introduced.	2014-12-02 23:52:44 +01:00
Andrew Dunstan	f82132f4a1	Fix hstore_to_json_loose's detection of valid JSON number values. We expose a function IsValidJsonNumber that internally calls the lexer for json numbers. That allows us to use the same test everywhere, instead of inventing a broken test for hstore conversions. The new function is also used in datum_to_json, replacing the code that is now moved to the new function. Backpatch to 9.3 where hstore_to_json_loose was introduced.	2014-12-01 11:40:30 -05:00
Tom Lane	d5bea1fbcc	Stamp 9.4rc1.	2014-11-17 15:54:40 -05:00
Heikki Linnakangas	8fc23a9ed0	Fix race condition between hot standby and restoring a full-page image. There was a window in RestoreBackupBlock where a page would be zeroed out, but not yet locked. If a backend pinned and locked the page in that window, it saw the zeroed page instead of the old page or new page contents, which could lead to missing rows in a result set, or errors. To fix, replace RBM_ZERO with RBM_ZERO_AND_LOCK, which atomically pins, zeroes, and locks the page, if it's not in the buffer cache already. In stable branches, the old RBM_ZERO constant is renamed to RBM_DO_NOT_USE, to avoid breaking any 3rd party extensions that might use RBM_ZERO. More importantly, this avoids renumbering the other enum values, which would cause even bigger confusion in extensions that use ReadBufferExtended, but haven't been recompiled. Backpatch to all supported versions; this has been racy since hot standby was introduced.	2014-11-13 20:02:09 +02:00
Tom Lane	40b85a3ab1	Explicitly support the case that a plancache's raw_parse_tree is NULL. This only happens if a client issues a Parse message with an empty query string, which is a bit odd; but since it is explicitly called out as legal by our FE/BE protocol spec, we'd probably better continue to allow it. Fix by adding tests everywhere that the raw_parse_tree field is passed to functions that don't or shouldn't accept NULL. Also make it clear in the relevant comments that NULL is an expected case. This reverts commits `a73c9dbab0` and `2e9650cbcf`, which fixed specific crash symptoms by hacking things at what now seems to be the wrong end, ie the callee functions. Making the callees allow NULL is superficially more robust, but it's not always true that there is a defensible thing for the callee to do in such cases. The caller has more context and is better able to decide what the empty-query case ought to do. Per followup discussion of bug #11335. Back-patch to 9.2. The code before that is sufficiently different that it would require development of a separate patch, which doesn't seem worthwhile for what is believed to be an essentially cosmetic change.	2014-11-12 15:59:06 -05:00
Tom Lane	f449873623	Ensure that RowExprs and whole-row Vars produce the expected column names. At one time it wasn't terribly important what column names were associated with the fields of a composite Datum, but since the introduction of operations like row_to_json(), it's important that looking up the rowtype ID embedded in the Datum returns the column names that users would expect. That did not work terribly well before this patch: you could get the column names of the underlying table, or column aliases from any level of the query, depending on minor details of the plan tree. You could even get totally empty field names, which is disastrous for cases like row_to_json(). To fix this for whole-row Vars, look to the RTE referenced by the Var, and make sure its column aliases are applied to the rowtype associated with the result Datums. This is a tad scary because we might have to return a transient RECORD type even though the Var is declared as having some named rowtype. In principle it should be all right because the record type will still be physically compatible with the named rowtype; but I had to weaken one Assert in ExecEvalConvertRowtype, and there might be third-party code containing similar assumptions. Similarly, RowExprs have to be willing to override the column names coming from a named composite result type and produce a RECORD when the column aliases visible at the site of the RowExpr differ from the underlying table's column names. In passing, revert the decision made in commit `398f70ec07` to add an alias-list argument to ExecTypeFromExprList: better to provide that functionality in a separate function. This also reverts most of the code changes in `d685814835`, which we don't need because we're no longer depending on the tupdesc found in the child plan node's result slot to be blessed. Back-patch to 9.4, but not earlier, since this solution changes the results in some cases that users might not have realized were buggy. We'll apply a more restricted form of this patch in older branches.	2014-11-10 15:21:14 -05:00
Tom Lane	7f4ece03d6	Test IsInTransactionChain, not IsTransactionBlock, in vac_update_relstats. As noted by Noah Misch, my initial cut at fixing bug #11638 didn't cover all cases where ANALYZE might be invoked in an unsafe context. We need to test the result of IsInTransactionChain not IsTransactionBlock; which is notationally a pain because IsInTransactionChain requires an isTopLevel flag, which would have to be passed down through several levels of callers. I chose to pass in_outer_xact (ie, the result of IsInTransactionChain) rather than isTopLevel per se, as that seemed marginally more apropos for the intermediate functions to know about.	2014-10-30 13:04:13 -04:00
Robert Haas	1c49dae165	"Pin", rather than "keep", dynamic shared memory mappings and segments. Nobody seemed concerned about this naming when it originally went in, but there's a pending patch that implements the opposite of dsm_keep_mapping, and the term "unkeep" was judged unpalatable. "unpin" has existing precedent in the PostgreSQL code base, and the English language, so use this terminology instead. Per discussion, back-patch to 9.4.	2014-10-30 11:44:22 -04:00
Andres Freund	5607e996f4	Flush unlogged table's buffers when copying or moving databases. CREATE DATABASE and ALTER DATABASE .. SET TABLESPACE copy the source database directory on the filesystem level. To ensure the on disk state is consistent they block out users of the affected database and force a checkpoint to flush out all data to disk. Unfortunately, up to now, that checkpoint didn't flush out dirty buffers from unlogged relations. That bug means there could be leftover dirty buffers in either the template database, or the database in its old location. Leading to problems when accessing relations in an inconsistent state; and to possible problems during shutdown in the SET TABLESPACE case because buffers belonging files that don't exist anymore are flushed. This was reported in bug #10675 by Maxim Boguk. Fix by Pavan Deolasee, modified somewhat by me. Reviewed by MauMau and Fujii Masao. Backpatch to 9.1 where unlogged tables were introduced.	2014-10-20 23:45:20 +02:00
Andrew Dunstan	6e0a053a96	Correct volatility markings of a few json functions. json_agg and json_object_agg and their associated transition functions should have been marked as stable rather than immutable, as they call IO functions indirectly. Changing this probably isn't going to make much difference, as you can't use an aggregate function in an index expression, but we should be correct nevertheless. json_object, on the other hand, should be marked immutable rather than stable, as it does not call IO functions. As discussed on -hackers, this change is being made without bumping the catalog version, as we don't want to do that at this stage of the cycle, and the changes are very unlikely to affect anyone.	2014-10-20 14:55:35 -04:00
Tom Lane	b45f048ed8	Declare mkdtemp() only if we're providing it. Follow our usual style of providing an "extern" for a standard library function only when we're also providing the implementation. This avoids issues when the system headers declare the function slightly differently than we do, as noted by Caleb Welton. We might have to go to the extent of probing to see if the system headers declare the function, but let's not do that until it's demonstrated to be necessary. Oversight in commit `9e6b1bf258`. Back-patch to all supported branches, as that was.	2014-10-17 22:55:23 -04:00
Tom Lane	4b3b44b141	Support timezone abbreviations that sometimes change. Up to now, PG has assumed that any given timezone abbreviation (such as "EDT") represents a constant GMT offset in the usage of any particular region; we had a way to configure what that offset was, but not for it to be changeable over time. But, as with most things horological, this view of the world is too simplistic: there are numerous regions that have at one time or another switched to a different GMT offset but kept using the same timezone abbreviation. Almost the entire Russian Federation did that a few years ago, and later this month they're going to do it again. And there are similar examples all over the world. To cope with this, invent the notion of a "dynamic timezone abbreviation", which is one that is referenced to a particular underlying timezone (as defined in the IANA timezone database) and means whatever it currently means in that zone. For zones that use or have used daylight-savings time, the standard and DST abbreviations continue to have the property that you can specify standard or DST time and get that time offset whether or not DST was theoretically in effect at the time. However, the abbreviations mean what they meant at the time in question (or most recently before that time) rather than being absolutely fixed. The standard abbreviation-list files have been changed to use this behavior for abbreviations that have actually varied in meaning since 1970. The old simple-numeric definitions are kept for abbreviations that have not changed, since they are a bit faster to resolve. While this is clearly a new feature, it seems necessary to back-patch it into all active branches, because otherwise use of Russian zone abbreviations is going to become even more problematic than it already was. This change supersedes the changes in commit `513d06ded` et al to modify the fixed meanings of the Russian abbreviations; since we've not shipped that yet, this will avoid an undesirably incompatible (not to mention incorrect) change in behavior for timestamps between 2011 and 2014. This patch makes some cosmetic changes in ecpglib to keep its usage of datetime lookup tables as similar as possible to the backend code, but doesn't do anything about the increasingly obsolete set of timezone abbreviation definitions that are hard-wired into ecpglib. Whatever we do about that will likely not be appropriate material for back-patching. Also, a potential free() of a garbage pointer after an out-of-memory failure in ecpglib has been fixed. This patch also fixes pre-existing bugs in DetermineTimeZoneOffset() that caused it to produce unexpected results near a timezone transition, if both the "before" and "after" states are marked as standard time. We'd only ever thought about or tested transitions between standard and DST time, but that's not what's happening when a zone simply redefines their base GMT offset. In passing, update the SGML documentation to refer to the Olson/zoneinfo/ zic timezone database as the "IANA" database, since it's now being maintained under the auspices of IANA.	2014-10-16 15:22:13 -04:00
Tom Lane	9bb6b7c5ed	Print planning time only in EXPLAIN ANALYZE, not plain EXPLAIN. We've gotten enough push-back on that change to make it clear that it wasn't an especially good idea to do it like that. Revert plain EXPLAIN to its previous behavior, but keep the extra output in EXPLAIN ANALYZE. Per discussion. Internally, I set this up as a separate flag ExplainState.summary that controls printing of planning time and execution time. For now it's just copied from the ANALYZE option, but we could consider exposing it to users.	2014-10-15 18:50:16 -04:00
Tom Lane	abc1a8e509	Stamp 9.4beta3.	2014-10-06 14:32:17 -04:00
Tom Lane	07afbca2e7	Fix some more problems with nested append relations. As of commit `a87c72915` (which later got backpatched as far as 9.1), we're explicitly supporting the notion that append relations can be nested; this can occur when UNION ALL constructs are nested, or when a UNION ALL contains a table with inheritance children. Bug #11457 from Nelson Page, as well as an earlier report from Elvis Pranskevichus, showed that there were still nasty bugs associated with such cases: in particular the EquivalenceClass mechanism could try to generate "join" clauses connecting an appendrel child to some grandparent appendrel, which would result in assertion failures or bogus plans. Upon investigation I concluded that all current callers of find_childrel_appendrelinfo() need to be fixed to explicitly consider multiple levels of parent appendrels. The most complex fix was in processing of "broken" EquivalenceClasses, which are ECs for which we have been unable to generate all the derived equality clauses we would like to because of missing cross-type equality operators in the underlying btree operator family. That code path is more or less entirely untested by the regression tests to date, because no standard opfamilies have such holes in them. So I wrote a new regression test script to try to exercise it a bit, which turned out to be quite a worthwhile activity as it exposed existing bugs in all supported branches. The present patch is essentially the same as far back as 9.2, which is where parameterized paths were introduced. In 9.0 and 9.1, we only need to back-patch a small fragment of commit `5b7b5518d`, which fixes failure to propagate out the original WHERE clauses when a broken EC contains constant members. (The regression test case results show that these older branches are noticeably stupider than 9.2+ in terms of the quality of the plans generated; but we don't really care about plan quality in such cases, only that the plan not be outright wrong. A more invasive fix in the older branches would not be a good idea anyway from a plan-stability standpoint.)	2014-10-01 19:31:18 -04:00
Heikki Linnakangas	fc0acf4387	Remove num_xloginsert_locks GUC, replace with a #define I left the GUC in place for the beta period, so that people could experiment with different values. No-one's come up with any data that a different value would be better under some circumstances, so rather than try to document to users what the GUC, let's just hard-code the current value, 8.	2014-10-01 16:40:03 +03:00

1 2 3 4 5 ...

6485 commits