postgresql

mirror of https://github.com/postgres/postgres.git synced 2026-03-12 05:32:27 -04:00

Author	SHA1	Message	Date
Peter Eisentraut	d380d080fa	Fix some null pointer dereferences in LDAP auth code An LDAP URL without a host name such as "ldap://" or without a base DN such as "ldap://localhost" would cause a crash when reading pg_hba.conf. If no binddn is configured, an error message might end up trying to print a null pointer, which could crash on some platforms. Author: Thomas Munro <thomas.munro@enterprisedb.com> Reviewed-by: Michael Paquier <michael.paquier@gmail.com>	2017-11-10 14:29:13 -05:00
Tom Lane	3c966ce653	Fix typo in ALTER SYSTEM output. The header comment written into postgresql.auto.conf by ALTER SYSTEM should match what initdb put there originally. Feike Steenbergen Discussion: https://postgr.es/m/CAK_s-G0KcKdO=0hqZkwb3s+tqZuuHwWqmF5BDsmoO9FtX75r0g@mail.gmail.com	2017-11-09 11:57:32 -05:00
Tom Lane	b89ad383da	Fix two violations of the ResourceOwnerEnlarge/Remember protocol. The point of having separate ResourceOwnerEnlargeFoo and ResourceOwnerRememberFoo functions is so that resource allocation can happen in between. Doing it in some other order is just wrong. OpenTemporaryFile() did open(), enlarge, remember, which would leak the open file if the enlarge step ran out of memory. Because fd.c has its own layer of resource-remembering, the consequences look like they'd be limited to an intratransaction FD leak, but it's still not good. IncrBufferRefCount() did enlarge, remember, incr-refcount, which would blow up if the incr-refcount step ever failed. It was safe enough when written, but since the introduction of PrivateRefCountHash, I think the assumption that no error could happen there is pretty shaky. The odds of real problems from either bug are probably small, but still, back-patch to supported branches. Thomas Munro and Tom Lane, per a comment from Andres Freund	2017-11-08 16:50:13 -05:00
Tom Lane	1a93c2536a	Fix unportable usage of <ctype.h> functions. isdigit(), isspace(), etc are likely to give surprising results if passed a signed char. We should always cast the argument to unsigned char to avoid that. Error in commit `63d6b97fd`, found by buildfarm member gaur. Back-patch to 9.3, like that commit.	2017-11-07 13:49:54 -05:00
Tom Lane	0a13f1966d	Stamp 9.6.6.	2017-11-06 17:08:55 -05:00
Tom Lane	38e825632b	Make json{b}_populate_recordset() use the right tuple descriptor. json{b}_populate_recordset() used the tuple descriptor created from the query-level AS clause without worrying about whether it matched the actual input record type. If it didn't, that would usually result in a crash, though disclosure of server memory contents seems possible as well, for a skilled attacker capable of issuing crafted SQL commands. Instead, use the query-supplied descriptor only when there is no input tuple to look at, and otherwise get a tuple descriptor based on the input tuple's own type marking. The core code will detect any type mismatch in the latter case. Michael Paquier and Tom Lane, per a report from David Rowley. Back-patch to 9.3 where this functionality was introduced. Security: CVE-2017-15098	2017-11-06 10:29:39 -05:00
Dean Rasheed	1f23d1cd21	Always require SELECT permission for ON CONFLICT DO UPDATE. The update path of an INSERT ... ON CONFLICT DO UPDATE requires SELECT permission on the columns of the arbiter index, but it failed to check for that in the case of an arbiter specified by constraint name. In addition, for a table with row level security enabled, it failed to check updated rows against the table's SELECT policies when the update path was taken (regardless of how the arbiter index was specified). Backpatch to 9.5 where ON CONFLICT DO UPDATE and RLS were introduced. Security: CVE-2017-15099	2017-11-06 09:16:24 +00:00
Noah Misch	971983f42f	Add a temp-install prerequisite to "check"-like targets not having one. Makefile.global assigns this prerequisite to every target named "check", but similar targets must mention it explicitly. Affected targets failed, tested $PATH binaries, or tested a stale temporary installation. The src/test/modules examples worked properly when called as "make -C src/test/modules/$FOO check", but "make -j" allowed the test to start before the temporary installation was in place. Back-patch to 9.5, where commit `dcae5facca` introduced the shared temp-install.	2017-11-05 18:52:38 -08:00
Peter Eisentraut	4077723cbf	Translation updates Source-Git-URL: git://git.postgresql.org/git/pgtranslation/messages.git Source-Git-Hash: 078119e4b862f81bbb7207eff6788efc99ec4d97	2017-11-05 17:01:24 -05:00
Noah Misch	1cac62dac0	Ignore CatalogSnapshot when checking COPY FREEZE prerequisites. This restores the ability, essentially lost in commit `ffaa44cb55`, to use COPY FREEZE under REPEATABLE READ isolation. Back-patch to 9.4, like that commit. Reviewed by Tom Lane. Discussion: https://postgr.es/m/CA+TgmoahWDm-7fperBxzU9uZ99LPMUmEpSXLTw9TmrOgzwnORw@mail.gmail.com	2017-11-05 09:26:28 -08:00
Alvaro Herrera	bd8e2b3cf4	Fix BRIN summarization concurrent with extension If a process is extending a table concurrently with some BRIN summarization process, it is possible for the latter to miss pages added by the former because the number of pages is computed ahead of time. Fix by determining a fresh relation size after inserting the placeholder tuple: any process that further extends the table concurrently will update the placeholder tuple, while previous pages will be processed by the heap scan. Reported-by: Tomas Vondra Reviewed-by: Tom Lane Author: Álvaro Herrera Discussion: https://postgr.es/m/083d996a-4a8a-0e13-800a-851dd09ad8cc@2ndquadrant.com Backpatch-to: 9.5	2017-11-03 17:23:13 +01:00
Michael Meskes	6cf68e2236	Improve error message for incorrect number inputs in libecpg.	2017-11-03 12:41:23 +01:00
Michael Meskes	049dab0094	Fix float parsing in ecpg INFORMIX mode.	2017-11-02 20:51:13 +01:00
Tom Lane	a43cd427e7	Fix corner-case errors in brin_doupdate(). In some cases the BRIN code releases lock on an index page, and later re-acquires lock and tries to check that the tuple it was working on is still there. That check was a couple bricks shy of a load. It didn't consider that the page might have turned into a "revmap" page. (The samepage code path doesn't call brin_getinsertbuffer(), so it isn't protected by the checks for revmap status there.) It also didn't check whether the tuple offset was now off the end of the linepointer array. Since commit `24992c6db` the latter case is pretty common, but at least in principle it could have occurred before that. The net result is that concurrent updates of a BRIN index could fail with errors like "invalid index offnum" or "inconsistent range map". Per report from Tomas Vondra. Back-patch to 9.5, since this code is substantially the same in all versions containing BRIN. Discussion: https://postgr.es/m/10d2b9f9-f427-03b8-8ad9-6af4ecacbee9@2ndquadrant.com	2017-11-02 12:54:23 -04:00
Alvaro Herrera	08ba67d596	Revert bogus fixes of HOT-freezing bug It turns out we misdiagnosed what the real problem was. Revert the previous changes, because they may have worse consequences going forward. A better fix is forthcoming. The simplistic test case is kept, though disabled. Discussion: https://postgr.es/m/20171102112019.33wb7g5wp4zpjelu@alap3.anarazel.de	2017-11-02 15:51:05 +01:00
Peter Eisentraut	4ba0ffaaee	pg_basebackup: Fix comparison handling of tablespace mappings on Windows A candidate path needs to be canonicalized before being checked against the mappings, because the mappings are also canonicalized. This is especially relevant on Windows Reported-by: nb <nbedxp@gmail.com> Author: Michael Paquier <michael.paquier@gmail.com> Reviewed-by: Ashutosh Sharma <ashu.coek88@gmail.com>	2017-11-01 21:44:55 -04:00
Michael Meskes	e0ec1cbffe	Make sure ecpglib does accepts digits behind decimal point even for integers in Informix mode. Spotted and fixed by 高增琦 <pgf00a@gmail.com>	2017-11-01 13:40:50 +01:00
Robert Haas	f74f871b80	Fix problems with the "role" GUC and parallel query. Without this fix, dropping a role can sometimes result in parallel query failures in sessions that have used "SET ROLE" to assume the dropped role, even if that setting isn't active any more. Report by Pavan Deolasee. Patch by Amit Kapila, reviewed by me. Discussion: http://postgr.es/m/CABOikdOomRcZsLsLK+Z+qENM1zxyaWnAvFh3MJZzZnnKiF+REg@mail.gmail.com	2017-10-29 13:14:37 +05:30
Tom Lane	21daada10e	Dept of second thoughts: keep aliasp_item in sync with tlistitem. Commit `d5b760ecb` wasn't quite right, on second thought: if the caller didn't ask for column names then it would happily emit more Vars than if the caller did ask for column names. This is surely not a good idea. Advance the aliasp_item whether or not we're preparing a colnames list.	2017-10-27 18:16:25 -04:00
Tom Lane	7e5e8b36d4	Fix crash when columns have been added to the end of a view. expandRTE() supposed that an RTE_SUBQUERY subquery must have exactly as many non-junk tlist items as the RTE has column aliases for it. This was true at the time the code was written, and is still true so far as parse analysis is concerned --- but when the function is used during planning, the subquery might have appeared through insertion of a view that now has more columns than it did when the outer query was parsed. This results in a core dump if, for instance, we have to expand a whole-row Var that references the subquery. To avoid crashing, we can either stop expanding the RTE when we run out of aliases, or invent new aliases for the added columns. While the latter might be more useful, the former is consistent with what expandRTE() does for composite-returning functions in the RTE_FUNCTION case, so it seems like we'd better do it that way. Per bug #14876 from Samuel Horwitz. This has been busted since commit `ff1ea2173` allowed views to acquire more columns, so back-patch to all supported branches. Discussion: https://postgr.es/m/20171026184035.1471.82810@wrigleys.postgresql.org	2017-10-27 17:10:21 -04:00
Tom Lane	cf0331a54c	Rethink the dependencies recorded for FieldSelect/FieldStore nodes. On closer investigation, commits `f3ea3e3e8` et al were a few bricks shy of a load. What we need is not so much to lock down the result type of a FieldSelect, as to lock down the existence of the column it's trying to extract. Otherwise, we can break it by dropping that column. The dependency on the result type is then held indirectly through the column, and doesn't need to be recorded explicitly. Out of paranoia, I left in the code to record a dependency on the result type, but it's used only if we can't identify the pg_class OID for the column. That shouldn't ever happen right now, AFAICS, but it seems possible that in future the input node could be marked as being of type RECORD rather than some specific composite type. Likewise for FieldStore. Like the previous patch, back-patch to all supported branches. Discussion: https://postgr.es/m/22571.1509064146@sss.pgh.pa.us	2017-10-27 12:18:57 -04:00
Robert Haas	036b6bd503	Fix mistaken failure to allow parallelism in corner case. If we try to run a parallel plan in serial mode because, for example, it's going to be scanned via a cursor, but for some reason we're already in parallel mode (for example because an outer query is running in parallel), we'd incorrectly try to launch workers. Fix by adding a flag to the EState, so that we can be certain that ExecutePlan() and ExecGather()/ExecGatherMerge() will have the same idea about whether we are executing serially or in parallel. Report and fix by Amit Kapila with help from Kuntal Ghosh. A few tweaks by me. Discussion: http://postgr.es/m/CAA4eK1+_BuZrmVCeua5Eqnm4Co9DAXdM5HPAOE2J19ePbR912Q@mail.gmail.com	2017-10-27 16:12:16 +02:00
Tom Lane	37b4e0fe99	Make setrefs.c match by ressortgroupref even for plain Vars. Previously, we skipped using search_indexed_tlist_for_sortgroupref() if the tlist expression being sought in the child plan node was merely a Var. This is purely an optimization, based on the theory that search_indexed_tlist_for_var() is faster, and one copy of a Var should be as good as another. However, the GROUPING SETS patch broke the latter assumption: grouping columns containing the "same" Var can sometimes have different outputs, as shown in the test case added here. So do it the hard way whenever a ressortgroupref marking exists. (If this seems like a bottleneck, we could imagine building a tlist index data structure for ressortgroupref values, as we do for Vars. But I'll let that idea go until there's some evidence it's worthwhile.) Back-patch to 9.6. The problem also exists in 9.5 where GROUPING SETS came in, but this patch is insufficient to resolve the problem in 9.5: there is some obscure dependency on the upper-planner-pathification work that happened in 9.6. Given that this is such a weird corner case, and no end users have complained about it, it doesn't seem worth the work to develop a fix for 9.5. Patch by me, per a report from Heikki Linnakangas. (This does not fix Heikki's original complaint, just the follow-on one.) Discussion: https://postgr.es/m/aefc657e-edb2-64d5-6df1-a0828f6e9104@iki.fi	2017-10-26 12:17:40 -04:00
Andrew Dunstan	ea8480a249	Improve gendef.pl diagnostic on failure to open sym file There have been numerous buildfarm failures but the diagnostic is currently silent about the reason for failure to open the file. Let's see if we can get to the bottom of it. Backpatch to all live branches.	2017-10-26 10:10:37 -04:00
Michael Meskes	41753604b3	Fixed handling of escape character in libecpg. Patch by Tsunakawa Takayuki <tsunakawa.takay@jp.fujitsu.com>	2017-10-26 10:39:46 +02:00
Tom Lane	7dc66a2f6e	Fix libpq to not require user's home directory to exist. Some people like to run libpq-using applications in environments where there's no home directory. We've broken that scenario before (cf commits `5b4067798` and `bd58d9d88`), and commit `ba005f193` broke it again, by making it a hard error if we fail to get the home directory name while looking for ~/.pgpass. The previous precedent is that if we can't get the home directory name, we should just silently act as though the file we hoped to find there doesn't exist. Rearrange the new code to honor that. Looking around, the service-file code added by commit `41a4e4595` had the same disease. Apparently, that escaped notice because it only runs when a service name has been specified, which I guess the people who use this scenario don't do. Nonetheless, it's wrong too, so fix that case as well. Add a comment about this policy to pqGetHomeDirectory, in the probably vain hope of forestalling the same error in future. And upgrade the rather miserable commenting in parseServiceInfo, too. In passing, also back off parseServiceInfo's assumption that only ENOENT is an ignorable error from stat() when checking a service file. We would need to ignore at least ENOTDIR as well (cf `5b4067798`), and seeing that the far-better-tested code for ~/.pgpass treats all stat() failures alike, I think this code ought to as well. Per bug #14872 from Dan Watson. Back-patch the .pgpass change to v10 where `ba005f193` came in. The service-file bugs are far older, so back-patch the other changes to all supported branches. Discussion: https://postgr.es/m/20171025200457.1471.34504@wrigleys.postgresql.org	2017-10-25 19:32:24 -04:00
Andrew Dunstan	98efa5ebf0	Process variadic arguments consistently in json functions json_build_object and json_build_array and the jsonb equivalents did not correctly process explicit VARIADIC arguments. They are modified to use the new extract_variadic_args() utility function which abstracts away the details of the call method. Michael Paquier, reviewed by Tom Lane and Dmitry Dolgov. Backpatch to 9.5 for the jsonb fixes and 9.4 for the json fixes, as that's where they originated.	2017-10-25 07:48:36 -04:00
Andrew Dunstan	5c3a1bbb47	Add a utility function to extract variadic function arguments This is epecially useful in the case or "VARIADIC ANY" functions. The caller can get the artguments and types regardless of whether or not and explicit VARIADIC array argument has been used. The function also provides an option to convert arguments on type "unknown" to to "text". Michael Paquier and me, reviewed by Tom Lane. Backpatch to 9.4 in order to support the following json bug fix.	2017-10-25 07:19:59 -04:00
Tom Lane	fae550e526	Update time zone data files to tzdata release 2017c. DST law changes in Fiji, Namibia, Northern Cyprus, Sudan, Tonga, and Turks & Caicos Islands. Historical corrections for Alaska, Apia, Burma, Calcutta, Detroit, Ireland, Namibia, and Pago Pago.	2017-10-23 18:15:47 -04:00
Tom Lane	173b7a4a71	Sync our copy of the timezone library with IANA release tzcode2017c. This is a trivial update containing only cosmetic changes. The point is just to get back to being synced with an official release of tzcode, rather than some ad-hoc point in their commit history, which is where commit `47f849a3c` left it.	2017-10-23 17:54:09 -04:00
Tom Lane	285b850d51	Fix some oversights in expression dependency recording. find_expr_references() neglected to record a dependency on the result type of a FieldSelect node, allowing a DROP TYPE to break a view or rule that contains such an expression. I think we'd omitted this case intentionally, reasoning that there would always be a related dependency ensuring that the DROP would cascade to the view. But at least with nested field selection expressions, that's not true, as shown in bug #14867 from Mansur Galiev. Add the dependency, and for good measure a dependency on the node's exposed collation. Likewise add a dependency on the result type of a FieldStore. I think here the reasoning was that it'd only appear within an assignment to a field, and the dependency on the field's column would be enough ... but having seen this example, I think that's wrong for nested-composites cases. Looking at nearby code, I notice we're not recording a dependency on the exposed collation of CoerceViaIO, which seems inconsistent with our choices for related node types. Maybe that's OK but I'm feeling suspicious of this code today, so let's add that; it certainly can't hurt. This patch does not do anything to protect already-existing views, only views created after it's installed. But seeing that the issue has been there a very long time and nobody noticed till now, that's probably good enough. Back-patch to all supported branches. Discussion: https://postgr.es/m/20171023150118.1477.19174@wrigleys.postgresql.org	2017-10-23 13:57:45 -04:00
Tom Lane	b1752c3a7e	Fix typcache's failure to treat ranges as container types. Like the similar logic for arrays and records, it's necessary to examine the range's subtype to decide whether the range type can support hashing. We can omit checking the subtype for btree-defined operations, though, since range subtypes are required to have those operations. (Possibly that simplification for btree cases led us to overlook that it does not apply for hash cases.) This is only an issue if the subtype lacks hash support, which is not true of any built-in range type, but it's easy to demonstrate a problem with a range type over, eg, money: you can get a "could not identify a hash function" failure when the planner is misled into thinking that hash join or aggregation would work. This was born broken, so back-patch to all supported branches.	2017-10-20 17:12:27 -04:00
Tom Lane	2ac5988747	Fix misparsing of non-newline-terminated pg_hba.conf files. This back-patches the v10-cycle commit `1e5a5d03d` into 9.3 - 9.6. I had noticed at the time that that was fixing a bug, namely that next_token() might advance *lineptr past the line-terminating '\0', but given the lack of field complaints I too easily convinced myself that the problem was only latent. It's not, because tokenize_file() decides whether there's more on the line using "strlen(lineptr)". The bug is indeed latent on a newline-terminated line, because then the newline-stripping bit in tokenize_file() means we'll have two or more consecutive '\0's in the buffer, masking the fact that we accidentally advanced over the first one. But the last line in the file might not be null-terminated, allowing the loop to see and process garbage, as reported by Mark Jones in bug #14859. The bug doesn't exist in <= 9.2; there next_token() is reading directly from a file, and termination of the outer loop relies on an feof() test not a buffer pointer check. Probably commit `7f49a67f9` can be blamed for this bug, but I didn't track it down exactly. Commit `1e5a5d03d` does a bit more than the minimum needed to fix the bug, but I felt the rest of it was good cleanup, so applying it all. Discussion: https://postgr.es/m/20171017141814.8203.27280@wrigleys.postgresql.org	2017-10-17 12:15:08 -04:00
Tom Lane	aa1e9b3a46	Fix AggGetAggref() so it won't lie to aggregate final functions. If we merge the transition calculations for two different aggregates, it's reasonable to assume that the transition function should not care which of those Aggref structs it gets from AggGetAggref(). It is not reasonable to make the same assumption about an aggregate final function, however. Commit `804163bc2` broke this, as it will pass whichever Aggref was first associated with the transition state in both cases. This doesn't create an observable bug so far as the core system is concerned, because the only existing uses of AggGetAggref() are in ordered-set aggregates that happen to not pay attention to anything but the input properties of the Aggref; and besides that, we disabled sharing of transition calculations for OSAs yesterday. Nonetheless, if some third-party code were using AggGetAggref() in a normal aggregate, they would be entitled to call this a bug. Hence, back-patch the fix to 9.6 where the problem was introduced. In passing, improve some of the comments about transition state sharing. Discussion: https://postgr.es/m/CAB4ELO5RZhOamuT9Xsf72ozbenDLLXZKSk07FiSVsuJNZB861A@mail.gmail.com	2017-10-12 15:20:04 -04:00
Tom Lane	96cfc7e19a	Prevent sharing transition states between ordered-set aggregates. This ought to work, but the built-in OSAs are not capable of coping, because their final-functions destructively modify their transition state (specifically, the contained tuplesort object). That was fine when those functions were written, but commit `804163bc2` moved the goalposts without telling orderedsetaggs.c. We should fix the built-in OSAs to support this, but it will take a little work, especially if we don't want to sacrifice performance in the normal non-shared-state case. Given that it took a year after 9.6 release for anyone to notice this bug, we should not prioritize sharable-state over nonsharable-state performance. And a proper fix is likely to be more complicated than we'd want to back-patch, too. Therefore, let's just put in this stop-gap patch to prevent nodeAgg.c from choosing to use shared state for OSAs. We can revert it in HEAD when we get a better fix. Report from Lukas Eder, diagnosis by me, patch by David Rowley. Back-patch to 9.6 where the problem was introduced. Discussion: https://postgr.es/m/CAB4ELO5RZhOamuT9Xsf72ozbenDLLXZKSk07FiSVsuJNZB861A@mail.gmail.com	2017-10-11 22:18:01 -04:00
Andres Freund	0da46d75e3	Prevent idle in transaction session timeout from sometimes being ignored. The previous coding in ProcessInterrupts() could lead to idle_in_transaction_session_timeout being ignored, when statement_timeout occurred earlier. The problem was that ProcessInterrupts() would return before processing the transaction timeout if QueryCancelPending was set while QueryCancelHoldoffCount != 0 - which is the case when reading new commands from the client. Ergo when the idle transaction timeout would hit. Fix that by removing the early return. Alternatively the transaction timeout code could have been moved up, but that early return seems like an issue that could hit other cases too. Author: Lukas Fittl Bug: #14821 Discussion: https://www.postgresql.org/message-id/20170921010956.17345.61461%40wrigleys.postgresql.org https://www.postgresql.org/message-id/CAP53PkxQnv3OWJpyNPGJYT62uY=n1=2CF_Lpc6gVOFnc0-gazw@mail.gmail.com Backpatch: 9.6-, where idle_in_transaction_session_timeout was introduced.	2017-10-11 14:02:41 -07:00
Tom Lane	19989d8480	Doc: fix missing explanation of default object privileges. The GRANT reference page, which lists the default privileges for new objects, failed to mention that USAGE is granted by default for data types and domains. As a lesser sin, it also did not specify anything about the initial privileges for sequences, FDWs, foreign servers, or large objects. Fix that, and add a comment to acldefault() in the probably vain hope of getting people to maintain this list in future. Noted by Laurenz Albe, though I editorialized on the wording a bit. Back-patch to all supported branches, since they all have this behavior. Discussion: https://postgr.es/m/1507620895.4152.1.camel@cybertec.at	2017-10-11 16:56:40 -04:00
Tom Lane	36c687a22f	Fix low-probability loss of NOTIFY messages due to XID wraparound. Up to now async.c has used TransactionIdIsInProgress() to detect whether a notify message's source transaction is still running. However, that function has a quick-exit path that reports that XIDs before RecentXmin are no longer running. If a listening backend is doing nothing but listening, and not running any queries, there is nothing that will advance its value of RecentXmin. Once 2 billion transactions elapse, the RecentXmin check causes active transactions to be reported as not running. If they aren't committed yet according to CLOG, async.c decides they aborted and discards their messages. The timing for that is a bit tight but it can happen when multiple backends are sending notifies concurrently. The net symptom therefore is that a sufficiently-long-surviving listen-only backend starts to miss some fraction of NOTIFY traffic, but only under heavy load. The only function that updates RecentXmin is GetSnapshotData(). A brute-force fix would therefore be to take a snapshot before processing incoming notify messages. But that would add cycles, as well as contention for the ProcArrayLock. We can be smarter: having taken the snapshot, let's use that to check for running XIDs, and not call TransactionIdIsInProgress() at all. In this way we reduce the number of ProcArrayLock acquisitions from one per message to one per notify interrupt; that's the same under light load but should be a benefit under heavy load. Light testing says that this change is a wash performance-wise for normal loads. I looked around for other callers of TransactionIdIsInProgress() that might be at similar risk, and didn't find any; all of them are inside transactions that presumably have already taken a snapshot. Problem report and diagnosis by Marko Tiikkaja, patch by me. Back-patch to all supported branches, since it's been like this since 9.0. Discussion: https://postgr.es/m/20170926182935.14128.65278@wrigleys.postgresql.org	2017-10-11 14:28:33 -04:00
Tom Lane	13a8924ecf	Increase distance between flush requests during bulk file copies. copy_file() reads and writes data 64KB at a time (with default BLCKSZ), and historically has issued a pg_flush_data request after each write. This turns out to interact really badly with macOS's new APFS file system: a large file copy takes over 100X longer than it ought to on APFS, as reported by Brent Dearth. While that's arguably a macOS bug, it's not clear whether Apple will do anything about it in the near future, and in any case experimentation suggests that issuing flushes a bit less often can be helpful on other platforms too. Hence, rearrange the logic in copy_file() so that flush requests are issued once per N writes rather than every time through the loop. I set the FLUSH_DISTANCE to 32MB on macOS (any less than that still results in a noticeable speed degradation on APFS), but 1MB elsewhere. In limited testing on Linux and FreeBSD, this seems slightly faster than the previous code, and certainly no worse. It helps noticeably on macOS even with the older HFS filesystem. A simpler change would have been to just increase the size of the copy buffer without changing the loop logic, but that seems likely to trash the processor cache without really helping much. Back-patch to 9.6 where we introduced msync() as an implementation option for pg_flush_data(). The problem seems specific to APFS's mmap/msync support, so I don't think we need to go further back. Discussion: https://postgr.es/m/CADkxhTNv-j2jw2g8H57deMeAbfRgYBoLmVuXkC=YCFBXRuCOww@mail.gmail.com	2017-10-08 15:25:26 -04:00
Tom Lane	185279da3f	Fix crash when logical decoding is invoked from a PL function. The logical decoding functions do BeginInternalSubTransaction and RollbackAndReleaseCurrentSubTransaction to clean up after themselves. It turns out that AtEOSubXact_SPI has an unrecognized assumption that we always need to cancel the active SPI operation in the SPI context that surrounds the subtransaction (if there is one). That's true when the RollbackAndReleaseCurrentSubTransaction call is coming from the SPI-using function itself, but not when it's happening inside some unrelated function invoked by a SPI query. In practice the affected callers are the various PLs. To fix, record the current subtransaction ID when we begin a SPI operation, and clean up only if that ID is the subtransaction being canceled. Also, remove AtEOSubXact_SPI's assertion that it must have cleaned up the surrounding SPI context's active tuptable. That's proven wrong by the same test case. Also clarify (or, if you prefer, reinterpret) the calling conventions for _SPI_begin_call and _SPI_end_call. The memory context cleanup in the latter means that these have always had the flavor of a matched resource-management pair, but they weren't documented that way before. Per report from Ben Chobot. Back-patch to 9.4 where logical decoding came in. In principle, the SPI changes should go all the way back, since the problem dates back to commit `7ec1c5a86`. But given the lack of field complaints it seems few people are using internal subtransactions in this way. So I don't feel a need to take any risks in 9.2/9.3. Discussion: https://postgr.es/m/73FBA179-C68C-4540-9473-71E865408B15@silentmedia.com	2017-10-06 19:18:58 -04:00
Tom Lane	69e931f96e	Fix access-off-end-of-array in clog.c. Sloppy loop coding in set_status_by_pages() resulted in fetching one array element more than it should from the subxids[] array. The odds of this resulting in SIGSEGV are pretty small, but we've certainly seen that happen with similar mistakes elsewhere. While at it, we can get rid of an extra TransactionIdToPage() calculation per loop. Per report from David Binderman. Back-patch to all supported branches, since this code is quite old. Discussion: https://postgr.es/m/HE1PR0802MB2331CBA919CBFFF0C465EB429C710@HE1PR0802MB2331.eurprd08.prod.outlook.com	2017-10-06 12:20:13 -04:00
Alvaro Herrera	d441cff142	Fix traversal of half-frozen update chains When some tuple versions in an update chain are frozen due to them being older than freeze_min_age, the xmax/xmin trail can become broken. This breaks HOT (and probably other things). A subsequent VACUUM can break things in more serious ways, such as leaving orphan heap-only tuples whose root HOT redirect items were removed. This can be seen because index creation (or REINDEX) complain like ERROR: XX000: failed to find parent tuple for heap-only tuple at (0,7) in table "t" Because of relfrozenxid contraints, we cannot avoid the freezing of the early tuples, so we must cope with the results: whenever we see an Xmin of FrozenTransactionId, consider it a match for whatever the previous Xmax value was. This problem seems to have appeared in 9.3 with multixact changes, though strictly speaking it seems unrelated. Since 9.4 we have commit `37484ad2a` "Change the way we mark tuples as frozen", so the fix is simple: just compare the raw Xmin (still stored in the tuple header, since freezing merely set an infomask bit) to the Xmax. But in 9.3 we rewrite the Xmin value to FrozenTransactionId, so the original value is lost and we have nothing to compare the Xmax with. To cope with that case we need to compare the Xmin with FrozenXid, assume it's a match, and hope for the best. Sadly, since you can pg_upgrade a 9.3 instance containing half-frozen pages to newer releases, we need to keep the old check in newer versions too, which seems a bit brittle; I hope we can somehow get rid of that. I didn't optimize the new function for performance. The new coding is probably a bit slower than before, since there is a function call rather than a straight comparison, but I'd rather have it work correctly than be fast but wrong. This is a followup after `20b6552242` fixed a few related problems. Apparently, in 9.6 and up there are more ways to get into trouble, but in 9.3 - 9.5 I cannot reproduce a problem anymore with this patch, so there must be a separate bug. Reported-by: Peter Geoghegan Diagnosed-by: Peter Geoghegan, Michael Paquier, Daniel Wood, Yi Wen Wong, Álvaro Discussion: https://postgr.es/m/CAH2-Wznm4rCrhFAiwKPWTpEw2bXDtgROZK7jWWGucXeH3D1fmA@mail.gmail.com	2017-10-06 17:14:42 +02:00
Robert Haas	9d742e19da	Fix more user-visible elog() calls. Michael Paquier discovered that this could be triggered via SQL; give a nicer message instead. Patch by Michael Paquier, reviewed by Masahiko Sawada. Discussion: http://postgr.es/m/CAB7nPqQtPg+LKKtzdKN26judHcvPZ0s1gNigzOT4j8CYuuuBYg@mail.gmail.com	2017-10-05 08:03:04 -04:00
Alvaro Herrera	86076395ef	Fix coding rules violations in walreceiver.c 1. Since commit `b1a9bad9e7` we had pstrdup() inside a spinlock-protected critical section; reported by Andreas Seltenreich. Turn those into strlcpy() to stack-allocated variables instead. Backpatch to 9.6. 2. Since commit `9ed551e0a4` we had a pfree() uselessly inside a spinlock-protected critical section. Tom Lane noticed in code review. Move down. Backpatch to 9.6. 3. Since commit `64233902d2` we had GetCurrentTimestamp() (a kernel call) inside a spinlock-protected critical section. Tom Lane noticed in code review. Move it up. Backpatch to 9.2. 4. Since commit `1bb2558046` we did elog(PANIC) while holding spinlock. Tom Lane noticed in code review. Release spinlock before dying. Backpatch to 9.2. Discussion: https://postgr.es/m/87h8vhtgj2.fsf@ansel.ydns.eu	2017-10-03 14:58:25 +02:00
Tom Lane	858e9f27be	Use a longer connection timeout in pg_isready test. Buildfarm members skink and sungazer have both recently failed this test, with symptoms indicating that the default 3-second timeout isn't quite enough for those very slow systems. There's no reason to be miserly with this timeout, so boost it to 60 seconds. Back-patch to all versions containing this test. That may be overkill, because the failure has only been observed in the v10 branch, but I don't feel like having to revisit this later.	2017-10-01 12:43:47 -04:00
Alvaro Herrera	3a135bdb1b	Fix freezing of a dead HOT-updated tuple Vacuum calls page-level HOT prune to remove dead HOT tuples before doing liveness checks (HeapTupleSatisfiesVacuum) on the remaining tuples. But concurrent transaction commit/abort may turn DEAD some of the HOT tuples that survived the prune, before HeapTupleSatisfiesVacuum tests them. This happens to activate the code that decides to freeze the tuple ... which resuscitates it, duplicating data. (This is especially bad if there's any unique constraints, because those are now internally violated due to the duplicate entries, though you won't know until you try to REINDEX or dump/restore the table.) One possible fix would be to simply skip doing anything to the tuple, and hope that the next HOT prune would remove it. But there is a problem: if the tuple is older than freeze horizon, this would leave an unfrozen XID behind, and if no HOT prune happens to clean it up before the containing pg_clog segment is truncated away, it'd later cause an error when the XID is looked up. Fix the problem by having the tuple freezing routines cope with the situation: don't freeze the tuple (and keep it dead). In the cases that the XID is older than the freeze age, set the HEAP_XMAX_COMMITTED flag so that there is no need to look up the XID in pg_clog later on. An isolation test is included, authored by Michael Paquier, loosely based on Daniel Wood's original reproducer. It only tests one particular scenario, though, not all the possible ways for this problem to surface; it be good to have a more reliable way to test this more fully, but it'd require more work. In message https://postgr.es/m/20170911140103.5akxptyrwgpc25bw@alvherre.pgsql I outlined another test case (more closely matching Dan Wood's) that exposed a few more ways for the problem to occur. Backpatch all the way back to 9.3, where this problem was introduced by multixact juggling. In branches 9.3 and 9.4, this includes a backpatch of commit e5ff9fefcd50 (of 9.5 era), since the original is not correctable without matching the coding pattern in 9.5 up. Reported-by: Daniel Wood Diagnosed-by: Daniel Wood Reviewed-by: Yi Wen Wong, Michaël Paquier Discussion: https://postgr.es/m/E5711E62-8FDF-4DCA-A888-C200BF6B5742@amazon.com	2017-09-28 16:44:01 +02:00
Tom Lane	def03e4bfe	Fix behavior when converting a float infinity to numeric. float8_numeric() and float4_numeric() failed to consider the possibility that the input is an IEEE infinity. The results depended on the platform-specific behavior of sprintf(): on most platforms you'd get something like ERROR: invalid input syntax for type numeric: "inf" but at least on Windows it's possible for the conversion to succeed and deliver a finite value (typically 1), due to a nonstandard output format from sprintf and lack of syntax error checking in these functions. Since our numeric type lacks the concept of infinity, a suitable conversion is impossible; the best thing to do is throw an explicit error before letting sprintf do its thing. While at it, let's use snprintf not sprintf. Overrunning the buffer should be impossible if sprintf does what it's supposed to, but this is cheap insurance against a stack smash if it doesn't. Problem reported by Taiki Kondo. Patch by me based on fix suggestion from KaiGai Kohei. Back-patch to all supported branches. Discussion: https://postgr.es/m/12A9442FBAE80D4E8953883E0B84E088C8C7A2@BPXM01GP.gisp.nec.co.jp	2017-09-27 17:05:53 -04:00
Tom Lane	f4aac742c8	Improve wording of error message added in commit `714805010`. Per suggestions from Peter Eisentraut and David Johnston. Back-patch, like the previous commit. Discussion: https://postgr.es/m/E1dv9jI-0006oT-Fn@gemulon.postgresql.org	2017-09-26 15:25:56 -04:00
Tom Lane	12ac252f90	Fix failure-to-read-man-page in commit `899bd785c`. posix_fallocate() is not quite a drop-in replacement for fallocate(), because it is defined to return the error code as its function result, not in "errno". I (tgl) missed this because RHEL6's version seems to set errno as well. That is not the case on more modern Linuxen, though, as per buildfarm results. Aside from fixing the return-convention confusion, remove the test for ENOSYS; we expect that glibc will mask that for posix_fallocate, though it does not for fallocate. Keep the test for EINTR, because POSIX specifies that as a possible result, and buildfarm results suggest that it can happen in practice. Back-patch to 9.4, like the previous commit. Thomas Munro Discussion: https://postgr.es/m/1002664500.12301802.1471008223422.JavaMail.yahoo@mail.yahoo.com	2017-09-26 13:43:07 -04:00
Tom Lane	1750612224	Avoid SIGBUS on Linux when a DSM memory request overruns tmpfs. On Linux, shared memory segments created with shm_open() are backed by swap files created in tmpfs. If the swap file needs to be extended, but there's no tmpfs space left, you get a very unfriendly SIGBUS trap. To avoid this, force allocation of the full request size when we create the segment. This adds a few cycles, but none that we wouldn't expend later anyway, assuming the request isn't hugely bigger than the actual need. Make this code #ifdef __linux__, because (a) there's not currently a reason to think the same problem exists on other platforms, and (b) applying posix_fallocate() to an FD created by shm_open() isn't very portable anyway. Back-patch to 9.4 where the DSM code came in. Thomas Munro, per a bug report from Amul Sul Discussion: https://postgr.es/m/1002664500.12301802.1471008223422.JavaMail.yahoo@mail.yahoo.com	2017-09-25 16:09:20 -04:00
Andrew Dunstan	10aafbdbe4	Support building with Visual Studio 2017 Haribabu Kommi, reviewed by Takeshi Ideriha and Christian Ullrich Backpatch to 9.6	2017-09-25 08:08:52 -04:00
Peter Eisentraut	a1f30ecc51	Fix saving and restoring umask In two cases, we set a different umask for some piece of code and restore it afterwards. But if the contained code errors out, the umask is not restored. So add TRY/CATCH blocks to fix that.	2017-09-23 10:03:36 -04:00
Tom Lane	e25f4401da	Sync our copy of the timezone library with IANA tzcode master. This patch absorbs a few unreleased fixes in the IANA code. It corresponds to commit 2d8b944c1cec0808ac4f7a9ee1a463c28f9cd00a in https://github.com/eggert/tz. Non-cosmetic changes include: TZDEFRULESTRING is updated to match current US DST practice, rather than what it was over ten years ago. This only matters for interpretation of POSIX-style zone names (e.g., "EST5EDT"), and only if the timezone database doesn't include either an exact match for the zone name or a "posixrules" entry. The latter should not be true in any current Postgres installation, but this could possibly matter when using --with-system-tzdata. Get rid of a nonportable use of "++var" on a bool var. This is part of a larger fix that eliminates some vestigial support for consecutive leap seconds, and adds checks to the "zic" compiler that the data files do not specify that. Remove a couple of ancient compatibility hacks. The IANA crew think these are obsolete, and I tend to agree. But perhaps our buildfarm will think different. Back-patch to all supported branches, in line with our policy that all branches should be using current IANA code. Before v10, this includes application of current pgindent rules, to avoid whitespace problems in future back-patches. Discussion: https://postgr.es/m/E1dsWhf-0000pT-F9@gemulon.postgresql.org	2017-09-22 00:04:21 -04:00
Tom Lane	ea31541f56	Give a better error for duplicate entries in VACUUM/ANALYZE column list. Previously, the code didn't think about this case and would just try to analyze such a column twice. That would fail at the point of inserting the second version of the pg_statistic row, with obscure error messsages like "duplicate key value violates unique constraint" or "tuple already updated by self", depending on context and PG version. We could allow the case by ignoring duplicate column specifications, but it seems better to reject it explicitly. The bogus error messages seem like arguably a bug, so back-patch to all supported versions. Nathan Bossart, per a report from Michael Paquier, and whacked around a bit by me. Discussion: https://postgr.es/m/E061A8E3-5E3D-494D-94F0-E8A9B312BBFC@amazon.com	2017-09-21 18:13:11 -04:00
Michael Meskes	59b5a3e5c7	Fixed ECPG to correctly handle out-of-scope cursor declarations with pointers or array variables.	2017-09-18 23:07:34 +02:00
Tom Lane	86e4ebb9af	Allow rel_is_distinct_for() to look through RelabelType below OpExpr. This lets it do the right thing for, eg, varchar columns. Back-patch to 9.5 where this logic appeared. David Rowley, per report from Kim Rose Carlsen Discussion: https://postgr.es/m/VI1PR05MB17091F9A9876528055D6A827C76D0@VI1PR05MB1709.eurprd05.prod.outlook.com	2017-09-17 15:28:51 -04:00
Tom Lane	c0d21bdb86	Fix possible dangling pointer dereference in trigger.c. AfterTriggerEndQuery correctly notes that the query_stack could get repalloc'd during a trigger firing, but it nonetheless passes the address of a query_stack entry to afterTriggerInvokeEvents, so that if such a repalloc occurs, afterTriggerInvokeEvents is already working with an obsolete dangling pointer while it scans the rest of the events. Oops. The only code at risk is its "delete_ok" cleanup code, so we can prevent unsafe behavior by passing delete_ok = false instead of true. However, that could have a significant performance penalty, because the point of passing delete_ok = true is to not have to re-scan possibly a large number of dead trigger events on the next time through the loop. There's more than one way to skin that cat, though. What we can do is delete all the "chunks" in the event list except the last one, since we know all events in them must be dead. Deleting the chunks is work we'd have had to do later in AfterTriggerEndQuery anyway, and it ends up saving rescanning of just about the same events we'd have gotten rid of with delete_ok = true. In v10 and HEAD, we also have to be careful to mop up any per-table after_trig_events pointers that would become dangling. This is slightly annoying, but I don't think that normal use-cases will traverse this code path often enough for it to be a performance problem. It's pretty hard to hit this in practice because of the unlikelihood of the query_stack getting resized at just the wrong time. Nonetheless, it's definitely a live bug of ancient standing, so back-patch to all supported branches. Discussion: https://postgr.es/m/2891.1505419542@sss.pgh.pa.us	2017-09-17 14:50:01 -04:00
Robert Haas	353328ad1a	Add missing tags to GetCommandLogLevel. Otherwise, log_statement = 'ddl' causes errors if those statement types are used. Michael Paquier, reviewed by Ashutosh Sharma Discussion: http://postgr.es/m/CAB7nPqStC3HkE76Q1MnHsVd1vF1Td9zXApzYadzDMyLMRkkGrw@mail.gmail.com	2017-09-14 16:47:11 -04:00
Stephen Frost	caae416aac	Fix ordering in pg_dump of GRANTs The order in which GRANTs are output is important as GRANTs which have been GRANT'd by individuals via WITH GRANT OPTION GRANTs have to come after the GRANT which included the WITH GRANT OPTION. This happens naturally in the backend during normal operation as we only change existing ACLs in-place, only add new ACLs to the end, and when removing an ACL we remove any which depend on it also. Also, adjust the comments in acl.h to make this clear. Unfortunately, the updates to pg_dump to handle initial privileges involved pulling apart ACLs and then combining them back together and could end up putting them back together in an invalid order, leading to dumps which wouldn't restore. Fix this by adjusting the queries used by pg_dump to ensure that the ACLs are rebuilt in the same order in which they were originally. Back-patch to 9.6 where the changes for initial privileges were done.	2017-09-13 20:02:27 -04:00
Michael Meskes	407e660781	Changed order of statements and added an additiona MSVC safeguard to make ecpg thread test cases work on Windows.	2017-09-14 01:17:15 +02:00
Michael Meskes	839ee1811d	Make setlocale in ECPG test cases thread aware on Windows. Fix threaded test cases on Windows not to crash in setlocale() which can be global or local to a thread on Windows. Author: Christian Ullrich	2017-09-14 01:17:03 +02:00
Tom Lane	64e2b29bde	Fix RecursiveCopy.pm to cope with disappearing files. When copying from an active database tree, it's possible for files to be deleted after we see them in a readdir() scan but before we can open them. (Once we've got a file open, we don't expect any further errors from it getting unlinked, though.) Tweak RecursiveCopy so it can cope with this case, so as to avoid irreproducible test failures. Back-patch to 9.6 where this code was added. In v10 and HEAD, also remove unused "use RecursiveCopy" in one recovery test script. Michael Paquier and Tom Lane Discussion: https://postgr.es/m/24621.1504924323@sss.pgh.pa.us	2017-09-11 22:02:58 -04:00
Alvaro Herrera	bd75335a83	Fix translatable string Discussion: https://postgr.es/m/20170828130545.sdajqlpr37hmmd6a@alvherre.pgsql	2017-09-04 11:09:51 +02:00
Tom Lane	a4b0a4d437	Fix macro-redefinition warning on MSVC. In commit `9d6b160d7`, I tweaked pg_config.h.win32 to use "#define HAVE_LONG_LONG_INT_64 1" rather than defining it as empty, for consistency with what happens in an autoconf'd build. But Solution.pm injects another definition of that macro into ecpg_config.h, leading to justifiable (though harmless) compiler whining. Make that one consistent too. Back-patch, like the previous patch. Discussion: https://postgr.es/m/CAEepm=1dWsXROuSbRg8PbKLh0S=8Ou-V8sr05DxmJOF5chBxqQ@mail.gmail.com	2017-09-03 11:01:08 -04:00
Tom Lane	3a0f8e7d3f	Make [U]INT64CONST safe for use in #if conditions. Instead of using a cast to force the constant to be the right width, assume we can plaster on an L, UL, LL, or ULL suffix as appropriate. The old approach to this is very hoary, dating from before we were willing to require compilers to have working int64 types. This fix makes the PG_INT64_MIN, PG_INT64_MAX, and PG_UINT64_MAX constants safe to use in preprocessor conditions, where a cast doesn't work. Other symbolic constants that might be defined using [U]INT64CONST are likewise safer than before. Also fix the SIZE_MAX macro to be similarly safe, if we are forced to provide a definition for that. The test added in commit `2e70d6b5e` happens to do what we want even with the hack "(size_t) -1" definition, but we could easily get burnt on other tests in future. Back-patch to all supported branches, like the previous commits. Discussion: https://postgr.es/m/15883.1504278595@sss.pgh.pa.us	2017-09-01 15:14:18 -04:00
Tom Lane	e50d401a83	Ensure SIZE_MAX can be used throughout our code. Pre-C99 platforms may lack <stdint.h> and thereby SIZE_MAX. We have a couple of places using the hack "(size_t) -1" as a fallback, but it wasn't universally available; which means the code added in commit `2e70d6b5e` fails to compile everywhere. Move that hack to c.h so that we can rely on having SIZE_MAX everywhere. Per discussion, it'd be a good idea to make the macro's value safe for use in #if-tests, but that will take a bit more work. This is just a quick expedient to get the buildfarm green again. Back-patch to all supported branches, like the previous commit. Discussion: https://postgr.es/m/15883.1504278595@sss.pgh.pa.us	2017-09-01 13:52:53 -04:00
Tom Lane	bc95e5874a	Teach libpq to detect integer overflow in the row count of a PGresult. Adding more than 1 billion rows to a PGresult would overflow its ntups and tupArrSize fields, leading to client crashes. It'd be desirable to use wider fields on 64-bit machines, but because all of libpq's external APIs use plain "int" for row counters, that's going to be hard to accomplish without an ABI break. Given the lack of complaints so far, and the general pain that would be involved in using such huge PGresults, let's settle for just preventing the overflow and reporting a useful error message if it does happen. Also, for a couple more lines of code we can increase the threshold of trouble from INT_MAX/2 to INT_MAX rows. To do that, refactor pqAddTuple() to allow returning an error message that replaces the default assumption that it failed because of out-of-memory. Along the way, fix PQsetvalue() so that it reports all failures via pqInternalNotice(). It already did so in the case of bad field number, but neglected to report anything for other error causes. Because of the potential for crashes, this seems like a back-patchable bug fix, despite the lack of field reports. Michael Paquier, per a complaint from Igor Korot. Discussion: https://postgr.es/m/CA+FnnTxyLWyjY1goewmJNxC==HQCCF4fKkoCTa9qR36oRAHDPw@mail.gmail.com	2017-08-29 15:18:01 -04:00
Tom Lane	254bb39b72	Stamp 9.6.5.	2017-08-28 17:21:42 -04:00
Peter Eisentraut	8e80a5e25e	Translation updates Source-Git-URL: git://git.postgresql.org/git/pgtranslation/messages.git Source-Git-Hash: d8e8b1a6b85b2fc2d39dcf97f8f8ec436554cc91	2017-08-28 10:17:39 -04:00
Peter Eisentraut	0cd9071130	Fix outdated comment Author: Thomas Munro <thomas.munro@enterprisedb.com>	2017-08-23 14:20:18 -04:00
Peter Eisentraut	562ac27193	Fix translation marker This was erroneously removed in `55a70a023c`.	2017-08-23 09:58:38 -04:00
Andres Freund	6c036d0108	Backpatch introduction of TupleDescAttr(tupdesc, i). `2cd7084524` / `c6293249d` change the way individual attributes in a TupleDesc are stored / accessed. To reduce the effort of making extensions compatible with postgresql 11, and to ease future backpatching, backpatch introduction of TupleDescAttr() to all releases. Do not backpatch change in storage, as that'd be a breaking change for existing and working extensions. Author: Andres Freund Discussion: https://postgr.es/m/20170820181723.tdswdinzptbcwhrr@alap3.anarazel.de Backpatch: 9.2-	2017-08-22 07:47:42 -07:00
Tom Lane	41803d55a2	Fix possible core dump in parallel restore when using a TOC list. Commit `3eb9a5e7c` unintentionally introduced an ordering dependency into restore_toc_entries_prefork(). The existing coding of reduce_dependencies() contains a check to skip moving a TOC entry to the ready_list if it wasn't initially in the pending_list. This used to suffice to prevent reduce_dependencies() from trying to move anything into the ready_list during restore_toc_entries_prefork(), because the pending_list stayed empty throughout that phase; but it no longer does. The problem doesn't manifest unless the TOC has been reordered by SortTocFromFile, which is how I missed it in testing. To fix, just add a test for ready_list == NULL, converting the call with NULL from a poor man's sanity check into an explicit command not to touch TOC items' list membership. Clarify some of the comments around this; in particular, note the primary purpose of the check for pending_list membership, which is to ensure that we can't try to restore the same item twice, in case a TOC list forces it to be restored before its dependency count goes to zero. Per report from Fabrízio de Royes Mello. Back-patch to 9.3, like the previous commit. Discussion: https://postgr.es/m/CAFcNs+pjuv0JL_x4+=71TPUPjdLHOXA4YfT32myj_OrrZb4ohA@mail.gmail.com	2017-08-19 13:39:38 -04:00
Tom Lane	c343314882	Further tweaks to compiler flags for PL/Perl on Windows. It now emerges that we can only rely on Perl to tell us we must use -D_USE_32BIT_TIME_T if it's Perl 5.13.4 or later. For older versions, revert to our previous practice of assuming we need that symbol in all 32-bit Windows builds. This is not ideal, but inquiring into which compiler version Perl was built with seems far too fragile. In any case, we had not previously had complaints about these old Perl versions, so let's assume this is Good Enough. (It's still better than the situation ante commit `5a5c2feca`, in that at least the effects are confined to PL/Perl rather than the whole PG build.) Back-patch to all supported versions, like `5a5c2feca` and predecessors. Discussion: https://postgr.es/m/CANFyU97OVQ3+Mzfmt3MhuUm5NwPU=-FtbNH5Eb7nZL9ua8=rcA@mail.gmail.com	2017-08-17 13:14:06 -04:00
Michael Meskes	3d7a1e2b96	Changed ecpg parser to allow RETURNING clauses without attached C variables.	2017-08-16 13:28:14 +02:00
Michael Meskes	954490fecb	Allow continuation lines in ecpg cppline parsing.	2017-08-16 13:28:10 +02:00
Peter Eisentraut	d01fc51c00	Initialize replication_slot_catalog_xmin in procarray Although not confirmed and probably rare, if the newly allocated memory is not already zero, this could possibly have caused some problems. Also reorder the initializations slightly so they match the order of the struct definition. Author: Wong, Yi Wen <yiwong@amazon.com> Reviewed-by: Masahiko Sawada <sawada.mshk@gmail.com>	2017-08-15 21:06:11 -04:00
Peter Eisentraut	dce90c7c8c	Include foreign tables in information_schema.table_privileges This appears to have been an omission in the original commit `0d692a0dc9`. All related information_schema views already include foreign tables. Reported-by: Nicolas Thauvin <nicolas.thauvin@dalibo.com>	2017-08-15 19:31:06 -04:00
Peter Eisentraut	62c9eaf553	psql: Add tab completion for \pset pager Author: Pavel Stehule <pavel.stehule@gmail.com>	2017-08-15 19:11:37 -04:00
Peter Eisentraut	c4dd62db19	Fix whitespace	2017-08-15 10:17:35 -04:00
Tom Lane	624b6f328a	Handle elog(FATAL) during ROLLBACK more robustly. Stress testing by Andreas Seltenreich disclosed longstanding problems that occur if a FATAL exit (e.g. due to receipt of SIGTERM) occurs while we are trying to execute a ROLLBACK of an already-failed transaction. In such a case, xact.c is in TBLOCK_ABORT state, so that AbortOutOfAnyTransaction would skip AbortTransaction and go straight to CleanupTransaction. This led to an assert failure in an assert-enabled build (due to the ROLLBACK's portal still having a cleanup hook) or without assertions, to a FATAL exit complaining about "cannot drop active portal". The latter's not disastrous, perhaps, but it's messy enough to want to improve it. We don't really want to run all of AbortTransaction in this code path. The minimum required to clean up the open portal safely is to do AtAbort_Memory and AtAbort_Portals. It seems like a good idea to do AtAbort_Memory unconditionally, to be entirely sure that we are starting with a safe CurrentMemoryContext. That means that if the main loop in AbortOutOfAnyTransaction does nothing, we need an extra step at the bottom to restore CurrentMemoryContext = TopMemoryContext, which I chose to do by invoking AtCleanup_Memory. This'll result in calling AtCleanup_Memory twice in many of the paths through this function, but that seems harmless and reasonably inexpensive. The original motivation for the assertion in AtCleanup_Portals was that we wanted to be sure that any user-defined code executed as a consequence of the cleanup hook runs during AbortTransaction not CleanupTransaction. That still seems like a valid concern, and now that we've seen one case of the assertion firing --- which means that exactly that would have happened in a production build --- let's replace the Assert with a runtime check. If we see the cleanup hook still set, we'll emit a WARNING and just drop the hook unexecuted. This has been like this a long time, so back-patch to all supported branches. Discussion: https://postgr.es/m/877ey7bmun.fsf@ansel.ydns.eu	2017-08-14 15:43:20 -04:00
Tom Lane	3883be3eae	Absorb -D_USE_32BIT_TIME_T switch from Perl, if relevant. Commit 3c163a7fc's original choice to ignore all #define symbols whose names begin with underscore turns out to be too simplistic. On Windows, some Perl installations are built with -D_USE_32BIT_TIME_T, and we must absorb that or we get the wrong result for sizeof(PerlInterpreter). This effectively re-reverts commit `ef58b87df`, which injected that symbol in a hacky way, making it apply to all of Postgres not just PL/Perl. More significantly, it did so on all 32-bit Windows builds, even when the Perl build to be used did not select this option; so that it fails to work properly with some newer Perl builds. By making this change, we would be introducing an ABI break in 32-bit Windows builds; but fortunately we have not used type time_t in any exported Postgres APIs in a long time. So it should be OK, both for PL/Perl itself and for third-party extensions, if an extension library is built with a different _USE_32BIT_TIME_T setting than the core code. Patch by me, based on research by Ashutosh Sharma and Robert Haas. Back-patch to all supported branches, as commit `3c163a7fc` was. Discussion: https://postgr.es/m/CANFyU97OVQ3+Mzfmt3MhuUm5NwPU=-FtbNH5Eb7nZL9ua8=rcA@mail.gmail.com	2017-08-14 11:48:59 -04:00
Tom Lane	a64b5a9927	Remove AtEOXact_CatCache(). The sole useful effect of this function, to check that no catcache entries have positive refcounts at transaction end, has really been obsolete since we introduced ResourceOwners in PG 8.1. We reduced the checks to assertions years ago, so that the function was a complete no-op in production builds. There have been previous discussions about removing it entirely, but consensus up to now was that it had some small value as a cross-check for bugs in the ResourceOwner logic. However, it now emerges that it's possible to trigger these assertions if you hit an assert-enabled backend with SIGTERM during a call to SearchCatCacheList, because that function temporarily increases the refcounts of entries it's intending to add to a catcache list construct. In a normal ERROR scenario, the extra refcounts are cleaned up by SearchCatCacheList's PG_CATCH block; but in a FATAL exit we do a transaction abort and exit without ever executing PG_CATCH handlers. There's a case to be made that this is a generic hazard and we should consider restructuring elog(FATAL) handling so that pending PG_CATCH handlers do get run. That's pretty scary though: it could easily create more problems than it solves. Preliminary stress testing by Andreas Seltenreich suggests that there are not many live problems of this ilk, so we rejected that idea. There are more-localized ways to fix the problem; the most principled one would be to use PG_ENSURE_ERROR_CLEANUP instead of plain PG_TRY. But adding cycles to SearchCatCacheList isn't very appealing. We could also weaken the assertions in AtEOXact_CatCache in some more or less ad-hoc way, but that just makes its raison d'etre even less compelling. In the end, the most reasonable solution seems to be to just remove AtEOXact_CatCache altogether, on the grounds that it's not worth trying to fix it. It hasn't found any bugs for us in many years. Per report from Jeevan Chalke. Back-patch to all supported branches. Discussion: https://postgr.es/m/CAM2+6=VEE30YtRQCZX7_sCFsEpoUkFBV1gZazL70fqLn8rcvBA@mail.gmail.com	2017-08-13 16:15:14 -04:00
Tom Lane	e2e398473e	Fix handling of container types in find_composite_type_dependencies. find_composite_type_dependencies correctly found columns that are of the specified type, and columns that are of arrays of that type, but not columns that are domains or ranges over the given type, its array type, etc. The most general way to handle this seems to be to assume that any type that is directly dependent on the specified type can be treated as a container type, and processed recursively (allowing us to handle nested cases such as ranges over domains over arrays ...). Since a type's array type already has such a dependency, we can drop the existing special case for the array type. The very similar logic in get_rels_with_domain was likewise a few bricks shy of a load, as it supposed that a directly dependent type could only be a sub-domain. This is already wrong for ranges over domains, and it'll someday be wrong for arrays over domains. Add test cases illustrating the problems, and back-patch to all supported branches. Discussion: https://postgr.es/m/15268.1502309024@sss.pgh.pa.us	2017-08-09 17:03:09 -04:00
Tom Lane	fe578cbd4b	Fix datumSerialize infrastructure to not crash on non-varlena data. Commit `1efc7e538` did a poor job of emulating existing logic for touching Datums that might be expanded-object pointers. It didn't check for typlen being -1 first, which meant it could crash on fixed-length pass-by-ref values, and probably on cstring values as well. It also didn't use DatumGetPointer before VARATT_IS_EXTERNAL_EXPANDED, which while currently harmless is not according to documentation nor prevailing style. I also think the lack of any explanation as to why datumSerialize makes these particular nonobvious choices is pretty awful, so fix that. Per report from Jarred Ward. Back-patch to 9.6 where this code came in. Discussion: https://postgr.es/m/6F61E6D2-2F5E-4794-9479-A429BE1CEA4B@simple.com	2017-08-08 19:18:23 -04:00
Alvaro Herrera	1924ef4166	Reword some unclear comments	2017-08-08 18:48:25 -04:00
Tom Lane	eca2f8a7dd	Stamp 9.6.4.	2017-08-07 17:10:58 -04:00
Peter Eisentraut	aa0f366d55	Translation updates Source-Git-URL: git://git.postgresql.org/git/pgtranslation/messages.git Source-Git-Hash: d81b8e4ab322171b7ea691c01513ede1cf398404	2017-08-07 13:42:43 -04:00
Tom Lane	52a414387e	Require update permission for the large object written by lo_put(). lo_put() surely should require UPDATE permission, the same as lowrite(), but it failed to check for that, as reported by Chapman Flack. Oversight in commit c50b7c09d; backpatch to 9.4 where that was introduced. Tom Lane and Michael Paquier Security: CVE-2017-7548	2017-08-07 10:19:20 -04:00
Noah Misch	1560996303	Again match pg_user_mappings to information_schema.user_mapping_options. Commit `3eefc51053` claimed to make pg_user_mappings enforce the qualifications user_mapping_options had been enforcing, but its removal of a longstanding restriction left them distinct when the current user is the subject of a mapping yet has no server privileges. user_mapping_options emits no rows for such a mapping, but pg_user_mappings includes full umoptions. Change pg_user_mappings to show null for umoptions. Back-patch to 9.2, like the above commit. Reviewed by Tom Lane. Reported by Jeff Janes. Security: CVE-2017-7547	2017-08-07 07:09:31 -07:00
Heikki Linnakangas	f6fc72cb69	Don't allow logging in with empty password. Some authentication methods allowed it, others did not. In the client-side, libpq does not even try to authenticate with an empty password, which makes using empty passwords hazardous: an administrator might think that an account with an empty password cannot be used to log in, because psql doesn't allow it, and not realize that a different client would in fact allow it. To clear that confusion and to be be consistent, disallow empty passwords in all authentication methods. All the authentication methods that used plaintext authentication over the wire, except for BSD authentication, already checked that the password received from the user was not empty. To avoid forgetting it in the future again, move the check to the recv_password_packet function. That only forbids using an empty password with plaintext authentication, however. MD5 and SCRAM need a different fix: * In stable branches, check that the MD5 hash stored for the user does not not correspond to an empty string. This adds some overhead to MD5 authentication, because the server needs to compute an extra MD5 hash, but it is not noticeable in practice. * In HEAD, modify CREATE and ALTER ROLE to clear the password if an empty string, or a password hash that corresponds to an empty string, is specified. The user-visible behavior is the same as in the stable branches, the user cannot log in, but it seems better to stop the empty password from entering the system in the first place. Secondly, it is fairly expensive to check that a SCRAM hash doesn't correspond to an empty string, because computing a SCRAM hash is much more expensive than an MD5 hash by design, so better avoid doing that on every authentication. We could clear the password on CREATE/ALTER ROLE also in stable branches, but we would still need to check at authentication time, because even if we prevent empty passwords from being stored in pg_authid, there might be existing ones there already. Reported by Jeroen van der Ham, Ben de Graaff and Jelte Fennema. Security: CVE-2017-7546	2017-08-07 17:03:49 +03:00
Andres Freund	32d7480e02	Fix thinko introduced in `2bef06d516` et al. The callers for GetOldestSafeDecodingTransactionId() all inverted the argument for the argument introduced in `2bef06d516`. Luckily this appears to be inconsequential for the moment, as we wait for concurrent in-progress transaction when assembling a snapshot. Additionally this could only make a difference when adding a second logical slot, because only a pre-existing slot could cause an issue by lowering the returned xid dangerously much. Reported-By: Antonin Houska Discussion: https://postgr.es/m/32704.1496993134@localhost Backport: 9.4-, where `2bef06d516` was backpatched to.	2017-08-06 14:21:19 -07:00
Tom Lane	b798ea88ac	Disallow SSL session tickets. We don't actually support session tickets, since we do not create an SSL session identifier. But it seems that OpenSSL will issue a session ticket on-demand anyway, which will then fail when used. This results in reconnection failures when using ticket-aware client-side SSL libraries (such as the Npgsql .NET driver), as reported by Shay Rojansky. To fix, just tell OpenSSL not to issue tickets. At some point in the far future, we might consider enabling tickets instead. But the security implications of that aren't entirely clear; and besides it would have little benefit except for very short-lived database connections, which is Something We're Bad At anyhow. It would take a lot of other work to get to a point where that would really be an exciting thing to do. While at it, also tell OpenSSL not to use a session cache. This doesn't really do anything, since a backend would never populate the cache anyway, but it might gain some micro-efficiencies and/or reduce security exposures. Patch by me, per discussion with Heikki Linnakangas and Shay Rojansky. Back-patch to all supported versions. Discussion: https://postgr.es/m/CADT4RqBU8N-csyZuzaook-c795dt22Zcwg1aHWB6tfVdAkodZA@mail.gmail.com	2017-08-04 11:07:10 -04:00
Peter Eisentraut	df04db0413	Add missing ALTER USER variants ALTER USER ... SET did not support all the syntax variants of ALTER ROLE ... SET. Reported-by: Pavel Golub <pavel@microolap.com>	2017-08-03 20:49:07 -04:00
Tom Lane	3d76328298	Fix pg_dump/pg_restore to emit REFRESH MATERIALIZED VIEW commands last. Because we push all ACL (i.e. GRANT/REVOKE) restore steps to the end, materialized view refreshes were occurring while the permissions on referenced objects were still at defaults. This led to failures if, say, an MV owned by user A reads from a table owned by user B, even if B had granted the necessary privileges to A. We've had multiple complaints about that type of restore failure, most recently from Jordan Gigov. The ideal fix for this would be to start treating ACLs as dependency- sortable objects, rather than hard-wiring anything about their dump order (the existing approach is a messy kluge dating to commit `dc0e76ca3`). But that's going to be a rather major change, and it certainly wouldn't lead to a back-patchable fix. As a short-term solution, convert the existing two-pass hack (ie, normal objects then ACLs) to a three-pass hack, ie, normal objects then ACLs then matview refreshes. Because this happens in RestoreArchive(), it will also fix the problem when restoring from an existing archive-format dump. (Note this means that if a matview refresh would have failed under the permissions prevailing at dump time, it'll fail during restore as well. We'll define that as user error rather than something we should try to work around.) To avoid performance loss in parallel restore, we need the matview refreshes to still be parallelizable. Hence, clean things up enough so that both ACLs and matviews are handled by the parallel restore infrastructure, instead of reverting back to serial restore for ACLs. There is still a final serial step, but it shouldn't normally have to do anything; it's only there to try to recover if we get stuck due to some problem like unresolved circular dependencies. Patch by me, but it owes something to an earlier attempt by Kevin Grittner. Back-patch to 9.3 where materialized views were introduced. Discussion: https://postgr.es/m/28572.1500912583@sss.pgh.pa.us	2017-08-03 17:36:41 -04:00
Alvaro Herrera	611840074a	Fix build on zlib-less environments Commit `4d57e83816` added support for getting I/O errors out of zlib, but it introduced a portability problem for systems without zlib. Repair by wrapping the zlib call inside #ifdef and restore the original code in the other branch. This serves to illustrate the inadequacy of the zlib abstraction in pg_backup_archiver: there is no way to call gzerror() in that abstraction. This means that the several places that call GZREAD and GZWRITE are currently doing error reporting wrongly, but ENOTIME to get it fixed before next week's release set. Backpatch to 9.4, like the commit that introduced the problem.	2017-08-03 14:55:18 -04:00
Robert Haas	1f220c3907	Allow a foreign table CHECK constraint to be initially NOT VALID. For a table, the constraint can be considered validated immediately, because the table must be empty. But for a foreign table this is not necessarily the case. Fixes a bug in commit `f27a6b15e6`. Amit Langote, with some changes by me. Discussion: http://postgr.es/m/d2b7419f-4a71-cf86-cc99-bfd0f359a1ea@lab.ntt.co.jp	2017-08-03 13:25:32 -04:00
Alvaro Herrera	060393f2a1	Fix pg_dump's errno checking for zlib I/O Some error reports were reporting strerror(errno), which for some error conditions coming from zlib are wrong, resulting in confusing reports such as pg_restore: [compress_io] could not read from input file: Success which makes no sense. To correctly extract the error message we need to use gzerror(), so let's do that. This isn't as comprehensive or as neat as I would like, but at least it should improve things in many common cases. The zlib abstraction in compress_io does not seem to be applied consistently enough; we could perhaps improve that, but it seems master-only material, not a bug fix for back-patching. This problem goes back all the way, but I decided to apply back to 9.4 only, because older branches don't contain commit `14ea89366` which this change depends on. Authors: Vladimir Kunschikov, Álvaro Herrera Discussion: https://postgr.es/m/1498120508308.9826@infotecs.ru	2017-08-02 18:26:58 -04:00
Tom Lane	cf9da98600	Remove broken and useless entry-count printing in HASH_DEBUG code. init_htab(), with #define HASH_DEBUG, prints a bunch of hashtable parameters. It used to also print nentries, but commit `44ca4022f` changed that to "hash_get_num_entries(hctl)", which is wrong (the parameter should be "hashp"). Rather than correct the coding, though, let's just remove that field from the printout. The table must be empty, since we just finished building it, so expensively calculating the number of entries is rather pointless. Moreover hash_get_num_entries makes assumptions (about not needing locks) which we could do without in debugging code. Noted by Choi Doo-Won in bug #14764. Back-patch to 9.6 where the faulty code was introduced. Discussion: https://postgr.es/m/20170802032353.8424.12274@wrigleys.postgresql.org	2017-08-02 12:16:57 -04:00
Tatsuo Ishii	bd8ff38760	Fix comment. XLByteToSeg and XLByteToPrevSeg calculate only a segment number. The definition of these macros were modified by commit `dfda6ebaec` but the comment remain unchanged. Patch by Yugo Nagata. Back patched to 9.3 and beyond.	2017-08-01 08:08:32 +09:00
Tom Lane	1e58c503ec	PL/Perl portability fix: absorb relevant -D switches from Perl. Back-patch of commit `3c163a7fc7`, which see for more info. Also throw in commit `b4cc35fbb7`, so Coverity doesn't whine about the back branches. Ashutosh Sharma, some adjustments by me Discussion: https://postgr.es/m/CANFyU97OVQ3+Mzfmt3MhuUm5NwPU=-FtbNH5Eb7nZL9ua8=rcA@mail.gmail.com	2017-07-31 12:38:35 -04:00
Tom Lane	30a5c8bfbd	PL/Perl portability fix: avoid including XSUB.h in plperl.c. Back-patch of commit `bebe174bb4`, which see for more info. Patch by me, with some help from Ashutosh Sharma Discussion: https://postgr.es/m/CANFyU97OVQ3+Mzfmt3MhuUm5NwPU=-FtbNH5Eb7nZL9ua8=rcA@mail.gmail.com	2017-07-31 12:10:36 -04:00
Stephen Frost	d38e706ff1	Fix function comment for dumpACL() The comment for dumpACL() got neglected when initacls and initracls were added and the discussion of what 'racls' is wasn't very clear either. Per complaint from Tom.	2017-07-31 10:37:12 -04:00
Tatsuo Ishii	bc37d2460f	Add missing comment in postgresql.conf. current_source requires to restart server to reflect the new value. Per Yugo Nagata and Masahiko Sawada. Back patched to 9.2 and beyond.	2017-07-31 11:27:58 +09:00
Tatsuo Ishii	bb19bcd426	Add missing comment in postgresql.conf. dynamic_shared_memory_type requires to restart server to reflect the new value. Per Yugo Nagata and Masahiko Sawada. Back pached to 9.4 and beyond.	2017-07-31 11:10:18 +09:00
Tom Lane	157adfdf4a	Fix psql tab completion for CREATE USER MAPPING. After typing CREATE USER M..., it would not fill in MAPPING FOR, even though that was clearly intended behavior. Jeff Janes Discussion: https://postgr.es/m/CAMkU=1wo2iQ6jWnN=egqOb5NxEPn0PpANEtKHr3uPooQ+nYPtw@mail.gmail.com	2017-07-27 14:13:15 -04:00
Tom Lane	a2fc3431c3	Clean up SQL emitted by psql/describe.c. Fix assorted places that had not bothered with the convention of prefixing catalog and function names with "pg_catalog.". That could possibly result in query failure when running with a nondefault search_path. Also fix two places that weren't quoting OID literals. I think the latter hasn't mattered much since about 7.3, but it's still a bad idea to be doing it in 99 places and not in 2 others. Also remove a useless EXISTS sub-select that someone had stuck into describeOneTableDetails' queries for child tables. We just got the OID out of pg_class, so I hardly see how checking that it exists in pg_class was doing anything helpful. In passing, try to improve the emitted formatting of a couple of these queries, though I didn't work really hard on that. And merge unnecessarily duplicative coding in some other places. Much of this was new in HEAD, but some was quite old; back-patch as appropriate.	2017-07-26 19:36:19 -04:00
Alvaro Herrera	8c348765f9	Fix concurrent locking of tuple update chain If several sessions are concurrently locking a tuple update chain with nonconflicting lock modes using an old snapshot, and they all succeed, it may happen that some of them fail because of restarting the loop (due to a concurrent Xmax change) and getting an error in the subsequent pass while trying to obtain a tuple lock that they already have in some tuple version. This can only happen with very high concurrency (where a row is being both updated and FK-checked by multiple transactions concurrently), but it's been observed in the field and can have unpleasant consequences such as an FK check failing to see a tuple that definitely exists: ERROR: insert or update on table "child_table" violates foreign key constraint "fk_constraint_name" DETAIL: Key (keyid)=(123456) is not present in table "parent_table". (where the key is observably present in the table). Discussion: https://postgr.es/m/20170714210011.r25mrff4nxjhmf3g@alvherre.pgsql	2017-07-26 17:24:16 -04:00
Alvaro Herrera	3b7bbee7b6	Make PostgresNode easily subclassable This module becomes much more useful if we allow it to be used as base class for external projects. To achieve this, change the exported get_new_node function into a class method instead, and use the standard Perl idiom of accepting the class as first argument. This method works as expected for subclasses. The standalone function is kept for backwards compatibility, though it could be removed in pg11. Author: Chap Flackman, based on an earlier patch from Craig Ringer Discussion: https://postgr.es/m/CAMsr+YF8kO+4+K-_U4PtN==2FndJ+5Bn6A19XHhMiBykEwv0wA@mail.gmail.com	2017-07-25 18:50:53 -04:00
Tom Lane	51865a0a01	Fix race condition in predicate-lock init code in EXEC_BACKEND builds. Trading a little too heavily on letting the code path be the same whether we were creating shared data structures or only attaching to them, InitPredicateLocks() inserted the "scratch" PredicateLockTargetHash entry unconditionally. This is just wrong if we're in a postmaster child, which would only reach this code in EXEC_BACKEND builds. Most of the time, the hash_search(HASH_ENTER) call would simply report that the entry already existed, causing no visible effect since the code did not bother to check for that possibility. However, if this happened while some other backend had transiently removed the "scratch" entry, then that other backend's eventual RestoreScratchTarget would suffer an assert failure; this appears to be the explanation for a recent failure on buildfarm member culicidae. In non-assert builds, there would be no visible consequences there either. But nonetheless this is a pretty bad bug for EXEC_BACKEND builds, for two reasons: 1. Each new backend would perform the hash_search(HASH_ENTER) call without holding any lock that would prevent concurrent access to the PredicateLockTargetHash hash table. This creates a low but certainly nonzero risk of corruption of that hash table. 2. In the event that the race condition occurred, by reinserting the scratch entry too soon, we were defeating the entire purpose of the scratch entry, namely to guarantee that transaction commit could move hash table entries around with no risk of out-of-memory failure. The odds of an actual OOM failure are quite low, but not zero, and if it did happen it would again result in corruption of the hash table. The user-visible symptoms of such corruption are a little hard to predict, but would presumably amount to misbehavior of SERIALIZABLE transactions that'd require a crash or postmaster restart to fix. To fix, just skip the hash insertion if IsUnderPostmaster. I also inserted a bunch of assertions that the expected things happen depending on whether IsUnderPostmaster is true. That might be overkill, since most comparable code in other functions isn't quite that paranoid, but once burnt twice shy. In passing, also move a couple of lines to places where they seemed to make more sense. Diagnosis of problem by Thomas Munro, patch by me. Back-patch to all supported branches. Discussion: https://postgr.es/m/10593.1500670709@sss.pgh.pa.us	2017-07-24 16:46:00 -04:00
Robert Haas	971faefc24	When WCOs are present, disable direct foreign table modification. If the user modifies a view that has CHECK OPTIONs and this gets translated into a modification to an underlying relation which happens to be a foreign table, the check options should be enforced. In the normal code path, that was happening properly, but it was not working properly for "direct" modification because the whole operation gets pushed to the remote side in that case and we never have an option to enforce the constraint against individual tuples. Fix by disabling direct modification when there is a need to enforce CHECK OPTIONs. Etsuro Fujita, reviewed by Kyotaro Horiguchi and by me. Discussion: http://postgr.es/m/f8a48f54-6f02-9c8a-5250-9791603171ee@lab.ntt.co.jp	2017-07-24 16:24:42 -04:00
Tom Lane	3a07ba1285	Ensure that pg_get_ruledef()'s output matches pg_get_viewdef()'s. Various cases involving renaming of view columns are handled by having make_viewdef pass down the view's current relation tupledesc to get_query_def, which then takes care to use the column names from the tupledesc for the output column names of the SELECT. For some reason though, we'd missed teaching make_ruledef to do similarly when it is printing an ON SELECT rule, even though this is exactly the same case. The results from pg_get_ruledef would then be different and arguably wrong. In particular, this breaks pre-v10 versions of pg_dump, which in some situations would define views by means of emitting a CREATE RULE ... ON SELECT command. Third-party tools might not be happy either. In passing, clean up some crufty code in make_viewdef; we'd apparently modernized the equivalent code in make_ruledef somewhere along the way, and missed this copy. Per report from Gilles Darold. Back-patch to all supported versions. Discussion: https://postgr.es/m/ec05659a-40ff-4510-fc45-ca9d965d0838@dalibo.com	2017-07-24 15:16:31 -04:00
Noah Misch	bcc2c3b457	MSVC: Accept tcl86.lib in addition to tcl86t.lib. ActiveTcl8.6.4.1.299124-win32-x86_64-threaded.exe ships just tcl86.lib. Back-patch to 9.2, like the commit recognizing tcl86t.lib.	2017-07-23 23:53:37 -07:00
Tom Lane	82ebda7fff	Fix pg_dump's handling of event triggers. pg_dump with the --clean option failed to emit DROP EVENT TRIGGER commands for event triggers. In a closely related oversight, it also did not emit ALTER OWNER commands for event triggers. Since only superusers can create event triggers, the latter oversight is of little practical consequence ... but if we're going to record an owner for event triggers, then surely pg_dump should preserve it. Per complaint from Greg Atkins. Back-patch to 9.3 where event triggers were introduced. Discussion: https://postgr.es/m/20170722191142.yi4e7tzcg3iacclg@gmail.com	2017-07-22 20:20:09 -04:00
Robert Haas	73fbf3d3d0	pg_rewind: Fix some problems when copying files >2GB. When incrementally updating a file larger than 2GB, the old code could either fail outright (if the client asked the server for bytes beyond the 2GB boundary) or fail to copy all the blocks that had actually been modified (if the server reported a file size to the client in excess of 2GB), resulting in data corruption. Generally, such files won't occur anyway, but they might if using a non-default segment size or if there the directory contains stray files unrelated to PostgreSQL. Fix by a more prudent choice of data types. Even with these improvements, this code still uses a mix of different types (off_t, size_t, uint64, int64) to represent file sizes and offsets, not all of which necessarily have the same width or signedness, so further cleanup might be in order here. However, at least now they all have the potential to be 64 bits wide on 64-bit platforms. Kuntal Ghosh and Michael Paquier, with a tweak by me. Discussion: http://postgr.es/m/CAGz5QC+8gbkz=Brp0TgoKNqHWTzonbPtPex80U0O6Uh_bevbaA@mail.gmail.com	2017-07-21 22:04:55 -04:00
Alvaro Herrera	afd56b8521	Fix typo in comment Commit `fd31cd2651` renamed the variable to skipping_blocks, but forgot to update this comment. Noticed while inspecting code.	2017-07-21 20:07:21 -04:00
Robert Haas	0fe21602e2	pg_rewind: Fix busted sanity check. As written, the code would only fail the sanity check if none of the columns returned by the server were of the expected type, but we want it to fail if even one column is not of the expected type. Discussion: http://postgr.es/m/CA+TgmoYuY5zW7JEs+1hSS1D=V5K8h1SQuESrq=bMNeo0B71Sfw@mail.gmail.com	2017-07-21 13:04:04 -04:00
Teodor Sigaev	38a4a5349c	Fix double shared memory allocation. SLRU buffer lwlocks are allocated twice by oversight in commit `fe702a7b3f` where that locks were moved to separate tranche. The bug doesn't have user-visible effects except small overspending of shared memory. Backpatch to 9.6 where it was introduced. Alexander Korotkov with small editorization by me.	2017-07-21 13:31:49 +03:00
Tom Lane	41ada83774	Fix dumping of outer joins with empty qual lists. Normally, a JoinExpr would have empty "quals" only if it came from CROSS JOIN syntax. However, it's possible to get to this state by specifying NATURAL JOIN between two tables with no common column names, and there might be other ways too. The code previously printed no ON clause if "quals" was empty; that's right for CROSS JOIN but syntactically invalid if it's some type of outer join. Fix by printing ON TRUE in that case. This got broken by commit `2ffa740be`, which stopped using NATURAL JOIN syntax in ruleutils output due to its brittleness in the face of column renamings. Back-patch to 9.3 where that commit appeared. Per report from Tushar Ahuja. Discussion: https://postgr.es/m/98b283cd-6dda-5d3f-f8ac-87db8c76a3da@enterprisedb.com	2017-07-20 11:29:36 -04:00
Tom Lane	d46403d2f3	Merge large_object.sql test into largeobject.source. It seems pretty confusing to have tests named both largeobject and large_object. The latter is of very recent vintage (commit `ff992c074`), so get rid of it in favor of merging into the former. Also, enable the LO comment test that was added by commit `70ad7ed4e`, since the later commit added the then-missing pg_upgrade functionality. The large_object.sql test case is almost completely redundant with that, but not quite: it seems like creating a user-defined LO with an OID in the system range might be an interesting case for pg_upgrade, so let's keep it. Like the earlier patch, back-patch to all supported branches. Discussion: https://postgr.es/m/18665.1500306372@sss.pgh.pa.us	2017-07-17 15:28:16 -04:00
Andrew Dunstan	220a9b5e55	fix typo	2017-07-16 12:00:56 -04:00
Andrew Dunstan	b4a1d69ed4	Fix vcregress.pl PROVE_FLAGS bug in commit `93b7d9731f` This change didn't adjust the publicly visible taptest function, causing buildfarm failures on bowerbird. Backpatch to 9.4 like previous change.	2017-07-16 11:27:00 -04:00
Tom Lane	4e763fb6f6	Fix broken link-command-line ordering for libpgfeutils. In the frontend Makefiles that pull in libpgfeutils, we'd generally done it like this: LDFLAGS += -L$(top_builddir)/src/fe_utils -lpgfeutils $(libpq_pgport) That method is badly broken, as seen in bug #14742 from Chris Ruprecht. The -L flag for src/fe_utils ends up being placed after whatever random -L flags are in LDFLAGS already. That puts us at risk of pulling in libpgfeutils.a from some previous installation rather than the freshly built one in src/fe_utils. Also, the lack of an "override" is hazardous if someone tries to specify some LDFLAGS on the make command line. The correct way to do it is like this: override LDFLAGS := -L$(top_builddir)/src/fe_utils -lpgfeutils $(libpq_pgport) $(LDFLAGS) so that libpgfeutils, along with libpq, libpgport, and libpgcommon, are guaranteed to be pulled in from the build tree and not from any referenced system directory, because their -L flags will appear first. In some places we'd been even lazier and done it like this: LDFLAGS += -L$(top_builddir)/src/fe_utils -lpgfeutils -lpq which is subtly wrong in an additional way: on platforms where we can't restrict the symbols exported by libpq.so, it allows libpgfeutils to latch onto libpgport and libpgcommon symbols from libpq.so, rather than directly from those static libraries as intended. This carries hazards like those explained in the comments for the libpq_pgport macro. In addition to fixing the broken libpgfeutils usages, I tried to standardize on using $(libpq_pgport) like so: override LDFLAGS := $(libpq_pgport) $(LDFLAGS) even where libpgfeutils is not in the picture. This makes no difference right now but will hopefully discourage future mistakes of the same ilk. And it's more like the way we handle CPPFLAGS in libpq-using Makefiles. In passing, just for consistency, make pgbench include PTHREAD_LIBS the same way everyplace else does, ie just after LIBS rather than in some random place in the command line. This might have practical effect if there are -L switches in that macro on some platform. It looks to me like the MSVC build scripts are not affected by this error, but someone more familiar with them than I might want to double check. Back-patch to 9.6 where libpgfeutils was introduced. In 9.6, the hazard this error creates is that a reinstallation might link to the prior installation's copy of libpgfeutils.a and thereby fail to absorb a minor-version bug fix. Discussion: https://postgr.es/m/20170714125106.9231.13772@wrigleys.postgresql.org	2017-07-14 12:26:53 -04:00
Heikki Linnakangas	cedd25ae88	Fix pg_basebackup output to stdout on Windows. When writing a backup to stdout with pg_basebackup on Windows, put stdout to binary mode. Any CR bytes in the output will otherwise be output incorrectly as CR+LF. In the passing, standardize on using "_setmode" instead of "setmode", for the sake of consistency. They both do the same thing, but according to MSDN documentation, setmode is deprecated. Fixes bug #14634, reported by Henry Boehlert. Patch by Haribabu Kommi. Backpatch to all supported versions. Discussion: https://www.postgresql.org/message-id/20170428082818.24366.13134@wrigleys.postgresql.org	2017-07-14 16:03:05 +03:00
Tom Lane	3b0c2dbed0	Fix dumping of FUNCTION RTEs that contain non-function-call expressions. The grammar will only accept something syntactically similar to a function call in a function-in-FROM expression. However, there are various ways to input something that ruleutils.c won't deparse that way, potentially leading to a view or rule that fails dump/reload. Fix by inserting a dummy CAST around anything that isn't going to deparse as a function (which is one of the ways to get something like that in there in the first place). In HEAD, also make use of the infrastructure added by this to avoid emitting unnecessary parentheses in CREATE INDEX deparsing. I did not change that in back branches, thinking that people might find it to be unexpected/unnecessary behavioral change. In HEAD, also fix incorrect logic for when to add extra parens to partition key expressions. Somebody apparently thought they could get away with simpler logic than pg_get_indexdef_worker has, but they were wrong --- a counterexample is PARTITION BY LIST ((a[1])). Ignoring the prettyprint flag for partition expressions isn't exactly a nice solution anyway. This has been broken all along, so back-patch to all supported branches. Discussion: https://postgr.es/m/10477.1499970459@sss.pgh.pa.us	2017-07-13 19:24:44 -04:00
Heikki Linnakangas	cb02cbc9d7	Fix race between GetNewTransactionId and GetOldestActiveTransactionId. The race condition goes like this: 1. GetNewTransactionId advances nextXid e.g. from 100 to 101 2. GetOldestActiveTransactionId reads the new nextXid, 101 3. GetOldestActiveTransactionId loops through the proc array. There are no active XIDs there, so it returns 101 as the oldest active XID. 4. GetNewTransactionid stores XID 100 to MyPgXact->xid So, GetOldestActiveTransactionId returned XID 101, even though 100 only just started and is surely still running. This would be hard to hit in practice, and even harder to spot any ill effect if it happens. GetOldestActiveTransactionId is only used when creating a checkpoint in a master server, and the race condition can only happen on an online checkpoint, as there are no backends running during a shutdown checkpoint. The oldestActiveXid value of an online checkpoint is only used when starting up a hot standby server, to determine the starting point where pg_subtrans is initialized from. For the race condition to happen, there must be no other XIDs in the proc array that would hold back the oldest-active XID value, which means that the missed XID must be a top transaction's XID. However, pg_subtrans is not used for top XIDs, so I believe an off-by-one error is in fact inconsequential. Nevertheless, let's fix it, as it's clearly wrong and the fix is simple. This has been wrong ever since hot standby was introduced, so backport to all supported versions. Discussion: https://www.postgresql.org/message-id/e7258662-82b6-7a45-56d4-99b337a32bf7@iki.fi	2017-07-13 15:47:49 +03:00
Tom Lane	ff2d537223	Fix ruleutils.c for domain-over-array cases, too. Further investigation shows that ruleutils isn't quite up to speed either for cases where we have a domain-over-array: it needs to be prepared to look past a CoerceToDomain at the top level of field and element assignments, else it decompiles them incorrectly. Potentially this would result in failure to dump/reload a rule, if it looked like the one in the new test case. (I also added a test for EXPLAIN; that output isn't broken, but clearly we need more test coverage here.) Like commit `b1cb32fb6`, this bug is reachable in cases we already support, so back-patch all the way.	2017-07-12 18:00:04 -04:00
Heikki Linnakangas	bbeec3c749	Reduce memory usage of tsvector type analyze function. compute_tsvector_stats() detoasted and kept in memory every tsvector value in the sample, but that can be a lot of memory. The original bug report described a case using over 10 gigabytes, with statistics target of 10000 (the maximum). To fix, allocate a separate copy of just the lexemes that we keep around, and free the detoasted tsvector values as we go. This adds some palloc/pfree overhead, when you have a lot of distinct lexemes in the sample, but it's better than running out of memory. Fixes bug #14654 reported by James C. Reviewed by Tom Lane. Backport to all supported versions. Discussion: https://www.postgresql.org/message-id/20170514200602.1451.46797@wrigleys.postgresql.org	2017-07-12 22:06:10 +03:00
Alvaro Herrera	832d3dce5a	commit_ts test: Set node name in test Otherwise, the script output has a lot of pointless warnings. This was forgotten in `9def031bd2`	2017-07-12 14:39:44 -04:00
Tom Lane	09c5988981	Avoid integer overflow while sifting-up a heap in tuplesort.c. If the number of tuples in the heap exceeds approximately INT_MAX/2, this loop's calculation "2*i+1" could overflow, resulting in a crash. Fix it by using unsigned int rather than int for the relevant local variables; that shouldn't cost anything extra on any popular hardware. Per bug #14722 from Sergey Koposov. Original patch by Sergey Koposov, modified by me per a suggestion from Heikki Linnakangas to use unsigned int not int64. Back-patch to 9.4, where tuplesort.c grew the ability to sort as many as INT_MAX tuples in-memory (commit `263865a48`). Discussion: https://postgr.es/m/20170629161637.1478.93109@wrigleys.postgresql.org	2017-07-12 13:24:16 -04:00
Heikki Linnakangas	0a2d07cf1e	Fix variable and type name in comment. Kyotaro Horiguchi Discussion: https://www.postgresql.org/message-id/20170711.163441.241981736.horiguchi.kyotaro@lab.ntt.co.jp	2017-07-12 17:09:04 +03:00
Heikki Linnakangas	941188a5ff	Fix ordering of operations in SyncRepWakeQueue to avoid assertion failure. Commit `14e8803f1` removed the locking in SyncRepWaitForLSN, but that introduced a race condition, where SyncRepWaitForLSN might see syncRepState already set to SYNC_REP_WAIT_COMPLETE, but the process was not yet removed from the queue. That tripped the assertion, that the process should no longer be in the uqeue. Reorder the operations in SyncRepWakeQueue to remove the process from the queue first, and update syncRepState only after that, and add a memory barrier in between to make sure the operations are made visible to other processes in that order. Fixes bug #14721 reported by Const Zhang. Analysis and fix by Thomas Munro. Backpatch down to 9.5, where the locking was removed. Discussion: https://www.postgresql.org/message-id/20170629023623.1480.26508%40wrigleys.postgresql.org	2017-07-12 15:35:31 +03:00
Heikki Linnakangas	f1edf84580	Remove unnecessary braces, to match the surrounding style. Mostly in the new subscription-related commands. Backport the few that were also present in older versions. Thomas Munro Discussion: https://www.postgresql.org/message-id/CAEepm=3CyW1QmXcXJXmqiJXtXzFDc8SvSfnxkEGD3Bkv2SrkeQ@mail.gmail.com	2017-07-12 12:31:14 +03:00
Tom Lane	1233680611	Fix multiple assignments to a column of a domain type. We allow INSERT and UPDATE commands to assign to the same column more than once, as long as the assignments are to subfields or elements rather than the whole column. However, this failed when the target column was a domain over array rather than plain array. Fix by teaching process_matched_tle() to look through CoerceToDomain nodes, and add relevant test cases. Also add a group of test cases exercising domains over array of composite. It's doubtless accidental that CREATE DOMAIN allows this case while not allowing straight domain over composite; but it does, so we'd better make sure we don't break it. (I could not find any documentation mentioning either side of that, so no doc changes.) It's been like this for a long time, so back-patch to all supported branches. Discussion: https://postgr.es/m/4206.1499798337@sss.pgh.pa.us	2017-07-11 16:48:59 -04:00
Tom Lane	c0077f7383	On Windows, retry process creation if we fail to reserve shared memory. We've heard occasional reports of backend launch failing because pgwin32_ReserveSharedMemoryRegion() fails, indicating that something has already used that address space in the child process. It's not very clear what, given that we disable ASLR in Windows builds, but suspicion falls on antivirus products. It'd be better if we didn't have to disable ASLR, anyway. So let's try to ameliorate the problem by retrying the process launch after such a failure, up to 100 times. Patch by me, based on previous work by Amit Kapila and others. This is a longstanding issue, so back-patch to all supported branches. Discussion: https://postgr.es/m/CAA4eK1+R6hSx6t_yvwtx+NRzneVp+MRqXAdGJZChcau8Uij-8g@mail.gmail.com	2017-07-10 11:00:09 -04:00
Alvaro Herrera	d61b39c7e2	Fix typo Noticed while reviewing code.	2017-07-07 17:10:36 -04:00
Teodor Sigaev	42f62e4c97	Fix potential data corruption during freeze Fix oversight in `3b97e6823b` bug fix. Bitwise AND is used instead of OR and it cleans all bits in t_infomask heap tuple field. Backpatch to 9.3	2017-07-06 17:19:44 +03:00
Heikki Linnakangas	f73382877e	Treat clean shutdown of an SSL connection same as the non-SSL case. If the client closes an SSL connection, treat it the same as EOF on a non-SSL connection. In particular, don't write a message in the log about that. Michael Paquier. Discussion: https://www.postgresql.org/message-id/CAB7nPqSfyVV42Q2acFo%3DvrvF2gxoZAMJLAPq3S3KkjhZAYi7aw@mail.gmail.com	2017-07-03 14:53:01 +03:00
Tom Lane	e9d4aa594f	Fix walsender to exit promptly if client requests shutdown. It's possible for WalSndWaitForWal to be asked to wait for WAL that doesn't exist yet. That's fine, in fact it's the normal situation if we're caught up; but when the client requests shutdown we should not keep waiting. The previous coding could wait indefinitely if the source server was idle. In passing, improve the rather weak comments in this area, and slightly rearrange some related code for better readability. Back-patch to 9.4 where this code was introduced. Discussion: https://postgr.es/m/14154.1498781234@sss.pgh.pa.us	2017-06-30 12:00:03 -04:00
Tom Lane	43c67e32fb	Second try at fixing tcp_keepalives_idle option on Solaris. Buildfarm evidence shows that TCP_KEEPALIVE_THRESHOLD doesn't exist after all on Solaris < 11. This means we need to take positive action to prevent the TCP_KEEPALIVE code path from being taken on that platform. I've chosen to limit it with "&& defined(__darwin__)", since it's unclear that anyone else would follow Apple's precedent of spelling the symbol that way. Also, follow a suggestion from Michael Paquier of eliminating code duplication by defining a couple of intermediate symbols for the socket option. In passing, make some effort to reduce the number of translatable messages by replacing "setsockopt(foo) failed" with "setsockopt(%s) failed", etc, throughout the affected files. And update relevant documentation so that it doesn't claim to provide an exhaustive list of the possible socket option names. Like the previous commit (`f0256c774`), back-patch to all supported branches. Discussion: https://postgr.es/m/20170627163757.25161.528@wrigleys.postgresql.org	2017-06-28 12:30:16 -04:00
Stephen Frost	a2de017b30	Do not require 'public' to exist for pg_dump -c Commit `330b84d8c4` didn't contemplate the case where the public schema has been dropped and introduced a query which fails when there is no public schema into pg_dump (when used with -c). Adjust the query used by pg_dump to handle the case where the public schema doesn't exist and add tests to check that such a case no longer fails. Back-patch the specific fix to 9.6, as the prior commit was. Adding tests for this case involved adding support to the pg_dump TAP tests to work with multiple databases, which, while not a large change, is a bit much to back-patch, so that's only done in master. Addresses bug #14650 Discussion: https://www.postgresql.org/message-id/20170512181801.1795.47483%40wrigleys.postgresql.org	2017-06-28 10:34:01 -04:00
Tom Lane	55968ed894	Support tcp_keepalives_idle option on Solaris. Turns out that the socket option for this is named TCP_KEEPALIVE_THRESHOLD, at least according to the tcp(7P) man page for Solaris 11. (But since that text refers to "SunOS", it's likely pretty ancient.) It appears that the symbol TCP_KEEPALIVE does get defined on that platform, but it doesn't seem to represent a valid protocol-level socket option. This leads to bleats in the postmaster log, and no tcp_keepalives_idle functionality. Per bug #14720 from Andrey Lizenko, as well as an earlier report from Dhiraj Chawla that nobody had followed up on. The issue's been there since we added the TCP_KEEPALIVE code path in commit `5acd417c8`, so back-patch to all supported branches. Discussion: https://postgr.es/m/20170627163757.25161.528@wrigleys.postgresql.org	2017-06-27 18:47:57 -04:00
Tom Lane	3a7bd59c44	Re-allow SRFs and window functions within sub-selects within aggregates. check_agg_arguments_walker threw an error upon seeing a SRF or window function, but that is too aggressive: if the function is within a sub-select then it's perfectly fine. I broke the SRF case in commit `0436f6bde` by copying the logic for window functions ... but that was broken too, and had been since commit `eaccfded9`. Repair both cases in HEAD, and the window function case back to 9.3. 9.2 gets this right.	2017-06-27 17:51:11 -04:00
Tom Lane	df31a9fc66	Reduce wal_retrieve_retry_interval in applicable TAP tests. By default, wal_retrieve_retry_interval is five seconds, which is far more than is needed in any of our TAP tests, leaving the test cases just twiddling their thumbs for significant stretches. Moreover, because it's so large, we get basically no testing of the retry-before- master-is-ready code path. Hence, make PostgresNode::init set up wal_retrieve_retry_interval = '500ms' as part of its customization of test clusters' postgresql.conf. This shaves quite a few seconds off the runtime of the recovery TAP tests. Back-patch into 9.6. We have wal_retrieve_retry_interval in 9.5, but the test infrastructure isn't there. Discussion: https://postgr.es/m/31624.1498500416@sss.pgh.pa.us	2017-06-26 19:01:26 -04:00
Tom Lane	a4d1ce095b	Don't lose walreceiver start requests due to race condition in postmaster. When a walreceiver dies, the startup process will notice that and send a PMSIGNAL_START_WALRECEIVER signal to the postmaster, asking for a new walreceiver to be launched. There's a race condition, which at least in HEAD is very easy to hit, whereby the postmaster might see that signal before it processes the SIGCHLD from the walreceiver process. In that situation, sigusr1_handler() just dropped the start request on the floor, reasoning that it must be redundant. Eventually, after 10 seconds (WALRCV_STARTUP_TIMEOUT), the startup process would make a fresh request --- but that's a long time if the connection could have been re-established almost immediately. Fix it by setting a state flag inside the postmaster that we won't clear until we do launch a walreceiver. In cases where that results in an extra walreceiver launch, it's up to the walreceiver to realize it's unwanted and go away --- but we have, and need, that logic anyway for the opposite race case. I came across this through investigating unexpected delays in the src/test/recovery TAP tests: it manifests there in test cases where a master server is stopped and restarted while leaving streaming slaves active. This logic has been broken all along, so back-patch to all supported branches. Discussion: https://postgr.es/m/21344.1498494720@sss.pgh.pa.us	2017-06-26 17:31:56 -04:00
Tom Lane	f6af9c749d	Ignore old stats file timestamps when starting the stats collector. The stats collector disregards inquiry messages that bear a cutoff_time before when it last wrote the relevant stats file. That's fine, but at startup when it reads the "permanent" stats files, it absorbed their timestamps as if they were the times at which the corresponding temporary stats files had been written. In reality, of course, there's no data out there at all. This led to disregarding inquiry messages soon after startup if the postmaster had been shut down and restarted within less than PGSTAT_STAT_INTERVAL; which is a pretty common scenario, both for testing and in the field. Requesting backends would hang for 10 seconds and then report failure to read statistics, unless they got bailed out by some other backend coming along and making a newer request within that interval. I came across this through investigating unexpected delays in the src/test/recovery TAP tests: it manifests there because the autovacuum launcher hangs for 10 seconds when it can't get statistics at startup, thus preventing a second shutdown from occurring promptly. We might want to do some things in the autovac code to make it less prone to getting stuck that way, but this change is a good bug fix regardless. In passing, also fix pgstat_read_statsfiles() to ensure that it re-zeroes its global stats variables if they are corrupted by a short read from the stats file. (Other reads in that function go into temp variables, so that the issue doesn't arise.) This has been broken since we created the separation between permanent and temporary stats files in 8.4, so back-patch to all supported branches. Discussion: https://postgr.es/m/16860.1498442626@sss.pgh.pa.us	2017-06-26 16:17:05 -04:00
Tom Lane	9bfb1f2d64	Minor code review for parse_phrase_operator(). Fix its header comment, which described the old behavior of the <N> phrase distance operator; we missed updating that in commit `028350f61`. Also, reset errno before strtol() call, to defend against the possibility that it was already ERANGE at entry. (The lack of complaints says that it generally isn't, but this is at least a latent bug.) Very minor stylistic improvements as well. Victor Drobny noted the obsolete comment, I noted the errno issue. Back-patch to 9.6 where this code was added, just in case the errno issue is a live bug in some cases. Discussion: https://postgr.es/m/2b5382fdff9b1f79d5eb2c99c4d2cbe2@postgrespro.ru	2017-06-26 10:31:19 -04:00
Alvaro Herrera	4947fced73	Fix typo in comment Once upon a time, WAL pointers could be NULL, but no longer. We talk about "valid" now. Reported-by: Amit Langote Discussion: https://postgr.es/m/33e9617d-27f1-eee8-3311-e27af98eaf2b@lab.ntt.co.jp	2017-06-22 16:42:38 -04:00
Andres Freund	39e30cbc16	Fix possibility of creating a "phantom" segment after promotion. When promoting a standby just after a XLOG_SWITCH record was replayed, and next segment(s) are already are locally available (via walsender, restore_command + trigger/recovery target), that segment could accidentally be recycled onto the past of the new timeline. Later checkpointer would create a .ready file for it, assuming there was an error during creation, and it would get archived. That causes trouble if another standby is later brought up from a basebackup from before the timeline creation, because it would try to read the segment, because XLogFileReadAnyTLI just tries all possible timelines, which doesn't have valid contents. Thus replay would fail. The problem, if already occurred, can be fixed by removing the segment and/or having restore_command filter it out. The reason for the creation of such "phantom" segments was, that after an XLOG_SWITCH record the EndOfLog variable points to the beginning of the next segment, and RemoveXlogFile() used XLByteToPrevSeg(). Normally RemoveXlogFile() doing so is harmless, because the last segment will still exist preventing InstallXLogFileSegment() from causing harm, but just after promotion there's no previous segment on the new timeline. Fix that by using XLByteToSeg() instead of XLByteToPrevSeg(). Author: Andres Freund Reported-By: Greg Burek Discussion: https://postgr.es/m/20170619073026.zcwpe6mydsaz5ygd@alap3.anarazel.de Backpatch: 9.2-, bug older than all supported versions	2017-06-21 14:14:38 -07:00
Heikki Linnakangas	6fd9930e68	Fix typo in comment. Etsuro Fujita	2017-06-21 11:55:21 +03:00
Bruce Momjian	0efdbd323e	pg_upgrade: start/stop new server after pg_resetwal When commit `0f33a719fd` removed the instructions to start/stop the new cluster before running rsync, it was now possible for pg_resetwal/pg_resetxlog to leave the final WAL record at wal_level=minimum, preventing upgraded standby servers from reconnecting. This patch fixes that by having pg_upgrade unconditionally start/stop the new cluster after pg_resetwal/pg_resetxlog has run. Backpatch through 9.2 since, though the instructions were added in PG 9.5, they worked all the way back to 9.2. Discussion: https://postgr.es/m/20170620171844.GC24975@momjian.us Backpatch-through: 9.2	2017-06-20 13:20:02 -04:00
Tom Lane	1dce053649	Fix materialized-view documentation oversights. When materialized views were added, psql's \d commands were made to treat them as a separate object category ... but not everyplace in the documentation or comments got the memo. Noted by David Johnston. Back-patch to 9.3 where matviews came in. Discussion: https://postgr.es/m/CAKFQuwb27M3VXRhHErjCpkWwN9eKThbqWb1=trtoXi9_ejqPXQ@mail.gmail.com	2017-06-19 18:32:22 -04:00
Tom Lane	1f184426b7	Avoid regressions in foreign-key-based selectivity estimates. David Rowley found that the "use the smallest per-column selectivity" heuristic applied in some cases by get_foreign_key_join_selectivity() was badly off if the FK columns are independent, producing estimates much worse than we got before that code was added in 9.6. One case where that heuristic was used was for LEFT and FULL outer joins with the referenced rel on the outside of the join. But we should not really need to special-case those here. eqjoinsel() never has had such a special case; the correction is applied by calc_joinrel_size_estimate() instead. Let's just estimate such cases like inner joins and rely on that later adjustment. (I think there was something of a thinko here, in that the comments seem to be thinking about the selectivity as defined for semi/anti joins; but that shouldn't apply to left/full joins.) Add a regression test exercising such a case to show that this is sane in at least some cases. The other case where we used that heuristic was for SEMI/ANTI outer joins, either if the referenced rel was on the outside, or if it was on the inside but was part of a join within the RHS. In either case, the FK doesn't give us a lot of traction towards estimating the selectivity. To ensure that we don't have regressions from what happened before 9.6, let's punt by ignoring the FK in such cases and applying the traditional selectivity calculation. (We might be able to improve on that later, but for now I just want to be sure it's not worse than 9.5.) Report and patch by David Rowley, simplified a bit by me. Back-patch to 9.6 where this code was added. Discussion: https://postgr.es/m/CAKJS1f8NO8oCDcxrteohG6O72uU1saEVT9qX=R8pENr5QWerXw@mail.gmail.com	2017-06-19 15:33:41 -04:00
Tom Lane	3ef40dcec6	On Windows, make pg_dump use binary mode for compressed plain text output. The combination of -Z -Fp and output to stdout resulted in corrupted output data, because we left stdout in text mode, resulting in newline conversion being done on the compressed stream. Switch stdout to binary mode for this case, at the same place where we do it for non-text output formats. Report and patch by Kuntal Ghosh, tested by Ashutosh Sharma and Neha Sharma. Back-patch to all supported branches. Discussion: https://postgr.es/m/CAGz5QCJPvbBjXAmJuGx1B_41yVCetAJhp7rtaDf7XQGWuB1GSw@mail.gmail.com	2017-06-19 11:03:02 -04:00
Andres Freund	810344436d	Fix leaking of small spilled subtransactions during logical decoding. When, during logical decoding, a transaction gets too big, it's contents get spilled to disk. Not just the top-transaction gets spilled, but also all of its subtransactions, even if they're not that large themselves. Unfortunately we didn't clean up such small spilled subtransactions from disk. Fix that, by keeping better track of whether a transaction has been spilled to disk. Author: Andres Freund Reported-By: Dmitriy Sarafannikov, Fabrízio de Royes Mello Discussion: https://postgr.es/m/1457621358.355011041@f382.i.mail.ru https://postgr.es/m/CAFcNs+qNMhNYii4nxpO6gqsndiyxNDYV0S=JNq0v_sEE+9PHXg@mail.gmail.com Backpatch: 9.4-, where logical decoding was introduced	2017-06-18 19:13:15 -07:00
Heikki Linnakangas	a9a5eb32b3	Fix dependency, when changing a function's argument/return type. When a new base type is created using the old-style procedure of first creating the input/output functions with "opaque" in place of the base type, the "opaque" argument/return type is changed to the final base type, on CREATE TYPE. However, we did not create a pg_depend record when doing that, so the functions were left not depending on the type. Fixes bug #14706, reported by Karen Huddleston. Discussion: https://www.postgresql.org/message-id/20170614232259.1424.82774@wrigleys.postgresql.org	2017-06-16 11:39:45 +03:00
Tom Lane	b217459877	Fix low-probability leaks of PGresult objects in the backend. We had three occurrences of essentially the same coding pattern wherein we tried to retrieve a query result from a libpq connection without blocking. In the case where PQconsumeInput failed (typically indicating a lost connection), all three loops simply gave up and returned, forgetting to clear any previously-collected PGresult object. Since those are malloc'd not palloc'd, the oversight results in a process-lifespan memory leak. One instance, in libpqwalreceiver, is of little significance because the walreceiver process would just quit anyway if its connection fails. But we might as well fix it. The other two instances, in postgres_fdw, are somewhat more worrisome because at least in principle the scenario could be repeated, allowing the amount of memory leaked to build up to something worth worrying about. Moreover, in these cases the loops contain CHECK_FOR_INTERRUPTS calls, as well as other calls that could potentially elog(ERROR), providing another way to exit without having cleared the PGresult. Here we need to add PG_TRY logic similar to what exists in quite a few other places in postgres_fdw. Coverity noted the libpqwalreceiver bug; I found the other two cases by checking all calls of PQconsumeInput. Back-patch to all supported versions as appropriate (9.2 lacks postgres_fdw, so this is really quite unexciting for that branch). Discussion: https://postgr.es/m/22620.1497486981@sss.pgh.pa.us	2017-06-15 15:03:53 -04:00
Tom Lane	318fd99ce7	Assert that we don't invent relfilenodes or type OIDs in binary upgrade. During pg_upgrade's restore run, all relfilenode choices should be overridden by commands in the dump script. If we ever find ourselves choosing a relfilenode in the ordinary way, someone blew it. Likewise for pg_type OIDs. Since pg_upgrade might well succeed anyway, if there happens not to be a conflict during the regression test run, we need assertions here to keep us on the straight and narrow. We might someday be able to remove the assertion in GetNewRelFileNode, if pg_upgrade is rewritten to remove its assumption that old and new relfilenodes always match. But it's hard to see how to get rid of the pg_type OID constraint, since those OIDs are embedded in user tables in some cases. Back-patch as far as 9.5, because of the risk of back-patches breaking something here even if it works in HEAD. I'd prefer to go back further, but 9.4 fails both assertions due to get_rel_infos()'s use of a temporary table. We can't use the later-branch solution of a CTE for compatibility reasons (cf commit `5d16332e9`), and it doesn't seem worth inventing some other way to do the query. (I did check, by dint of changing the Asserts to elog(WARNING), that there are no other cases of unwanted OID assignments during 9.4's regression test run.) Discussion: https://postgr.es/m/19785.1497215827@sss.pgh.pa.us	2017-06-12 20:04:33 -04:00
Andrew Dunstan	3c017a545f	Take PROVE_FLAGS from the command line but not the environment This reverts commit `56b6ef893f` and instead makes vcregress.pl parse out PROVE_FLAGS from a command line argument when doing a TAP test, thus making it consistent with the makefile treatment. Discussion: https://postgr.es/m/c26a7416-2fb9-34ab-7991-618c922f896e%402ndquadrant.com Backpatch to 9.4 like previous patch.	2017-06-10 10:22:14 -04:00
Heikki Linnakangas	f44c609ea0	Clear auth context correctly when re-connecting after failed auth attempt. If authentication over an SSL connection fails, with sslmode=prefer, libpq will reconnect without SSL and retry. However, we did not clear the variables related to GSS, SSPI, and SASL authentication state, when reconnecting. Because of that, the second authentication attempt would always fail with a "duplicate GSS/SASL authentication request" error. pg_SSPI_startup did not check for duplicate authentication requests like the corresponding GSS and SASL functions, so with SSPI, you would leak some memory instead. Another way this could manifest itself, on version 10, is if you list multiple hostnames in the "host" parameter. If the first server requests Kerberos or SCRAM authentication, but it fails, the attempts to connect to the other servers will also fail with "duplicate authentication request" errors. To fix, move the clearing of authentication state from closePGconn to pgDropConnection, so that it is cleared also when re-connecting. Patch by Michael Paquier, with some kibitzing by me. Backpatch down to 9.3. 9.2 has the same bug, but the code around closing the connection is somewhat different, so that this patch doesn't apply. To fix this in 9.2, I think we would need to back-port commit `210eb9b743` first, and then apply this patch. However, given that we only bumped into this in our own testing, we haven't heard any reports from users about this, and that 9.2 will be end-of-lifed in a couple of months anyway, it doesn't seem worth the risk and trouble. Discussion: https://www.postgresql.org/message-id/CAB7nPqRuOUm0MyJaUy9L3eXYJU3AKCZ-0-03=-aDTZJGV4GyWw@mail.gmail.com	2017-06-07 14:04:54 +03:00
Andres Freund	b8bd32a51f	Unify SIGHUP handling between normal and walsender backends. Because walsender and normal backends share the same main loop it's problematic to have two different flag variables, set in signal handlers, indicating a pending configuration reload. Only certain walsender commands reach code paths checking for the variable (START_[LOGICAL_]REPLICATION, CREATE_REPLICATION_SLOT ... LOGICAL, notably not base backups). This is a bug present since the introduction of walsender, but has gotten worse in releases since then which allow walsender to do more. A later patch, not slated for v10, will similarly unify SIGHUP handling in other types of processes as well. Author: Petr Jelinek, Andres Freund Reviewed-By: Michael Paquier Discussion: https://postgr.es/m/20170423235941.qosiuoyqprq4nu7v@alap3.anarazel.de Backpatch: 9.2-, bug is present since 9.0	2017-06-05 19:18:16 -07:00
Andres Freund	862204aace	Prevent possibility of panics during shutdown checkpoint. When the checkpointer writes the shutdown checkpoint, it checks afterwards whether any WAL has been written since it started and throws a PANIC if so. At that point, only walsenders are still active, so one might think this could not happen, but walsenders can also generate WAL, for instance in BASE_BACKUP and logical decoding related commands (e.g. via hint bits). So they can trigger this panic if such a command is run while the shutdown checkpoint is being written. To fix this, divide the walsender shutdown into two phases. First, checkpointer, itself triggered by postmaster, sends a PROCSIG_WALSND_INIT_STOPPING signal to all walsenders. If the backend is idle or runs an SQL query this causes the backend to shutdown, if logical replication is in progress all existing WAL records are processed followed by a shutdown. Otherwise this causes the walsender to switch to the "stopping" state. In this state, the walsender will reject any further replication commands. The checkpointer begins the shutdown checkpoint once all walsenders are confirmed as stopping. When the shutdown checkpoint finishes, the postmaster sends us SIGUSR2. This instructs walsender to send any outstanding WAL, including the shutdown checkpoint record, wait for it to be replicated to the standby, and then exit. Author: Andres Freund, based on an earlier patch by Michael Paquier Reported-By: Fujii Masao, Andres Freund Reviewed-By: Michael Paquier Discussion: https://postgr.es/m/20170602002912.tqlwn4gymzlxpvs2@alap3.anarazel.de Backpatch: 9.4, where logical decoding was introduced	2017-06-05 19:18:16 -07:00
Andres Freund	b3d5b6833f	Have walsenders participate in procsignal infrastructure. The non-participation in procsignal was a problem for both changes in master, e.g. parallelism not working for normal statements run in walsender backends, and older branches, e.g. recovery conflicts and catchup interrupts not working for logical decoding walsenders. This commit thus replaces the previous WalSndXLogSendHandler with procsignal_sigusr1_handler. In branches since `db0f6cad48` that can lead to additional SetLatch calls, but that only rarely seems to make a difference. Author: Andres Freund Reviewed-By: Michael Paquier Discussion: https://postgr.es/m/20170421014030.fdzvvvbrz4nckrow@alap3.anarazel.de Backpatch: 9.4, earlier commits don't seem to benefit sufficiently	2017-06-05 19:18:16 -07:00
Andrew Dunstan	ec504aff74	Fix thinko in previous openssl change	2017-06-05 20:39:53 -04:00
Andres Freund	d3ca4b4b45	Fix record length computation in pg_waldump/xlogdump. The current method of computing the record length (excluding the lenght of full-page images) has been wrong since the WAL format has been revamped in `2c03216d83`. Only the main record's length was counted, but that can be significantly too little if there's data associated with further blocks. Fix by computing the record length as total_lenght - fpi_length. Reported-By: Chen Huajun Bug: #14687 Reviewed-By: Heikki Linnakangas Discussion: https://postgr.es/m/20170603165939.1436.58887@wrigleys.postgresql.org Backpatch: 9.5-	2017-06-05 16:10:07 -07:00
Andrew Dunstan	b64ff9c5af	Find openssl lib files in right directory for MSVC Some openssl builds put their lib files in a VC subdirectory, others do not. Cater for both cases. Backpatch to all live branches. From an offline discussion with Leonardo Cecchi.	2017-06-05 14:27:59 -04:00
Alvaro Herrera	55cd9a8ff6	Assorted translatable string fixes Mark our rusage reportage string translatable; remove quotes from type names; unify formatting of very similar messages.	2017-06-04 11:41:16 -04:00
Andres Freund	8a7cd781ee	Allow parallelism in COPY (query) TO ...; Previously this was not allowed, as copy.c didn't set the CURSOR_OPT_PARALLEL_OK flag when planning the query. Set it. While the lack of parallel query for COPY isn't strictly speaking a bug, it does prevent parallelism from being used in a facility commonly used to run long running queries. Thus backpatch to 9.6. Author: Andres Freund Discussion: https://postgr.es/m/20170531231958.ihanapplorptykzm@alap3.anarazel.de Backpatch: 9.6, where parallelism was introduced.	2017-06-02 19:11:23 -07:00
Tom Lane	8d9b4fe01b	Always use -fPIC, not -fpic, when building shared libraries with gcc. On some platforms, -fpic fails for sufficiently large shared libraries. We've mostly not hit that boundary yet, but there are some extensions such as Citus and pglogical where it's becoming a problem. A bit of research suggests that the penalty for -fPIC is small, in the single-digit-percentage range --- and there's none at all on popular platforms such as x86_64. So let's just default to -fPIC everywhere and provide one less thing for extension developers to worry about. Per complaint from Christoph Berg. Back-patch to all supported branches. (I did not bother to touch the recently-removed Makefiles for sco and unixware in the back branches, though. We'd have no way to test that it doesn't break anything on those platforms.) Discussion: https://postgr.es/m/20170529155850.qojdfrwkkqnjb3ap@msg.df7cb.de	2017-06-01 13:32:56 -04:00
Alvaro Herrera	1849b72169	Fix wording in amvalidate error messages Remove some gratuituous message differences by making the AM name previously embedded in each message be a %s instead. While at it, get rid of terminology that's unclear and unnecessary in one message. Discussion: https://postgr.es/m/20170523001557.bq2hbq7hxyvyw62q@alvherre.pgsql	2017-05-30 15:45:42 -04:00
Tom Lane	34782a348d	Try to ensure that stats collector's receive buffer size is at least 100KB. Back-patch of commit `8b0b6303e9`. Discussion: https://postgr.es/m/22173.1494788088@sss.pgh.pa.us	2017-05-29 20:27:45 -04:00
Tom Lane	98bff29074	Prevent running pg_resetwal/pg_resetxlog against wrong-version data dirs. pg_resetwal (formerly pg_resetxlog) doesn't insist on finding a matching version number in pg_control, and that seems like an important thing to preserve since recovering from corrupt pg_control is a prime reason to need to run it. However, that means you can try to run it against a data directory of a different major version, which is at best useless and at worst disastrous. So as to provide some protection against that type of pilot error, inspect PG_VERSION at startup and refuse to do anything if it doesn't match. PG_VERSION is read-only after initdb, so it's unlikely to get corrupted, and even if it were corrupted it would be easy to fix by hand. This hazard has been there all along, so back-patch to all supported branches. Michael Paquier, with some kibitzing by me Discussion: https://postgr.es/m/f4b8eb91-b934-8a0d-b3cc-68f06e2279d1@enterprisedb.com	2017-05-29 17:08:16 -04:00
Tom Lane	dd1daa03bc	Allow NumericOnly to be "+ FCONST". The NumericOnly grammar production accepted ICONST, + ICONST, - ICONST, FCONST, and - FCONST, but for some reason not + FCONST. This led to strange inconsistencies like regression=# set random_page_cost = +4; SET regression=# set random_page_cost = 4000000000; SET regression=# set random_page_cost = +4000000000; ERROR: syntax error at or near "4000000000" (because 4000000000 is too large to be an ICONST). While there's no actual functional reason to need to write a "+", if we allow it for integers it seems like we should allow it for numerics too. It's been like that forever, so back-patch to all supported branches. Discussion: https://postgr.es/m/30908.1496006184@sss.pgh.pa.us	2017-05-29 15:19:07 -04:00
Tom Lane	acab87ece1	Move autogenerated array types out of the way during ALTER ... RENAME. Commit `9aa3c782c` added code to allow CREATE TABLE/CREATE TYPE to not fail when the desired type name conflicts with an autogenerated array type, by dint of renaming the array type out of the way. But I (tgl) overlooked that the same case arises in ALTER TABLE/TYPE RENAME. Fix that too. Back-patch to all supported branches. Report and patch by Vik Fearing, modified a bit by me Discussion: https://postgr.es/m/0f4ade49-4f0b-a9a3-c120-7589f01d1eb8@2ndquadrant.com	2017-05-26 15:16:59 -04:00
Tom Lane	5886c7d589	Fix pg_dump to not emit invalid SQL for an empty operator class. If an operator class has no operators or functions, and doesn't need a STORAGE clause, we emitted "CREATE OPERATOR CLASS ... AS ;" which is syntactically invalid. Fix by forcing a STORAGE clause to be emitted anyway in this case. (At some point we might consider changing the grammar to allow CREATE OPERATOR CLASS without an opclass_item_list. But probably we'd want to omit the AS in that case, so that wouldn't fix this pg_dump issue anyway.) It's been like this all along, so back-patch to all supported branches. Daniel Gustafsson, tweaked by me to avoid a dangling-pointer bug Discussion: https://postgr.es/m/D9E5FC64-7A37-4F3D-B946-7E4FB468F88A@yesql.se	2017-05-26 12:51:05 -04:00
Tom Lane	8527132e52	Tighten checks for whitespace in functions that parse identifiers etc. This patch replaces isspace() calls with scanner_isspace() in functions that are likely to be presented with non-ASCII input. isspace() has the small advantage that it will correctly recognize no-break space in single-byte encodings (such as LATIN1); but it cannot work successfully for any multibyte character, and depending on platform it might return false positive results for some fragments of multibyte characters. That's disastrous for functions that are trying to discard whitespace between valid strings, as noted in bug #14662 from Justin Muise. Even treating no-break space as whitespace is pretty questionable for the usages touched here, because the core scanner would think it is an identifier character. Affected functions are parse_ident(), parseNameAndArgTypes (underlying regprocedurein() and siblings), SplitIdentifierString (used for parsing GUCs and options that are qualified names or lists of names), and SplitDirectoriesString (used for parsing GUCs that are lists of directories). All the functions adjusted here are parsing SQL identifiers and similar constructs, so it's reasonable to insist that their definition of whitespace match the core scanner. So we can hope that this won't cause many backwards-compatibility problems. I've left alone isspace() calls in places that aren't really expecting any non-ASCII input characters, such as float8in(). Back-patch to all supported branches. Discussion: https://postgr.es/m/10129.1495302480@sss.pgh.pa.us	2017-05-24 15:28:34 -04:00
Magnus Hagander	d8ba357db1	Update URLs in pgindent source and README Website and buildfarm is https, not http, and the ftp protocol will be shut down shortly.	2017-05-23 13:59:48 -04:00
Tom Lane	c101d83a3d	Fix precision and rounding issues in money multiplication and division. The cash_div_intX functions applied rint() to the result of the division. That's not merely useless (because the result is already an integer) but it causes precision loss for values larger than 2^52 or so, because of the forced conversion to float8. On the other hand, the cash_mul_fltX functions neglected to apply rint() to their multiplication results, thus possibly causing off-by-one outputs. Per C standard, arithmetic between any integral value and a float value is performed in float format. Thus, cash_mul_flt4 and cash_div_flt4 produced answers good to only about six digits, even when the float value is exact. We can improve matters noticeably by widening the float inputs to double. (It's tempting to consider using "long double" arithmetic if available, but that's probably too much of a stretch for a back-patched fix.) Also, document that cash_div_intX operators truncate rather than round. Per bug #14663 from Richard Pistole. Back-patch to all supported branches. Discussion: https://postgr.es/m/22403.1495223615@sss.pgh.pa.us	2017-05-21 13:05:17 -04:00
Heikki Linnakangas	dbf71771be	Fix typo in comment. Daniel Gustafsson	2017-05-18 10:33:46 +03:00
Tom Lane	bee9e86984	Make psql handle EOF during COPY FROM STDIN properly on all platforms. When stdin is a terminal, it's possible to end a COPY FROM STDIN with a keyboard EOF signal (typically control-D), and then keep on issuing SQL commands. One would expect another COPY FROM STDIN to work as well, but on some platforms it did not. This turns out to be because we were not resetting the stream's feof() flag, and BSD-ish versions of fread() and fgets() won't attempt to read more data if that's set. The misbehavior is observed on BSDen (including macOS), but not Linux, Windows, or SysV-ish Unixen, which makes this a portability bug not just a missing feature. Add a clearerr() call to fix the behavior, and improve the prompt that's issued when copying from a TTY to mention that EOF signals work. It's been like this forever, so back-patch to all supported branches. Thomas Munro Discussion: https://postgr.es/m/CAEepm=0MCGfYf=JAMiYhO6JPtv9-3ZfBo8fcGeCZ8oMzaw+Z+Q@mail.gmail.com	2017-05-17 12:24:19 -04:00
Peter Eisentraut	47b4913050	Fix new warnings from GCC 7 This addresses the new warning types -Wformat-truncation -Wformat-overflow that are part of -Wall, via -Wformat, in GCC 7.	2017-05-16 08:52:39 -04:00
Tom Lane	21c2c3d75e	In SSL tests, don't scribble on permissions of a repo file. Modifying the permissions of a persistent file isn't really much nicer than modifying its contents, even if git doesn't currently notice it. Adjust the test script to make a copy and set the permissions of that instead. Michael Paquier, per a gripe from me. Back-patch to 9.5 where these tests were introduced. Discussion: https://postgr.es/m/14836.1494885946@sss.pgh.pa.us	2017-05-15 23:27:51 -04:00
Tom Lane	b35cce914c	Fix unsafe reference into relcache in constructed CommentStmt. The CommentStmt made by RebuildConstraintComment() has to pstrdup the relation name, else it will contain a dangling pointer after that relcache entry is flushed. (I'm less sure that pstrdup'ing conname is necessary, but let's be safe.) Failure to do this leads to weird errors or crashes, as reported by Marko Elezovic. Bug introduced by commit `e42375fc8`, so back-patch to 9.5 as that was. Fix by David Rowley, regression test by Michael Paquier Discussion: https://postgr.es/m/DB6PR03MB30775D58E732D4EB0C13725B9AE00@DB6PR03MB3077.eurprd03.prod.outlook.com	2017-05-15 11:33:44 -04:00
Tom Lane	c4118d3b9b	Make stats regression test more robust in the face of parallel query. Commit `60690a6fe` attempted to fix the wait_for_stats() function in this test so that it would wait properly if the tenk2 scans were done in parallel workers instead of the main session (typically as a consequence of force_parallel_mode being turned on). However, we made it test for whether the main session's actions had been reported by looking for inserts on 'trunc_stats_test'. This is the Wrong Thing, because those aren't the last updates we expect the main session to do. As shown by recent failures on buildfarm member frogmouth, it's entirely likely that the trunc_stats_test updates will be reported in a separate message from later updates, which means there can be a window in which wait_for_stats() will exit but not all the updates we are expecting to see will have arrived. We should test for the last updates we're expecting, namely those on 'trunc_stats_test4'. Unfortunately, I doubt that this explains frogmouth's failures, because there's no reason to believe that it's running the tenk2 queries in parallel. Still, the test is wrong on its own terms, so fix and back-patch to 9.6 where parallel query came in.	2017-05-14 21:39:10 -04:00
Andres Freund	bd619fcfe0	Avoid superfluous work for commits during logical slot creation. Before `955a684e04` logical decoding snapshot maintenance needed to cope with transactions it might not have seen in their entirety. For such transactions we'd to assume they modified the catalog (could have happened before we were watching), and thus a new snapshot had to be built, and distributed to concurrently running transactions. That's problematic because building a new snapshot isn't that cheap , especially as the the array of committed transactions needs to be sorted. When creating a slot on a server with a lot of transactions, this could make logical slot creation infeasibly expensive. After `955a684e04` there's no need to deal with transaction that aren't guaranteed to be fully observable. That allows to avoid building snapshots for transactions that haven't modified catalog, even before reaching consistency. While this isn't necessarily a bugfix, slot creation being impossible in some production workloads, is severe enough to warrant backpatching. Author: Andres Freund, based on a quite different patch from Petr Jelinek Analyzed-By: Petr Jelinek Reviewed-By: Petr Jelinek Discussion: https://postgr.es/m/f37e975c-908f-858e-707f-058d3b1eb214@2ndquadrant.com Backpatch: 9.4-, where logical decoding has been introduced	2017-05-13 15:06:40 -07:00
Andres Freund	75784859cd	Fix race condition leading to hanging logical slot creation. The snapshot assembly during the creation of logical slots relied waiting for transactions in xl_running_xacts to end, by checking for their commit/abort records. Unfortunately, despite locking, it is possible to see an xl_running_xact record listing transactions as ready, that have already WAL-logged an commit/abort record, as the locking just prevents the ProcArray to be adjusted, and the commit record has to be logged first. That lead to either delayed or hanging snapshot creation, because snapbuild.c would wait "forever" to see commit/abort records for some transactions. That hang resolved only if a xl_running_xacts record without any running transactions happened to be logged, far from certain on a busy server. It's impractical to prevent that via more heavyweight locking, the likelihood of deadlocks and significantly increased contention would be too big. Instead change the initial snapshot creation to be solely based on tracking the oldest running transaction via xl_running_xacts->oldestRunningXid - that actually ends up significantly simplifying the code. That has two disadvantages: 1) Because we cannot fully "trust" the contents of xl_running_xacts, we cannot use it to build the initial snapshot. Instead we have to wait twice for all running transactions to finish. 2) Previously a slot, unless the race occurred, could be created when the all transaction perceived as running based on commit/abort records, now we have to wait for the next xl_running_xacts record. To address that, trigger logging new xl_running_xacts record from within snapbuild.c exactly when necessary. Unfortunately snabuild.c's SnapBuild is stored on disk, one of the stupider ideas of a certain Mr Freund, so we can't change it in a minor release. As this is going to be backpatched, we have to hack around a bit to keep on-disk compatibility. A later commit will rejigger that on master. Author: Andres Freund, based on a quite different patch from Petr Jelinek Analyzed-By: Petr Jelinek Reviewed-By: Petr Jelinek Discussion: https://postgr.es/m/f37e975c-908f-858e-707f-058d3b1eb214@2ndquadrant.com Backpatch: 9.4-, where logical decoding has been introduced	2017-05-13 14:21:00 -07:00
Tom Lane	d0755dc32a	Avoid searching for callback functions in CallSyscacheCallbacks(). We have now grown enough registerable syscache-invalidation callback functions that the original assumption that there would be few of them is causing performance problems. In particular, let's fix things so that CallSyscacheCallbacks doesn't have to search the whole array to find which callback(s) to invoke for a given cache ID. Preserve the original behavior that callbacks are called in order of registration, just in case there's someplace that depends on that (which I doubt). In support of this, export the number of syscaches from syscache.h. People could have found that out anyway from the enum, but adding a #define makes that much safer. This provides a useful additional speedup in Mathieu Fenniak's logical-decoding test case, although we're reaching the point of diminishing returns there. I think any further improvement will have to come from reducing the number of cache invalidations that are triggered in the first place. Still, we can hope that this change gives some incremental benefit for all invalidation scenarios. Back-patch to 9.4 where logical decoding was introduced. Discussion: https://postgr.es/m/CAHoiPjzea6N0zuCi=+f9v_j94nfsy6y8SU7-=bp4=7qw6_i=Rg@mail.gmail.com	2017-05-12 19:05:28 -04:00
Tom Lane	f7d0a31ce9	Reduce initial size of RelfilenodeMapHash. A test case provided by Mathieu Fenniak shows that hash_seq_search'ing this hashtable can consume a very significant amount of overhead during logical decoding, which triggers frequent cache invalidation. Testing suggests that the actual population of the hashtable is often no more than a few dozen entries, so we can cut the overhead just by dropping the initial number of buckets down from 1024 --- I chose to cut it to 64. (In situations where we do have a significant number of entries, we shouldn't get any real penalty from doing this, as the dynahash.c code will resize the hashtable automatically.) This gives a further factor-of-two savings in Mathieu's test case. That may be overly optimistic for real-world benefit, as real cases may have larger average table populations, but it's hard to see it turning into a net negative for any workload. Back-patch to 9.4 where relfilenodemap.c was introduced. Discussion: https://postgr.es/m/CAHoiPjzea6N0zuCi=+f9v_j94nfsy6y8SU7-=bp4=7qw6_i=Rg@mail.gmail.com	2017-05-12 18:30:02 -04:00
Tom Lane	b7a98b14c3	Avoid searching for the target catcache in CatalogCacheIdInvalidate. A test case provided by Mathieu Fenniak shows that the initial search for the target catcache in CatalogCacheIdInvalidate consumes a very significant amount of overhead in cases where cache invalidation is triggered but has little useful work to do. There is no good reason for that search to exist at all, as the index array maintained by syscache.c allows direct lookup of the catcache from its ID. We just need a frontend function in syscache.c, matching the division of labor for most other cache-accessing operations. While there's more that can be done in this area, this patch alone reduces the runtime of Mathieu's example by 2X. We can hope that it offers some useful benefit in other cases too, although usually cache invalidation overhead is not such a striking fraction of the total runtime. Back-patch to 9.4 where logical decoding was introduced. It might be worth going further back, but presently the only case we know of where cache invalidation is really a significant burden is in logical decoding. Also, older branches have fewer catcaches, reducing the possible benefit. (Note: although this nominally changes catcache's API, we have always documented CatalogCacheIdInvalidate as a private function, so I would have little sympathy for an external module calling it directly. So backpatching should be fine.) Discussion: https://postgr.es/m/CAHoiPjzea6N0zuCi=+f9v_j94nfsy6y8SU7-=bp4=7qw6_i=Rg@mail.gmail.com	2017-05-12 18:17:29 -04:00
Andrew Dunstan	cb701af784	Honor PROVE_FLAGS environment setting On MSVC builds and on back branches that means removing the hardcoded --verbose setting. On master for Unix that means removing the empty setting in the global Makefile so that the value can be acquired from the environment as well as from the make arguments. Backpatch to 9.4 where we introduced TAP tests	2017-05-12 11:21:20 -04:00
Andrew Dunstan	69ce3ead1e	Add libxml2 include path for MSVC builds On Unix this path is detected via the use of xml2-config, but that's not available on Windows. This means that users building with libxml2 will no longer need to move things around from the standard libxml2 installation for MSVC builds. Backpatch to all live branches.	2017-05-12 10:22:20 -04:00
Tom Lane	128e133cc6	Increase MAX_SYSCACHE_CALLBACKS to provide more room for extensions. Increase from the historical value of 32 to 64. We are up to 31 callers of CacheRegisterSyscacheCallback() in HEAD, so if they were all to be exercised in one process that would leave only one slot for add-on modules. It's probably not possible for that to happen, but still we clearly need more daylight here. (At some point it might be worth making the array dynamically resizable; but since we've never heard a complaint of "out of syscache_callback_list slots" happening in the field, I doubt it's worth it yet.) Back-patch as far as 9.4, which is where we increased the companion limit MAX_RELCACHE_CALLBACKS (cf commit `f01d1ae3a`). It's not as urgent in released branches, which have only a couple dozen call sites in core, but it still seems that somebody might hit the limit before these branches die. Discussion: https://postgr.es/m/12184.1494450131@sss.pgh.pa.us	2017-05-11 14:51:33 -04:00
Peter Eisentraut	7f335e780c	psql: Add missing translation markers	2017-05-10 10:15:36 -04:00
Alvaro Herrera	bfaba24829	Ignore PQcancel errors properly Add a (void) cast to all PQcancel() calls that purposefully don't check the return value, to keep compilers and static checkers happy. Per Coverity.	2017-05-09 14:58:51 -03:00
Tom Lane	ca9cfed883	Stamp 9.6.3.	2017-05-08 17:15:12 -04:00
Tom Lane	935e77d527	Further patch rangetypes_selfuncs.c's statistics slot management. Values in a STATISTIC_KIND_RANGE_LENGTH_HISTOGRAM slot are float8, not of the type of the column the statistics are for. This bug is at least partly the fault of sloppy specification comments for get_attstatsslot()/free_attstatsslot(): the type OID they want is that of the stavalues entries, not of the underlying column. (I double-checked other callers and they seem to get this right.) Adjust the comments to be more correct. Per buildfarm. Security: CVE-2017-7484	2017-05-08 15:02:57 -04:00
Tom Lane	cad1594322	Fix possibly-uninitialized variable. Oversight in `e2d4ef8de` et al (my fault not Peter's). Per buildfarm. Security: CVE-2017-7484	2017-05-08 11:18:54 -04:00
Noah Misch	c928addfcc	Match pg_user_mappings limits to information_schema.user_mapping_options. Both views replace the umoptions field with NULL when the user does not meet qualifications to see it. They used different qualifications, and pg_user_mappings documented qualifications did not match its implemented qualifications. Make its documentation and implementation match those of user_mapping_options. One might argue for stronger qualifications, but these have long, documented tenure. pg_user_mappings has always exhibited this problem, so back-patch to 9.2 (all supported versions). Michael Paquier and Feike Steenbergen. Reviewed by Jeff Janes. Reported by Andrew Wheelwright. Security: CVE-2017-7486	2017-05-08 07:24:27 -07:00
Noah Misch	aafbd1df96	Restore PGREQUIRESSL recognition in libpq. Commit `65c3bf19fd` moved handling of the, already then, deprecated requiressl parameter into conninfo_storeval(). The default PGREQUIRESSL environment variable was however lost in the change resulting in a potentially silent accept of a non-SSL connection even when set. Its documentation remained. Restore its implementation. Also amend the documentation to mark PGREQUIRESSL as deprecated for those not following the link to requiressl. Back-patch to 9.3, where commit `65c3bf1` first appeared. Behavior has been more complex when the user provides both deprecated and non-deprecated settings. Before commit `65c3bf1`, libpq operated according to the first of these found: requiressl=1 PGREQUIRESSL=1 sslmode=* PGSSLMODE=* (Note requiressl=0 didn't override sslmode=; it would only suppress PGREQUIRESSL=1 or a previous requiressl=1. PGREQUIRESSL=0 had no effect whatsoever.) Starting with commit `65c3bf1`, libpq ignored PGREQUIRESSL, and order of precedence changed to this: last of requiressl= or sslmode=* PGSSLMODE=* Starting now, adopt the following order of precedence: last of requiressl=* or sslmode=* PGSSLMODE=* PGREQUIRESSL=1 This retains the `65c3bf1` behavior for connection strings that contain both requiressl=* and sslmode=*. It retains the `65c3bf1` change that either connection string option overrides both environment variables. For the first time, PGSSLMODE has precedence over PGREQUIRESSL; this avoids reducing security of "PGREQUIRESSL=1 PGSSLMODE=verify-full" configurations originating under v9.3 and later. Daniel Gustafsson Security: CVE-2017-7485	2017-05-08 07:24:27 -07:00
Peter Eisentraut	0294ac2a5d	Translation updates Source-Git-URL: git://git.postgresql.org/git/pgtranslation/messages.git Source-Git-Hash: dd6f9ed0d9d7b33d328761dcdd3a70f44aaff6ff	2017-05-08 10:10:54 -04:00
Peter Eisentraut	c33c423622	Add security checks to selectivity estimation functions Some selectivity estimation functions run user-supplied operators over data obtained from pg_statistic without security checks, which allows those operators to leak pg_statistic data without having privileges on the underlying tables. Fix by checking that one of the following is satisfied: (1) the user has table or column privileges on the table underlying the pg_statistic data, or (2) the function implementing the user-supplied operator is leak-proof. If neither is satisfied, planning will proceed as if there are no statistics available. At least one of these is satisfied in most cases in practice. The only situations that are negatively impacted are user-defined or not-leak-proof operators on a security-barrier view. Reported-by: Robert Haas <robertmhaas@gmail.com> Author: Peter Eisentraut <peter_e@gmx.net> Author: Tom Lane <tgl@sss.pgh.pa.us> Security: CVE-2017-7484	2017-05-08 09:18:57 -04:00
Tom Lane	fab2d0d7f4	Guard against null t->tm_zone in strftime.c. The upstream IANA code does not guard against null TM_ZONE pointers in this function, but in our code there is such a check in the other pre-existing use of t->tm_zone. We do have some places that set pg_tm.tm_zone to NULL. I'm not entirely sure it's possible to reach strftime with such a value, but I'm not sure it isn't either, so be safe. Per Coverity complaint.	2017-05-07 12:33:18 -04:00
Tom Lane	f754728170	Install the "posixrules" timezone link in MSVC builds. Somehow, we'd missed ever doing this. The consequences aren't too severe: basically, the timezone library would fall back on its hardwired notion of the DST transition dates to use for a POSIX-style zone name, rather than obeying US/Eastern which is the intended behavior. The net effect would only be to obey current US DST law further back than it ought to apply; so it's not real surprising that nobody noticed. David Rowley, per report from Amit Kapila Discussion: https://postgr.es/m/CAA4eK1LC7CaNhRAQ__C3ht1JVrPzaAXXhEJRnR5L6bfYHiLmWw@mail.gmail.com	2017-05-07 11:57:41 -04:00
Tom Lane	5042e9492b	Restore fullname[] contents before falling through in pg_open_tzfile(). Fix oversight in commit `af2c5aa88`: if the shortcut open() doesn't work, we need to reset fullname[] to be just the name of the toplevel tzdata directory before we fall through into the pre-existing code. This failed to be exposed in my (tgl's) testing because the fall-through path is actually never taken under normal circumstances. David Rowley, per report from Amit Kapila Discussion: https://postgr.es/m/CAA4eK1LC7CaNhRAQ__C3ht1JVrPzaAXXhEJRnR5L6bfYHiLmWw@mail.gmail.com	2017-05-07 11:34:41 -04:00
Stephen Frost	ef42c11037	pg_dump: Don't leak memory in buildDefaultACLCommands() buildDefaultACLCommands() didn't destroy the string buffer created in certain cases, leading to a memory leak. Fix by destroying the buffer before returning from the function. Spotted by Coverity. Author: Michael Paquier Back-patch to 9.6 where buildDefaultACLCommands() was added.	2017-05-06 22:58:22 -04:00
Stephen Frost	92b15224b4	RLS: Fix ALL vs. SELECT+UPDATE policy usage When we add the SELECT-privilege based policies to the RLS with check options (such as for an UPDATE statement, or when we have INSERT ... RETURNING), we need to be sure and use the 'USING' case if the policy is actually an 'ALL' policy (which could have both a USING clause and an independent WITH CHECK clause). This could result in policies acting differently when built using ALL (when the ALL had both USING and WITH CHECK clauses) and when building the policies independently as SELECT and UPDATE policies. Fix this by adding an explicit boolean to add_with_check_options() to indicate when the USING policy should be used, even if the policy has both USING and WITH CHECK policies on it. Reported by: Rod Taylor Back-patch to 9.5 where RLS was introduced.	2017-05-06 21:46:41 -04:00
Alvaro Herrera	19a4033785	Allow MSVC to build with Tcl 8.6. Commit `eaba54c20c` added support for Tcl 8.6 for configure-supported platforms after verifying that pltcl works without further changes, but the MSVC tooling wasn't updated accordingly. Update MSVC to match, restructuring the code to avoid duplicating the logic for every Tcl version supported. Backpatch to all live branches, like `eaba54c20c`. In 9.4 and previous, change the patch to use backslashes rather than forward, as in the rest of the file. Reported by Paresh More, who also tested the patch I provided. Discussion: https://postgr.es/m/CAAgiCNGVw3ssBtSi3ZNstrz5k00ax=UV+_ZEHUeW_LMSGL2sew@mail.gmail.com	2017-05-05 12:05:34 -03:00
Heikki Linnakangas	bd05ad8b05	Give nicer error message when connecting to a v10 server requiring SCRAM. This is just to give the user a hint that they need to upgrade, if they try to connect to a v10 server that uses SCRAM authentication, with an older client. Commit to all stable branches, but not master. Discussion: https://www.postgresql.org/message-id/bbf45d92-3896-eeb7-7399-2111d517261b@pivotal.io	2017-05-05 11:24:02 +03:00
Peter Eisentraut	071d13395c	Fix cursor_to_xml in tableforest false mode It only produced <row> elements but no wrapping <table> element. By contrast, cursor_to_xmlschema produced a schema that is now correct but did not previously match the XML data produced by cursor_to_xml. In passing, also fix a minor misunderstanding about moving cursors in the tests related to this. Reported-by: filip@jirsak.org Based-on-patch-by: Thomas Munro <thomas.munro@enterprisedb.com>	2017-05-04 21:17:46 -04:00
Tom Lane	855f0e9247	Fix pfree-of-already-freed-tuple when rescanning a GiST index-only scan. GiST's getNextNearest() function attempts to pfree the previously-returned tuple if any (that is, scan->xs_hitup in HEAD, or scan->xs_itup in older branches). However, if we are rescanning a plan node after ending a previous scan early, those tuple pointers could be pointing to garbage, because they would be pointing into the scan's pageDataCxt or queueCxt which has been reset. In a debug build this reliably results in a crash, although I think it might sometimes accidentally fail to fail in production builds. To fix, clear the pointer field anyplace we reset a context it might be pointing into. This may be overkill --- I think probably only the queueCxt case is involved in this bug, so that resetting in gistrescan() would be sufficient --- but dangling pointers are generally bad news, so let's avoid them. Another plausible answer might be to just not bother with the pfree in getNextNearest(). The reconstructed tuples would go away anyway in the context resets, and I'm far from convinced that freeing them a bit earlier really saves anything meaningful. I'll stick with the original logic in this patch, but if we find more problems in the same area we should consider that approach. Per bug #14641 from Denis Smirnov. Back-patch to 9.5 where this logic was introduced. Discussion: https://postgr.es/m/20170504072034.24366.57688@wrigleys.postgresql.org	2017-05-04 13:59:13 -04:00
Tom Lane	56064d5512	Remove useless and rather expensive stanza in matview regression test. This removes a test case added by commit `b69ec7cc9`, which was intended to exercise a corner case involving the rule used at that time that materialized views were unpopulated iff they had physical size zero. We got rid of that rule very shortly later, in commit `1d6c72a55`, but kept the test case. However, because the case now asks what VACUUM will do to a zero-sized physical file, it would be pretty surprising if the answer were ever anything but "nothing" ... and if things were indeed that broken, surely we'd find it out from other tests. Since the test involves a table that's fairly large by regression-test standards (100K rows), it's quite slow to run. Dropping it should save some buildfarm cycles, so let's do that. Discussion: https://postgr.es/m/32386.1493831320@sss.pgh.pa.us	2017-05-03 19:37:01 -04:00
Tom Lane	c521d5a8d7	Improve performance of timezone loading, especially pg_timezone_names view. tzparse() would attempt to load the "posixrules" timezone database file on each call. That might seem like it would only be an issue when selecting a POSIX-style zone name rather than a zone defined in the timezone database, but it turns out that each zone definition file contains a POSIX-style zone string and tzload() will call tzparse() to parse that. Thus, when scanning the whole timezone file tree as we do in the pg_timezone_names view, "posixrules" was read repetitively for each zone definition file. Fix that by caching the file on first use within any given process. (We cache other zone definitions for the life of the process, so there seems little reason not to cache this one as well.) This probably won't help much in processes that never run pg_timezone_names, but even one additional SET of the timezone GUC would come out ahead. An even worse problem for pg_timezone_names is that pg_open_tzfile() has an inefficient way of identifying the canonical case of a zone name: it basically re-descends the directory tree to the zone file. That's not awful for an individual "SET timezone" operation, but it's pretty horrid when we're inspecting every zone in the database. And it's pointless too because we already know the canonical spelling, having just read it from the filesystem. Fix by teaching pg_open_tzfile() to avoid the directory search if it's not asked for the canonical name, and backfilling the proper result in pg_tzenumerate_next(). In combination these changes seem to make the pg_timezone_names view about 3x faster to read, for me. Since a scan of pg_timezone_names has up to now been one of the slowest queries in the regression tests, this should help some little bit for buildfarm cycle times. Back-patch to all supported branches, not so much because it's likely that users will care much about the view's performance as because tracking changes in the upstream IANA timezone code is really painful if we don't keep all the branches in sync. Discussion: https://postgr.es/m/27962.1493671706@sss.pgh.pa.us	2017-05-02 21:50:42 -04:00
Tom Lane	d56b8b41b3	Ensure commands in extension scripts see the results of preceding DDL. Due to a missing CommandCounterIncrement() call, parsing of a non-utility command in an extension script would not see the effects of the immediately preceding DDL command, unless that command's execution ends with CommandCounterIncrement() internally ... which some do but many don't. Report by Philippe Beaudoin, diagnosis by Julien Rouhaud. Rather remarkably, this bug has evaded detection since extensions were invented, so back-patch to all supported branches. Discussion: https://postgr.es/m/2cf7941e-4e41-7714-3de8-37b1a8f74dff@free.fr	2017-05-02 18:05:54 -04:00
Andrew Dunstan	f06caa09d9	Fix perl thinko in commit `fed6df486d` Report and fix from Vaishnavi Prabakaran Backpatch to 9.4 like original.	2017-05-02 08:22:45 -04:00
Tom Lane	1fdc3f6e8a	Update time zone data files to tzdata release 2017b. DST law changes in Chile, Haiti, and Mongolia. Historical corrections for Ecuador, Kazakhstan, Liberia, and Spain. The IANA crew continue their campaign to replace invented time zone abbrevations with numeric GMT offsets. This update changes numerous zones in South America, the Pacific and Indian oceans, and some Asian and Middle Eastern zones. I kept these abbreviations in the tznames/ data files, however, so that we will still accept them for input. (We may want to start trimming those files someday, but I think we should wait for the upstream dust to settle before deciding what to do.) In passing, add MESZ (Mitteleuropaeische Sommerzeit) to the tznames lists; since we accept MEZ (Mitteleuropaeische Zeit) it seems rather strange not to take the other one. And fix some incorrect, or at least obsolete, comments that certain abbreviations are not traceable to the IANA data.	2017-05-01 11:53:42 -04:00
Andrew Dunstan	66510630d8	Allow vcregress.pl to run an arbitrary TAP test set Currently only provision for running the bin checks in a single step is provided for. Now these tests can be run individually, as well as tests in other locations (e.g. src.test/recover). Also provide for suppressing unnecessary temp installs by setting the NO_TEMP_INSTALL environment variable just as the Makefiles do. Backpatch to 9.4.	2017-05-01 10:52:21 -04:00
Tom Lane	6872d96a3b	Sync our copy of the timezone library with IANA release tzcode2017b. zic no longer mishandles some transitions in January 2038 when it attempts to work around Qt bug 53071. This fixes a bug affecting Pacific/Tongatapu that was introduced in zic 2016e. localtime.c now contains a workaround, useful when loading a file generated by a buggy zic. There are assorted cosmetic changes as well, notably relocation of a bunch of #defines.	2017-04-30 15:14:06 -04:00
Robert Haas	8a9c83bfaf	Fix VALIDATE CONSTRAINT to consider NO INHERIT attribute. Currently, trying to validate a NO INHERIT constraint on the parent will search for the constraint in child tables (where it is not supposed to exist), wrongly causing a "constraint does not exist" error. Amit Langote, per a report from Hans Buschmann. Discussion: http://postgr.es/m/20170421184012.24362.19@wrigleys.postgresql.org	2017-04-28 14:48:44 -04:00
Andres Freund	29e8c881dd	Don't use on-disk snapshots for exported logical decoding snapshot. Logical decoding stores historical snapshots on disk, so that logical decoding can restart without having to reconstruct a snapshot from scratch (for which the resources are not guaranteed to be present anymore). These serialized snapshots were also used when creating a new slot via the walsender interface, which can export a "full" snapshot (i.e. one that can read all tables, not just catalog ones). The problem is that the serialized snapshots are only useful for catalogs and not for normal user tables. Thus the use of such a serialized snapshot could result in an inconsistent snapshot being exported, which could lead to queries returning wrong data. This would only happen if logical slots are created while another logical slot already exists. Author: Petr Jelinek Reviewed-By: Andres Freund Discussion: https://postgr.es/m/f37e975c-908f-858e-707f-058d3b1eb214@2ndquadrant.com Backport: 9.4, where logical decoding was introduced.	2017-04-27 15:29:33 -07:00
Tom Lane	53b1578460	Cope with glibc too old to have epoll_create1(). Commit `fa31b6f4e` supposed that we didn't have to worry about that anymore, but it seems that RHEL5 is like that, and that's still a supported platform. Put back the prior coding under an #ifdef, adding an explicit fcntl() to retain the desired CLOEXEC property. Discussion: https://postgr.es/m/12307.1493325329@sss.pgh.pa.us	2017-04-27 17:13:54 -04:00
Andres Freund	28afff347a	Preserve required !catalog tuples while computing initial decoding snapshot. The logical decoding machinery already preserved all the required catalog tuples, which is sufficient in the course of normal logical decoding, but did not guarantee that non-catalog tuples were preserved during computation of the initial snapshot when creating a slot over the replication protocol. This could cause a corrupted initial snapshot being exported. The time window for issues is usually not terribly large, but on a busy server it's perfectly possible to it hit it. Ongoing decoding is not affected by this bug. To avoid increased overhead for the SQL API, only retain additional tuples when a logical slot is being created over the replication protocol. To do so this commit changes the signature of CreateInitDecodingContext(), but it seems unlikely that it's being used in an extension, so that's probably ok. In a drive-by fix, fix handling of ReplicationSlotsComputeRequiredXmin's already_locked argument, which should only apply to ProcArrayLock, not ReplicationSlotControlLock. Reported-By: Erik Rijkers Analyzed-By: Petr Jelinek Author: Petr Jelinek, heavily editorialized by Andres Freund Reviewed-By: Andres Freund Discussion: https://postgr.es/m/9a897b86-46e1-9915-ee4c-da02e4ff6a95@2ndquadrant.com Backport: 9.4, where logical decoding was introduced.	2017-04-27 13:13:36 -07:00
Tom Lane	866452cd8b	Make latch.c more paranoid about child-process cases. Although the postmaster doesn't currently create a self-pipe or any latches, there's discussion of it doing so in future. It's also conceivable that a shared_preload_libraries extension would try to create such a thing in the postmaster process today. In that case the self-pipe FDs would be inherited by forked child processes. latch.c was entirely unprepared for such a case and could suffer an assertion failure, or worse try to use the inherited pipe if somebody called WaitLatch without having called InitializeLatchSupport in that process. Make it keep track of whether InitializeLatchSupport has been called in the current process, and do the right thing if state has been inherited from a parent. Apply FD_CLOEXEC to file descriptors created in latch.c (the self-pipe, as well as epoll event sets). This ensures that child processes spawned in backends, the archiver, etc cannot accidentally or intentionally mess with these FDs. It also ensures that we end up with the right state for the self-pipe in EXEC_BACKEND processes, which otherwise wouldn't know to close the postmaster's self-pipe FDs. Back-patch to 9.6, mainly to keep latch.c looking similar in all branches it exists in. Discussion: https://postgr.es/m/8322.1493240739@sss.pgh.pa.us	2017-04-27 15:07:36 -04:00
Tom Lane	e880df25ec	Allow multiple bgworkers to be launched per postmaster iteration. Previously, maybe_start_bgworker() would launch at most one bgworker process per call, on the grounds that the postmaster might otherwise neglect its other duties for too long. However, that seems overly conservative, especially since bad effects only become obvious when many hundreds of bgworkers need to be launched at once. On the other side of the coin is that the existing logic could result in substantial delay of bgworker launches, because ServerLoop isn't guaranteed to iterate immediately after a signal arrives. (My attempt to fix that by using pselect(2) encountered too many portability question marks, and in any case could not help on platforms without pselect().) One could also question the wisdom of using an O(N^2) processing method if the system is intended to support so many bgworkers. As a compromise, allow that function to launch up to 100 bgworkers per call (and in consequence, rename it to maybe_start_bgworkers). This will allow any normal parallel-query request for workers to be satisfied immediately during sigusr1_handler, avoiding the question of whether ServerLoop will be able to launch more promptly. There is talk of rewriting the postmaster to use a WaitEventSet to avoid the signal-response-delay problem, but I'd argue that this change should be kept even after that happens (if it ever does). Backpatch to 9.6 where parallel query was added. The issue exists before that, but previous uses of bgworkers typically aren't as sensitive to how quickly they get launched. Discussion: https://postgr.es/m/4707.1493221358@sss.pgh.pa.us	2017-04-26 16:17:29 -04:00
Tom Lane	e615605239	Revert "Use pselect(2) not select(2), if available, to wait in postmaster's loop." This reverts commit `b9515b6287`. Buildfarm results suggest that some platforms have versions of pselect(2) that are not merely non-atomic, but flat out non-functional. Revert the use-pselect patch to confirm this diagnosis (and exclude the no-SA_RESTART patch as the source of trouble). If it's so, we should probably look into blacklisting specific platforms that have broken pselect. Discussion: https://postgr.es/m/9696.1493072081@sss.pgh.pa.us	2017-04-24 18:29:55 -04:00
Tom Lane	b9515b6287	Use pselect(2) not select(2), if available, to wait in postmaster's loop. Traditionally we've unblocked signals, called select(2), and then blocked signals again. The code expects that the select() will be cancelled with EINTR if an interrupt occurs; but there's a race condition, which is that an already-pending signal will be delivered as soon as we unblock, and then when we reach select() there will be nothing preventing it from waiting. This can result in a long delay before we perform any action that ServerLoop was supposed to have taken in response to the signal. As with the somewhat-similar symptoms fixed by commit `893902085`, the main practical problem is slow launching of parallel workers. The window for trouble is usually pretty short, corresponding to one iteration of ServerLoop; but it's not negligible. To fix, use pselect(2) in place of select(2) where available, as that's designed to solve exactly this problem. Where not available, we continue to use the old way, and are no worse off than before. pselect(2) has been required by POSIX since about 2001, so most modern platforms should have it. A bigger portability issue is that some implementations are said to be non-atomic, ie pselect() isn't really any different from unblock/select/reblock. Still, we're no worse off than before on such a platform. There is talk of rewriting the postmaster to use a WaitEventSet and not do signal response work in signal handlers, at which point this could be reverted, since we'd be using a self-pipe to solve the race condition. But that's not happening before v11 at the earliest. Back-patch to 9.6. The problem exists much further back, but the worst symptom arises only in connection with parallel query, so it does not seem worth taking any portability risks in older branches. Discussion: https://postgr.es/m/9205.1492833041@sss.pgh.pa.us	2017-04-24 14:03:14 -04:00
Tom Lane	dfa4baf91c	Run the postmaster's signal handlers without SA_RESTART. The postmaster keeps signals blocked everywhere except while waiting for something to happen in ServerLoop(). The code expects that the select(2) will be cancelled with EINTR if an interrupt occurs; without that, followup actions that should be performed by ServerLoop() itself will be delayed. However, some platforms interpret the SA_RESTART signal flag as meaning that they should restart rather than cancel the select(2). Worse yet, some of them restart it with the original timeout delay, meaning that a steady stream of signal interrupts can prevent ServerLoop() from iterating at all if there are no incoming connection requests. Observable symptoms of this, on an affected platform such as HPUX 10, include extremely slow parallel query startup (possibly as much as 30 seconds) and failure to update timestamps on the postmaster's sockets and lockfiles when no new connections arrive for a long time. We can fix this by running the postmaster's signal handlers without SA_RESTART. That would be quite a scary change if the range of code where signals are accepted weren't so tiny, but as it is, it seems safe enough. (Note that postmaster children do, and must, reset all the handlers before unblocking signals; so this change should not affect any child process.) There is talk of rewriting the postmaster to use a WaitEventSet and not do signal response work in signal handlers, at which point it might be appropriate to revert this patch. But that's not happening before v11 at the earliest. Back-patch to 9.6. The problem exists much further back, but the worst symptom arises only in connection with parallel query, so it does not seem worth taking any portability risks in older branches. Discussion: https://postgr.es/m/9205.1492833041@sss.pgh.pa.us	2017-04-24 13:00:23 -04:00
Tom Lane	63f64d2824	Fix postmaster's handling of fork failure for a bgworker process. This corner case didn't behave nicely at all: the postmaster would (partially) update its state as though the process had started successfully, and be quite confused thereafter. Fix it to act like the worker had crashed, instead. In passing, refactor so that do_start_bgworker contains all the state-change logic for bgworker launch, rather than just some of it. Back-patch as far as 9.4. 9.3 contains similar logic, but it's just enough different that I don't feel comfortable applying the patch without more study; and the use of bgworkers in 9.3 was so small that it doesn't seem worth the extra work. transam/parallel.c is still entirely unprepared for the possibility of bgworker startup failure, but that seems like material for a separate patch. Discussion: https://postgr.es/m/4905.1492813727@sss.pgh.pa.us	2017-04-24 12:16:58 -04:00
Andres Freund	39369b4145	Zero padding in replication origin's checkpointed on disk-state. This seems to be largely cosmetic, avoiding valgrind bleats and the like. The uninitialized padding influences the CRC of the on-disk entry, but because it's also used when verifying the CRC, that doesn't cause spurious failures. Backpatch nonetheless. It's a bit unfortunate that contrib/test_decoding/sql/replorigin.sql doesn't exercise the checkpoint path, but checkpoints are fairly expensive on weaker machines, and we'd have to stop/start for that to be meaningful. Author: Andres Freund Discussion: https://postgr.es/m/20170422183123.w2jgiuxtts7qrqaq@alap3.anarazel.de Backpatch: 9.5, where replication origins were introduced	2017-04-23 15:54:53 -07:00
Tom Lane	f5885488da	Fix order of arguments to SubTransSetParent(). ProcessTwoPhaseBuffer (formerly StandbyRecoverPreparedTransactions) mixed up the parent and child XIDs when calling SubTransSetParent to record the transactions' relationship in pg_subtrans. Remarkably, analysis by Simon Riggs suggests that this doesn't lead to visible problems (at least, not in non-Assert builds). That might explain why we'd not noticed it before. Nonetheless, it's surely wrong. This code was born broken, so back-patch to all supported branches. Discussion: https://postgr.es/m/20110.1492905318@sss.pgh.pa.us	2017-04-23 13:11:08 -04:00
Andrew Dunstan	11927e575d	Fix TAP infrastructure to support Mingw better archive_command and restore_command need to refer to Windows paths, not Msys virtual file system paths, as postgres is completely unaware of the latter, so prefix them with the Windows path to the virtual file system root. Clean psql output of carriage returns.	2017-04-23 09:39:43 -04:00
Tom Lane	9d5f0718d7	Make PostgresNode.pm check server status more carefully. PostgresNode blithely ignored the exit status of pg_ctl, and in general made no effort to be sure that the server was running when it should be. This caused it to miss server crashes, which is a serious shortcoming in a test scaffold. Make it complain if pg_ctl fails, and modify the start and stop logic to complain if the server doesn't start, or doesn't stop, when expected. Also, have it turn off the "restart_after_crash" configuration parameter in created clusters, as bitter experience has shown that leaving that on can mask crashes too. We might at some point need variant functions that allow for, eg, server start failure to be expected. But no existing test case appears to want that, and it surely shouldn't be the default behavior. Note that this will break the buildfarm, as it will expose known bugs that the previous testing failed to. I'm committing it despite that, to verify that we get the expected failures in the buildfarm not just in manual testing. Back-patch into 9.6 where PostgresNode was introduced. (The 9.6 branch is not expected to show any failures.) Discussion: https://postgr.es/m/21432.1492886428@sss.pgh.pa.us	2017-04-22 18:18:25 -04:00
Tom Lane	551cc9af57	Make PostgresNode::append_conf append a newline automatically. Although the documentation for append_conf said clearly that it didn't add a newline, many test authors seem to have forgotten that ... or maybe they just consulted the example at the top of the POD documentation, which clearly shows adding a config entry without bothering to add a trailing newline. The worst part of that is that it works, as long as you don't do it more than once, since the backend isn't picky about whether config files end with newlines. So there's not a strong forcing function reminding test authors not to do it like that. Upshot is that this is a terribly fragile way to go about things, and there's at least one existing test case that is demonstrably broken and not testing what it thinks it is. Let's just make append_conf append a newline, instead; that is clearly way safer than the old definition. I also cleaned up a few call sites that were unnecessarily ugly. (I left things alone in places where it's plausible that additional config lines would need to be added someday.) Back-patch the change in append_conf itself to 9.6 where it was added, as having a definitional inconsistency between branches would obviously be pretty hazardous for back-patching TAP tests. The other changes are just cosmetic and don't need to be back-patched. Discussion: https://postgr.es/m/19751.1492892376@sss.pgh.pa.us	2017-04-22 16:58:15 -04:00
Tom Lane	f07f6b62c9	Avoid depending on non-POSIX behavior of fcntl(2). The POSIX standard does not say that the success return value for fcntl(F_SETFD) and fcntl(F_SETFL) is zero; it says only that it's not -1. We had several calls that were making the stronger assumption. Adjust them to test specifically for -1 for strict spec compliance. The standard further leaves open the possibility that the O_NONBLOCK flag bit is not the only active one in F_SETFL's argument. Formally, therefore, one ought to get the current flags with F_GETFL and store them back with only the O_NONBLOCK bit changed when trying to change the nonblock state. In port/noblock.c, we were doing the full pushup in pg_set_block but not in pg_set_noblock, which is just weird. Make both of them do it properly, since they have little business making any assumptions about the socket they're handed. The other places where we're issuing F_SETFL are working with FDs we just got from pipe(2), so it's reasonable to assume the FDs' properties are all default, so I didn't bother adding F_GETFL steps there. Also, while pg_set_block deserves some points for trying to do things right, somebody had decided that it'd be even better to cast fcntl's third argument to "long". Which is completely loony, because POSIX clearly says the third argument for an F_SETFL call is "int". Given the lack of field complaints, these missteps apparently are not of significance on any common platforms. But they're still wrong, so back-patch to all supported branches. Discussion: https://postgr.es/m/30882.1492800880@sss.pgh.pa.us	2017-04-21 15:55:56 -04:00
Tom Lane	6c73b390b4	Always build a custom plan node's targetlist from the path's pathtarget. We were applying the use_physical_tlist optimization to all relation scan plans, even those implemented by custom scan providers. However, that's a bad idea for a couple of reasons. The custom provider might be unable to provide columns that it hadn't expected to be asked for (for example, the custom scan might depend on an index-only scan). Even more to the point, there's no good reason to suppose that this "optimization" is a win for a custom scan; whatever the custom provider is doing is likely not based on simply returning physical heap tuples. (As a counterexample, if the custom scan is an interface to a column store, demanding all columns would be a huge loss.) If it is a win, the custom provider could make that decision for itself and insert a suitable pathtarget into the path, anyway. Per discussion with Dmitry Ivanov. Back-patch to 9.5 where custom scan support was introduced. The argument that the custom provider can adjust the behavior by changing the pathtarget only applies to 9.6+, but on balance it seems more likely that use_physical_tlist will hurt custom scans than help them. Discussion: https://postgr.es/m/e29ddd30-8ef9-4da5-a50b-2bb7b8c7198d@postgrespro.ru	2017-04-17 15:29:00 -04:00
Peter Eisentraut	1ec36a9eb4	Fix compiler warning Introduced by `41306a511c`, happens with gcc 4.7.2.	2017-04-16 20:49:40 -04:00
Tom Lane	a30f146db4	Provide a way to control SysV shmem attach address in EXEC_BACKEND builds. In standard non-Windows builds, there's no particular reason to care what address the kernel chooses to map the shared memory segment at. However, when building with EXEC_BACKEND, there's a risk that the chosen address won't be available in all child processes. Linux with ASLR enabled (which it is by default) seems particularly at risk because it puts shmem segments into the same area where it maps shared libraries. We can work around that by specifying a mapping address that's outside the range where shared libraries could get mapped. On x86_64 Linux, 0x7e0000000000 seems to work well. This is only meant for testing/debugging purposes, so it doesn't seem necessary to go as far as providing a GUC (or any user-visible documentation, though we might change that later). Instead, it's just controlled by setting an environment variable PG_SHMEM_ADDR to the desired attach address. Back-patch to all supported branches, since the point here is to remove intermittent buildfarm failures on EXEC_BACKEND animals. Owners of affected animals will need to add a suitable setting of PG_SHMEM_ADDR to their build_env configuration. Discussion: https://postgr.es/m/7036.1492231361@sss.pgh.pa.us	2017-04-15 17:27:51 -04:00
Tom Lane	9c225acf0b	Avoid passing function pointers across process boundaries. This back-patches commit `32470825d3` into 9.6, primarily to make buildfarm member culicidae happy. Unlike the HEAD patch, avoid changing the existing API of CreateParallelContext; instead we just switch to using CreateParallelContextForExternalFunction, even for core functions. Petr Jelinek, with a bunch of basically-cosmetic adjustments by me Discussion: https://postgr.es/m/548f9c1d-eafa-e3fa-9da8-f0cc2f654e60@2ndquadrant.com	2017-04-15 16:23:27 -04:00
Tom Lane	a70b18b894	Fix regexport.c to behave sanely with lookaround constraints. regexport.c thought it could just ignore LACON arcs, but the correct behavior is to treat them as satisfiable while consuming zero input (rather reminiscently of commit `9f1e642d5`). Otherwise, the emitted simplified-NFA representation may contain no paths leading from initial to final state, which unsurprisingly confuses pg_trgm, as seen in bug #14623 from Jeff Janes. Since regexport's output representation has no concept of an arc that consumes zero input, recurse internally to find the next normal arc(s) after any LACON transitions. We'd be forced into changing that representation if a LACON could be the last arc reaching the final state, but fortunately the regex library never builds NFAs with such a configuration, so there always is a next normal arc. Back-patch to 9.3 where this logic was introduced. Discussion: https://postgr.es/m/20170413180503.25948.94871@wrigleys.postgresql.org	2017-04-13 17:18:35 -04:00
Tom Lane	be182d5702	Improve castNode notation by introducing list-extraction-specific variants. This extends the castNode() notation introduced by commit `5bcab1114` to provide, in one step, extraction of a list cell's pointer and coercion to a concrete node type. For example, "lfirst_node(Foo, lc)" is the same as "castNode(Foo, lfirst(lc))". Almost half of the uses of castNode that have appeared so far include a list extraction call, so this is pretty widely useful, and it saves a few more keystrokes compared to the old way. As with the previous patch, back-patch the addition of these macros to pg_list.h, so that the notation will be available when back-patching. Patch by me, after an idea of Andrew Gierth's. Discussion: https://postgr.es/m/14197.1491841216@sss.pgh.pa.us	2017-04-10 13:51:29 -04:00
Tom Lane	c0a493e17c	Fix planner error (or assert trap) with nested set operations. As reported by Sean Johnston in bug #14614, since 9.6 the planner can fail due to trying to look up the referent of a Var with varno 0. This happens because we generate such Vars in generate_append_tlist, for lack of any better way to describe the output of a SetOp node. In typical situations nothing really cares about that, but given nested set-operation queries we will call estimate_num_groups on the output of the subquery, and that wants to know what a Var actually refers to. That logic used to look at subquery->targetList, but in commit `3fc6e2d7f` I'd switched it to look at subroot->processed_tlist, ie the actual output of the subquery plan not the parser's idea of the result. It seemed like a good idea at the time :-(. As a band-aid fix, change it back. Really we ought to have an honest way of naming the outputs of SetOp steps, which suggests that it'd be a good idea for the parser to emit an RTE corresponding to each one. But that's a task for another day, and it certainly wouldn't yield a back-patchable fix. Report: https://postgr.es/m/20170407115808.25934.51866@wrigleys.postgresql.org	2017-04-07 12:18:38 -04:00
Heikki Linnakangas	b3721f7d13	Remove dead code and fix comments in fast-path function handling. HandleFunctionRequest() is no longer responsible for reading the protocol message from the client, since commit `2b3a8b20c2`. Fix the outdated comments. HandleFunctionRequest() now always returns 0, because the code that used to return EOF was moved in `2b3a8b20c2`. Therefore, the caller no longer needs to check the return value. Reported by Andres Freund. Backpatch to all supported versions, even though this doesn't have any user-visible effect, to make backporting future patches in this area easier. Discussion: https://www.postgresql.org/message-id/20170405010525.rt5azbya5fkbhvrx@alap3.anarazel.de	2017-04-06 09:11:11 +03:00
Tom Lane	fd52b88343	Fix integer-overflow problems in interval comparison. When using integer timestamps, the interval-comparison functions tried to compute the overall magnitude of an interval as an int64 number of microseconds. As reported by Frazer McLean, this overflows for intervals exceeding about 296000 years, which is bad since we nominally allow intervals many times larger than that. That results in wrong comparison results, and possibly in corrupted btree indexes for columns containing such large interval values. To fix, compute the magnitude as int128 instead. Although some compilers have native support for int128 calculations, many don't, so create our own support functions that can do 128-bit addition and multiplication if the compiler support isn't there. These support functions are designed with an eye to allowing the int128 code paths in numeric.c to be rewritten for use on all platforms, although this patch doesn't do that, or even provide all the int128 primitives that will be needed for it. Back-patch as far as 9.4. Earlier releases did not guard against overflow of interval values at all (commit `146604ec4` fixed that), so it seems not very exciting to worry about overly-large intervals for them. Before 9.6, we did not assume that unreferenced "static inline" functions would not draw compiler warnings, so omit functions not directly referenced by timestamp.c, the only present consumer of int128.h. (We could have omitted these functions in HEAD too, but since they were written and debugged on the way to the present patch, and they look likely to be needed by numeric.c, let's keep them in HEAD.) I did not bother to try to prevent such warnings in a --disable-integer-datetimes build, though. Before 9.5, configure will never define HAVE_INT128, so the part of int128.h that exploits a native int128 implementation is dead code in the 9.4 branch. I didn't bother to remove it, thinking that keeping the file looking similar in different branches is more useful. In HEAD only, add a simple test harness for int128.h in src/tools/. In back branches, this does not change the float-timestamps code path. That's not subject to the same kind of overflow risk, since it computes the interval magnitude as float8. (No doubt, when this code was originally written, overflow was disregarded for exactly that reason.) There is a precision hazard instead :-(, but we'll avert our eyes from that question, since no complaints have been reported and that code's deprecated anyway. Kyotaro Horiguchi and Tom Lane Discussion: https://postgr.es/m/1490104629.422698.918452336.26FA96B7@webmail.messagingengine.com	2017-04-05 23:51:28 -04:00
Magnus Hagander	b88b929a70	Back-patch checkpoint clarification docs and pg_basebackup updates This backpatches `51e26c9` and `7220c7b`, including both documentation updates clarifying the checkpoints at the beginning of base backups and the messages in verbose and progress mdoe of pg_basebackup. Author: Michael Banck Discussion: https://postgr.es/m/21444.1488142764%40sss.pgh.pa.us	2017-04-01 17:20:05 +02:00
Robert Haas	fb1879c374	Fix parallel query so it doesn't spoil row estimates above Gather. Commit `45be99f8cd` removed GatherPath's num_workers field, but this is entirely bogus. Normally, a path's parallel_workers flag is supposed to indicate the number of workers that it wants, and should be 0 for a non-partial path. In that commit, I mistakenly thought that GatherPath could also use that field to indicate the number of workers that it would try to start, but that's disastrous, because then it can propagate up to higher nodes in the plan tree, which will then get incorrect rowcounts because the parallel_workers flag is involved in computing those values. Repair by putting the separate field back. Report by Tomas Vondra. Patch by me, reviewed by Amit Kapila. Discussion: http://postgr.es/m/f91b4a44-f739-04bd-c4b6-f135bd643669@2ndquadrant.com	2017-03-31 21:10:30 -04:00
Robert Haas	9b6e8d8f86	Don't use bgw_main even to specify in-core bgworker entrypoints. On EXEC_BACKEND builds, this can fail if ASLR is in use. Backpatch to 9.5. On master, completely remove the bgw_main field completely, since there is no situation in which it is safe for an EXEC_BACKEND build. On 9.6 and 9.5, leave the field intact to avoid breaking things for third-party code that doesn't care about working under EXEC_BACKEND. Prior to 9.5, there are no in-core bgworker entrypoints. Petr Jelinek, reviewed by me. Discussion: http://postgr.es/m/09d8ad33-4287-a09b-a77f-77f8761adb5e@2ndquadrant.com	2017-03-31 20:47:50 -04:00
Tom Lane	8433e0b40e	Suppress implicit-conversion warnings seen with newer clang versions. We were assigning values near 255 through "char " pointers. On machines where char is signed, that's not entirely kosher, and it's reasonable for compilers to warn about it. A better solution would be to change the pointer type to "unsigned char ", but that would be vastly more invasive. For the moment, let's just apply this simple backpatchable solution. Aleksander Alekseev Discussion: https://postgr.es/m/20170220141239.GD12278@e733.localdomain Discussion: https://postgr.es/m/2839.1490714708@sss.pgh.pa.us	2017-03-28 13:16:19 -04:00
Tom Lane	c804c00d38	Fix unportable disregard of alignment requirements in RADIUS code. The compiler is entitled to store a char[] local variable with no particular alignment requirement. Our RADIUS code cavalierly took such a local variable and cast its address to a struct type that does have alignment requirements. On an alignment-picky machine this would lead to bus errors. To fix, declare the local variable honestly, and then cast its address to char * for use in the I/O calls. Given the lack of field complaints, there must be very few if any people affected; but nonetheless this is a clear portability issue, so back-patch to all supported branches. Noted while looking at a Coverity complaint in the same code.	2017-03-26 17:35:35 -04:00
Robert Haas	5674a258fd	plpgsql: Don't generate parallel plans for RETURN QUERY. Commit `7aea8e4f2d` allowed a parallel plan to be generated when for a RETURN QUERY or RETURN QUERY EXECUTE statement in a PL/pgsql block, but that's a bad idea because plplgsql asks the executor for 50 rows at a time. That means that we'll always be running serially a plan that was intended for parallel execution, which is not a good idea. Fix by not requesting a parallel plan from the outset. Per discussion, back-patch to 9.6. There is a slight risk that, due to optimizer error, somebody could have a case where the parallel plan executed serially is actually faster than the supposedly-best serial plan, but the consensus seems to be that that's not sufficient justification for leaving 9.6 unpatched. Discussion: http://postgr.es/m/CA+TgmoZ_ZuH+auEeeWnmtorPsgc_SmP+XWbDsJ+cWvWBSjNwDQ@mail.gmail.com Discussion: http://postgr.es/m/CA+TgmobXEhvHbJtWDuPZM9bVSLiTj-kShxQJ2uM5GPDze9fRYA@mail.gmail.com	2017-03-24 12:39:07 -04:00
Teodor Sigaev	2ed391f95b	Fix pgbench options -C and -R together The bug is that prior to --rate doCustom was always disconnect/reconnect without exiting, but with rate it returns if it has to wait. However threadRun test whether there is a connection before recalling doCustom, so it was never called. Bug is not existed in head branch because of refactoring at `12788ae49e`, patch only 9.6 Author: Fabien Coelho Reviewed-by: me https://commitfest.postgresql.org/13/970/	2017-03-24 19:23:13 +03:00
Teodor Sigaev	8de6278d3b	Fix backup canceling Assert-enabled build crashes but without asserts it works by wrong way: it may not reset forcing full page write and preventing from starting exclusive backup with the same name as cancelled. Patch replaces pair of booleans nonexclusive_backup_running/exclusive_backup_running to single enum to correctly describe backup state. Backpatch to 9.6 where bug was introduced Reported-by: David Steele Authors: Michael Paquier, David Steele Reviewed-by: Anastasia Lubennikova https://commitfest.postgresql.org/13/1068/	2017-03-24 13:55:02 +03:00

... 3 4 5 6 7 ...

29377 commits