postgresql

mirror of https://github.com/postgres/postgres.git synced 2026-02-27 20:00:50 -05:00

Author	SHA1	Message	Date
Tom Lane	50ee1c7462	Avoid searching for the target catcache in CatalogCacheIdInvalidate. A test case provided by Mathieu Fenniak shows that the initial search for the target catcache in CatalogCacheIdInvalidate consumes a very significant amount of overhead in cases where cache invalidation is triggered but has little useful work to do. There is no good reason for that search to exist at all, as the index array maintained by syscache.c allows direct lookup of the catcache from its ID. We just need a frontend function in syscache.c, matching the division of labor for most other cache-accessing operations. While there's more that can be done in this area, this patch alone reduces the runtime of Mathieu's example by 2X. We can hope that it offers some useful benefit in other cases too, although usually cache invalidation overhead is not such a striking fraction of the total runtime. Back-patch to 9.4 where logical decoding was introduced. It might be worth going further back, but presently the only case we know of where cache invalidation is really a significant burden is in logical decoding. Also, older branches have fewer catcaches, reducing the possible benefit. (Note: although this nominally changes catcache's API, we have always documented CatalogCacheIdInvalidate as a private function, so I would have little sympathy for an external module calling it directly. So backpatching should be fine.) Discussion: https://postgr.es/m/CAHoiPjzea6N0zuCi=+f9v_j94nfsy6y8SU7-=bp4=7qw6_i=Rg@mail.gmail.com	2017-05-12 18:17:29 -04:00
Tom Lane	928c4de309	Fix dependencies for extended statistics objects. A stats object ought to have a dependency on each individual column it reads, not the entire table. Doing this honestly lets us get rid of the hard-wired logic in RemoveStatisticsExt, which seems to have been misguidedly modeled on RemoveStatistics; and it will be far easier to extend to multiple tables later. Also, add overlooked dependency on owner, and make the dependency on schema be NORMAL like every other such dependency. There remains some unfinished work here, which is to allow statistics objects to be extension members. That takes more effort than just adding the dependency call, though, so I left it out for now. initdb forced because this changes the set of pg_depend records that should exist for a statistics object. Discussion: https://postgr.es/m/22676.1494557205@sss.pgh.pa.us	2017-05-12 16:26:31 -04:00
Alvaro Herrera	bc085205c8	Change CREATE STATISTICS syntax Previously, we had the WITH clause in the middle of the command, where you'd specify both generic options as well as statistic types. Few people liked this, so this commit changes it to remove the WITH keyword from that clause and makes it accept statistic types only. (We currently don't have any generic options, but if we invent in the future, we will gain a new WITH clause, probably at the end of the command). Also, the column list is now specified without parens, which makes the whole command look more similar to a SELECT command. This change will let us expand the command to supporting expressions (not just columns names) as well as multiple tables and their join conditions. Tom added lots of code comments and fixed some parts of the CREATE STATISTICS reference page, too; more changes in this area are forthcoming. He also fixed a potential problem in the alter_generic regression test, reducing verbosity on a cascaded drop to avoid dependency on message ordering, as we do in other tests. Tom also closed a security bug: we documented that table ownership was required in order to create a statistics object on it, but didn't actually implement it. Implement tab-completion for statistics objects. This can stand some more improvement. Authors: Alvaro Herrera, with lots of cleanup by Tom Lane Discussion: https://postgr.es/m/20170420212426.ltvgyhnefvhixm6i@alvherre.pgsql	2017-05-12 14:59:35 -03:00
Peter Eisentraut	d496a65790	Standardize "WAL location" terminology Other previously used terms were "WAL position" or "log position".	2017-05-12 13:51:27 -04:00
Peter Eisentraut	c1a7f64b4a	Replace "transaction log" with "write-ahead log" This makes documentation and error messages match the renaming of "xlog" to "wal" in APIs and file naming.	2017-05-12 11:52:43 -04:00
Peter Eisentraut	b807f59828	Rework the options syntax for logical replication commands For CREATE/ALTER PUBLICATION/SUBSCRIPTION, use similar option style as other statements that use a WITH clause for options. Author: Petr Jelinek <petr.jelinek@2ndquadrant.com>	2017-05-12 08:57:49 -04:00
Simon Riggs	024711bb54	Lag tracking for logical replication Lag tracking is called for each commit, but we introduce a pacing delay to ensure we don't swamp the lag tracker. Author: Petr Jelinek, with minor pacing delay code from me	2017-05-12 10:50:56 +01:00
Tom Lane	d10c626de4	Rename WAL-related functions and views to use "lsn" not "location". Per discussion, "location" is a rather vague term that could refer to multiple concepts. "LSN" is an unambiguous term for WAL locations and should be preferred. Some function names, view column names, and function output argument names used "lsn" already, but others used "location", as well as yet other terms such as "wal_position". Since we've already renamed a lot of things in this area from "xlog" to "wal" for v10, we may as well incur a bit more compatibility pain and make these names all consistent. David Rowley, minor additional docs hacking by me Discussion: https://postgr.es/m/CAKJS1f8O0njDKe8ePFQ-LK5-EjwThsDws6ohJ-+c6nWK+oUxtg@mail.gmail.com	2017-05-11 11:49:59 -04:00
Alvaro Herrera	b66adb7b0c	Revert "Permit dump/reload of not-too-large >1GB tuples" This reverts commits `fa2fa99552` and `42f50cb8fa`. While the functionality that was intended to be provided by these commits is desired, the patch didn't actually solve as many of the problematic situations as we hoped, and it created a bunch of its own problems. Since we're going to require more extensive changes soon for other reasons and users have been working around these problems for a long time already, there is no point in spending effort in fixing this halfway measure. Per complaint from Tom Lane. Discussion: https://postgr.es/m/21407.1484606922@sss.pgh.pa.us (Commit `fa2fa99552` had already been reverted in branches 9.5 as `f858524ee4` and 9.6 as `e9e44a0953`, so this touches master only. Commit `42f50cb8fa` was not present in the older branches.)	2017-05-10 18:41:27 -03:00
Peter Eisentraut	489b96e80b	Improve memory use in logical replication apply Previously, the memory used by the logical replication apply worker for processing messages would never be freed, so that could end up using a lot of memory. To improve that, change the existing ApplyContext memory context to ApplyMessageContext and reset that after every message (similar to MessageContext used elsewhere). For consistency of naming, rename the ApplyCacheContext to ApplyContext. Author: Stas Kelvich <s.kelvich@postgrespro.ru>	2017-05-09 14:51:49 -04:00
Peter Eisentraut	013c1178fd	Remove the NODROP SLOT option from DROP SUBSCRIPTION It turned out this approach had problems, because a DROP command should not have any options other than CASCADE and RESTRICT. Instead, always attempt to drop the slot if there is one configured, but also add an ALTER SUBSCRIPTION action to set the slot to NONE. Author: Petr Jelinek <petr.jelinek@2ndquadrant.com> Reported-by: Tom Lane <tgl@sss.pgh.pa.us> Discussion: https://www.postgresql.org/message-id/29431.1493730652@sss.pgh.pa.us	2017-05-09 10:20:42 -04:00
Tom Lane	da07596006	Further patch rangetypes_selfuncs.c's statistics slot management. Values in a STATISTIC_KIND_RANGE_LENGTH_HISTOGRAM slot are float8, not of the type of the column the statistics are for. This bug is at least partly the fault of sloppy specification comments for get_attstatsslot()/free_attstatsslot(): the type OID they want is that of the stavalues entries, not of the underlying column. (I double-checked other callers and they seem to get this right.) Adjust the comments to be more correct. Per buildfarm. Security: CVE-2017-7484	2017-05-08 15:03:14 -04:00
Peter Eisentraut	e2d4ef8de8	Add security checks to selectivity estimation functions Some selectivity estimation functions run user-supplied operators over data obtained from pg_statistic without security checks, which allows those operators to leak pg_statistic data without having privileges on the underlying tables. Fix by checking that one of the following is satisfied: (1) the user has table or column privileges on the table underlying the pg_statistic data, or (2) the function implementing the user-supplied operator is leak-proof. If neither is satisfied, planning will proceed as if there are no statistics available. At least one of these is satisfied in most cases in practice. The only situations that are negatively impacted are user-defined or not-leak-proof operators on a security-barrier view. Reported-by: Robert Haas <robertmhaas@gmail.com> Author: Peter Eisentraut <peter_e@gmx.net> Author: Tom Lane <tgl@sss.pgh.pa.us> Security: CVE-2017-7484	2017-05-08 09:26:32 -04:00
Heikki Linnakangas	eb61136dc7	Remove support for password_encryption='off' / 'plain'. Storing passwords in plaintext hasn't been a good idea for a very long time, if ever. Now seems like a good time to finally forbid it, since we're messing with this in PostgreSQL 10 anyway. Remove the CREATE/ALTER USER UNENCRYPTED PASSSWORD 'foo' syntax, since storing passwords unencrypted is no longer supported. ENCRYPTED PASSWORD 'foo' is still accepted, but ENCRYPTED is now just a noise-word, it does the same as just PASSWORD 'foo'. Likewise, remove the --unencrypted option from createuser, but accept --encrypted as a no-op for backward compatibility. AFAICS, --encrypted was a no-op even before this patch, because createuser encrypted the password before sending it to the server even if --encrypted was not specified. It added the ENCRYPTED keyword to the SQL command, but since the password was already in encrypted form, it didn't make any difference. The documentation was not clear on whether that was intended or not, but it's moot now. Also, while password_encryption='on' is still accepted as an alias for 'md5', it is now marked as hidden, so that it is not listed as an accepted value in error hints, for example. That's not directly related to removing 'plain', but it seems better this way. Reviewed by Michael Paquier Discussion: https://www.postgresql.org/message-id/16e9b768-fd78-0b12-cfc1-7b6b7f238fde@iki.fi	2017-05-08 11:26:07 +03:00
Peter Eisentraut	086221cf6b	Prevent panic during shutdown checkpoint When the checkpointer writes the shutdown checkpoint, it checks afterwards whether any WAL has been written since it started and throws a PANIC if so. At that point, only walsenders are still active, so one might think this could not happen, but walsenders can also generate WAL, for instance in BASE_BACKUP and certain variants of CREATE_REPLICATION_SLOT. So they can trigger this panic if such a command is run while the shutdown checkpoint is being written. To fix this, divide the walsender shutdown into two phases. First, the postmaster sends a SIGUSR2 signal to all walsenders. The walsenders then put themselves into the "stopping" state. In this state, they reject any new commands. (For simplicity, we reject all new commands, so that in the future we do not have to track meticulously which commands might generate WAL.) The checkpointer waits for all walsenders to reach this state before proceeding with the shutdown checkpoint. After the shutdown checkpoint is done, the postmaster sends SIGINT (previously unused) to the walsenders. This triggers the existing shutdown behavior of sending out the shutdown checkpoint record and then terminating. Author: Michael Paquier <michael.paquier@gmail.com> Reported-by: Fujii Masao <masao.fujii@gmail.com>	2017-05-05 10:31:42 -04:00
Heikki Linnakangas	0557a5dc2c	Make SCRAM salts and nonces longer. The salt is stored base64-encoded. With the old 10 bytes raw length, it was always padded to 16 bytes after encoding. We might as well use 12 raw bytes for the salt, and it's still encoded into 16 bytes. Similarly for the random nonces, use a raw length that's divisible by 3, so that there's no padding after base64 encoding. Make the nonces longer while we're at it. 10 bytes was probably enough to prevent replay attacks, but there's no reason to be skimpy here. Per suggestion from Álvaro Hernández Tortosa. Discussion: https://www.postgresql.org/message-id/df8c6e27-4d8e-5281-96e5-131a4e638fc8@8kdata.com	2017-05-05 10:02:13 +03:00
Heikki Linnakangas	e6e9c4da3a	Misc cleanup of SCRAM code. * Remove is_scram_verifier() function. It was unused. * Fix sanitize_char() function, used in error messages on protocol violations, to print bytes >= 0x7F correctly. * Change spelling of scram_MockSalt() function to be more consistent with the surroundings. * Change a few more references to "server proof" to "server signature" that I missed in commit `d981074c24`.	2017-05-05 10:01:44 +03:00
Heikki Linnakangas	8f8b9be51f	Add PQencryptPasswordConn function to libpq, use it in psql and createuser. The new function supports creating SCRAM verifiers, in addition to md5 hashes. The algorithm is chosen based on password_encryption, by default. This fixes the issue reported by Jeff Janes, that there was previously no way to create a SCRAM verifier with "\password". Michael Paquier and me Discussion: https://www.postgresql.org/message-id/CAMkU%3D1wfBgFPbfAMYZQE78p%3DVhZX7nN86aWkp0QcCp%3D%2BKxZ%3Dbg%40mail.gmail.com	2017-05-03 11:19:07 +03:00
Tom Lane	23c6eb0336	Remove create_singleton_array(), hard-coding the case in its sole caller. create_singleton_array() was not really as useful as we perhaps thought when we added it. It had never accreted more than one call site, and is only saving a dozen lines of code at that one, which is considerably less bulk than the function itself. Moreover, because of its insistence on using the caller's fn_extra cache space, it's arguably a coding hazard. text_to_array_internal() does not currently use fn_extra in any other way, but if it did it would be subtly broken, since the conflicting fn_extra uses could be needed within a single query, in the seldom-tested case that the field separator varies during the query. The same objection seems likely to apply to any other potential caller. The replacement code is a bit uglier, because it hardwires knowledge of the storage parameters of type TEXT, but it's not like we haven't got dozens or hundreds of other places that do the same. Uglier seems like a good tradeoff for smaller, faster, and safer. Per discussion with Neha Khatri. Discussion: https://postgr.es/m/CAFO0U+_fS5SRhzq6uPG+4fbERhoA9N2+nPrtvaC9mmeWivxbsA@mail.gmail.com	2017-05-02 20:41:37 -04:00
Tom Lane	92a43e4857	Reduce semijoins with unique inner relations to plain inner joins. If the inner relation can be proven unique, that is it can have no more than one matching row for any row of the outer query, then we might as well implement the semijoin as a plain inner join, allowing substantially more freedom to the planner. This is a form of outer join strength reduction, but it can't be implemented in reduce_outer_joins() because we don't have enough info about the individual relations at that stage. Instead do it much like remove_useless_joins(): once we've built base relations, we can make another pass over the SpecialJoinInfo list and get rid of any entries representing reducible semijoins. This is essentially a followon to the inner-unique patch (commit `9c7f5229a`) and makes use of the proof machinery that that patch created. We need only minor refactoring of innerrel_is_unique's API to support this usage. Per performance complaint from Teodor Sigaev. Discussion: https://postgr.es/m/f994fc98-389f-4a46-d1bc-c42e05cb43ed@sigaev.ru	2017-05-01 14:53:42 -04:00
Peter Eisentraut	9414e41ea7	Fix logical replication launcher wake up and reset After the logical replication launcher was told to wake up at commit (for example, by a CREATE SUBSCRIPTION command), the flag to wake up was not reset, so it would be woken up at every following commit as well. So fix that by resetting the flag. Also, we don't need to wake up anything if the transaction was rolled back. Just reset the flag in that case. Author: Masahiko Sawada <sawada.mshk@gmail.com> Reported-by: Fujii Masao <masao.fujii@gmail.com>	2017-05-01 10:18:09 -04:00
Robert Haas	e180c8aa8c	Fire per-statement triggers on partitioned tables. Even though no actual tuples are ever inserted into a partitioned table (the actual tuples are in the partitions, not the partitioned table itself), we still need to have a ResultRelInfo for the partitioned table, or per-statement triggers won't get fired. Amit Langote, per a report from Rajkumar Raghuwanshi. Reviewed by me. Discussion: http://postgr.es/m/CAKcux6%3DwYospCRY2J4XEFuVy0L41S%3Dfic7rmkbsU-GXhhSbmBg%40mail.gmail.com	2017-05-01 08:23:01 -04:00
Robert Haas	504c2205ab	Fix crash when partitioned column specified twice. Amit Langote, reviewed by Beena Emerson Discussion: http://postgr.es/m/6ed23d3d-c09d-4cbc-3628-0a8a32f750f4@lab.ntt.co.jp	2017-04-28 13:52:17 -04:00
Heikki Linnakangas	d981074c24	Misc SCRAM code cleanups. * Move computation of SaltedPassword to a separate function from scram_ClientOrServerKey(). This saves a lot of cycles in libpq, by computing SaltedPassword only once per authentication. (Computing SaltedPassword is expensive by design.) * Split scram_ClientOrServerKey() into two functions. Improves readability, by making the calling code less verbose. * Rename "server proof" to "server signature", to better match the nomenclature used in RFC 5802. * Rename SCRAM_SALT_LEN to SCRAM_DEFAULT_SALT_LEN, to make it more clear that the salt can be of any length, and the constant only specifies how long a salt we use when we generate a new verifier. Also rename SCRAM_ITERATIONS_DEFAULT to SCRAM_DEFAULT_ITERATIONS, for consistency. These things caught my eye while working on other upcoming changes.	2017-04-28 15:22:38 +03:00
Andres Freund	56e19d938d	Don't use on-disk snapshots for exported logical decoding snapshot. Logical decoding stores historical snapshots on disk, so that logical decoding can restart without having to reconstruct a snapshot from scratch (for which the resources are not guaranteed to be present anymore). These serialized snapshots were also used when creating a new slot via the walsender interface, which can export a "full" snapshot (i.e. one that can read all tables, not just catalog ones). The problem is that the serialized snapshots are only useful for catalogs and not for normal user tables. Thus the use of such a serialized snapshot could result in an inconsistent snapshot being exported, which could lead to queries returning wrong data. This would only happen if logical slots are created while another logical slot already exists. Author: Petr Jelinek Reviewed-By: Andres Freund Discussion: https://postgr.es/m/f37e975c-908f-858e-707f-058d3b1eb214@2ndquadrant.com Backport: 9.4, where logical decoding was introduced.	2017-04-27 15:29:15 -07:00
Andres Freund	2bef06d516	Preserve required !catalog tuples while computing initial decoding snapshot. The logical decoding machinery already preserved all the required catalog tuples, which is sufficient in the course of normal logical decoding, but did not guarantee that non-catalog tuples were preserved during computation of the initial snapshot when creating a slot over the replication protocol. This could cause a corrupted initial snapshot being exported. The time window for issues is usually not terribly large, but on a busy server it's perfectly possible to it hit it. Ongoing decoding is not affected by this bug. To avoid increased overhead for the SQL API, only retain additional tuples when a logical slot is being created over the replication protocol. To do so this commit changes the signature of CreateInitDecodingContext(), but it seems unlikely that it's being used in an extension, so that's probably ok. In a drive-by fix, fix handling of ReplicationSlotsComputeRequiredXmin's already_locked argument, which should only apply to ProcArrayLock, not ReplicationSlotControlLock. Reported-By: Erik Rijkers Analyzed-By: Petr Jelinek Author: Petr Jelinek, heavily editorialized by Andres Freund Reviewed-By: Andres Freund Discussion: https://postgr.es/m/9a897b86-46e1-9915-ee4c-da02e4ff6a95@2ndquadrant.com Backport: 9.4, where logical decoding was introduced.	2017-04-27 13:13:36 -07:00
Simon Riggs	49e9281549	Rework handling of subtransactions in 2PC recovery The bug fixed by `0874d4f3e1` caused us to question and rework the handling of subtransactions in 2PC during and at end of recovery. Patch adds checks and tests to ensure no further bugs. This effectively removes the temporary measure put in place by `546c13e11b`. Author: Simon Riggs Reviewed-by: Tom Lane, Michael Paquier Discussion: http://postgr.es/m/CANP8+j+vvXmruL_i2buvdhMeVv5TQu0Hm2+C5N+kdVwHJuor8w@mail.gmail.com	2017-04-27 14:41:22 +02:00
Peter Eisentraut	de43897122	Fix various concurrency issues in logical replication worker launching The code was originally written with assumption that launcher is the only process starting the worker. However that hasn't been true since commit `7c4f52409` which failed to modify the worker management code adequately. This patch adds an in_use field to the LogicalRepWorker struct to indicate whether the worker slot is being used and uses proper locking everywhere this flag is set or read. However if the parent process dies while the new worker is starting and the new worker fails to attach to shared memory, this flag would never get cleared. We solve this rare corner case by adding a sort of garbage collector for in_use slots. This uses another field in the LogicalRepWorker struct named launch_time that contains the time when the worker was started. If any request to start a new worker does not find free slot, we'll check for workers that were supposed to start but took too long to actually do so, and reuse their slot. In passing also fix possible race conditions when stopping a worker that hasn't finished starting yet. Author: Petr Jelinek <petr.jelinek@2ndquadrant.com> Reported-by: Fujii Masao <masao.fujii@gmail.com>	2017-04-26 10:45:59 -04:00
Tom Lane	64925603c9	Revert "Use pselect(2) not select(2), if available, to wait in postmaster's loop." This reverts commit `81069a9efc`. Buildfarm results suggest that some platforms have versions of pselect(2) that are not merely non-atomic, but flat out non-functional. Revert the use-pselect patch to confirm this diagnosis (and exclude the no-SA_RESTART patch as the source of trouble). If it's so, we should probably look into blacklisting specific platforms that have broken pselect. Discussion: https://postgr.es/m/9696.1493072081@sss.pgh.pa.us	2017-04-24 18:29:03 -04:00
Tom Lane	81069a9efc	Use pselect(2) not select(2), if available, to wait in postmaster's loop. Traditionally we've unblocked signals, called select(2), and then blocked signals again. The code expects that the select() will be cancelled with EINTR if an interrupt occurs; but there's a race condition, which is that an already-pending signal will be delivered as soon as we unblock, and then when we reach select() there will be nothing preventing it from waiting. This can result in a long delay before we perform any action that ServerLoop was supposed to have taken in response to the signal. As with the somewhat-similar symptoms fixed by commit `893902085`, the main practical problem is slow launching of parallel workers. The window for trouble is usually pretty short, corresponding to one iteration of ServerLoop; but it's not negligible. To fix, use pselect(2) in place of select(2) where available, as that's designed to solve exactly this problem. Where not available, we continue to use the old way, and are no worse off than before. pselect(2) has been required by POSIX since about 2001, so most modern platforms should have it. A bigger portability issue is that some implementations are said to be non-atomic, ie pselect() isn't really any different from unblock/select/reblock. Still, we're no worse off than before on such a platform. There is talk of rewriting the postmaster to use a WaitEventSet and not do signal response work in signal handlers, at which point this could be reverted, since we'd be using a self-pipe to solve the race condition. But that's not happening before v11 at the earliest. Back-patch to 9.6. The problem exists much further back, but the worst symptom arises only in connection with parallel query, so it does not seem worth taking any portability risks in older branches. Discussion: https://postgr.es/m/9205.1492833041@sss.pgh.pa.us	2017-04-24 14:03:14 -04:00
Tom Lane	8939020853	Run the postmaster's signal handlers without SA_RESTART. The postmaster keeps signals blocked everywhere except while waiting for something to happen in ServerLoop(). The code expects that the select(2) will be cancelled with EINTR if an interrupt occurs; without that, followup actions that should be performed by ServerLoop() itself will be delayed. However, some platforms interpret the SA_RESTART signal flag as meaning that they should restart rather than cancel the select(2). Worse yet, some of them restart it with the original timeout delay, meaning that a steady stream of signal interrupts can prevent ServerLoop() from iterating at all if there are no incoming connection requests. Observable symptoms of this, on an affected platform such as HPUX 10, include extremely slow parallel query startup (possibly as much as 30 seconds) and failure to update timestamps on the postmaster's sockets and lockfiles when no new connections arrive for a long time. We can fix this by running the postmaster's signal handlers without SA_RESTART. That would be quite a scary change if the range of code where signals are accepted weren't so tiny, but as it is, it seems safe enough. (Note that postmaster children do, and must, reset all the handlers before unblocking signals; so this change should not affect any child process.) There is talk of rewriting the postmaster to use a WaitEventSet and not do signal response work in signal handlers, at which point it might be appropriate to revert this patch. But that's not happening before v11 at the earliest. Back-patch to 9.6. The problem exists much further back, but the worst symptom arises only in connection with parallel query, so it does not seem worth taking any portability risks in older branches. Discussion: https://postgr.es/m/9205.1492833041@sss.pgh.pa.us	2017-04-24 13:00:30 -04:00
Fujii Masao	cbc2270e3f	Get rid of extern declarations of non-existent functions. Those extern declartions were mistakenly added by commit `7c4f52409`. Author: Petr Jelinek	2017-04-25 01:31:42 +09:00
Andres Freund	b182a4ae2f	Don't include sys/poll.h anymore. poll.h is mandated by Single Unix Spec v2, the usual baseline for postgres on unix. None of the unixoid buildfarms animals has sys/poll.h but not poll.h. Therefore there's not much point to test for sys/poll.h's existence and include it optionally. Author: Andres Freund, per suggestion from Tom Lane Discussion: https://postgr.es/m/20505.1492723662@sss.pgh.pa.us	2017-04-23 16:11:35 -07:00
Heikki Linnakangas	68e61ee72e	Change the on-disk format of SCRAM verifiers to conform to RFC 5803. It doesn't make any immediate difference to PostgreSQL, but might as well follow the standard, since one exists. (I looked at RFC 5803 earlier, but didn't fully understand it back then.) The new format uses Base64 instead of hex to encode StoredKey and ServerKey, which makes the verifiers slightly smaller. Using the same encoding for the salt and the keys also means that you only need one encoder/decoder instead of two. Although we have code in the backend to do both, we are talking about teaching libpq how to create SCRAM verifiers for PQencodePassword(), and libpq doesn't currently have any code for hex encoding. Bump catversion, because this renders any existing SCRAM verifiers in pg_authid invalid. Discussion: https://www.postgresql.org/message-id/351ba574-85ea-d9b8-9689-8c928dd0955d@iki.fi	2017-04-21 22:51:57 +03:00
Fujii Masao	88b0a31926	Mark some columns in pg_subscription as NOT NULL. In pg_subscription, subconninfo, subslotname, subsynccommit and subpublications are expected not to be NULL. Therefore this patch adds BKI_FORCE_NOT_NULL markings to them. This patch is basically unnecessary unless the code has a bug which wrongly sets either of those columns to NULL. But it's good to have this as a safeguard. Author: Masahiko Sawada Reviewed-by: Kyotaro Horiguchi Reported-by: Fujii Masao Discussion: http://postgr.es/m/CAHGQGwFDWh_Qr-q_GEMpD+qH=vYPMdVqw=ZOSY3kX_Pna9R9SA@mail.gmail.com	2017-04-20 23:35:30 +09:00
Tom Lane	39151781c8	Fix testing of parallel-safety of SubPlans. is_parallel_safe() supposed that the only relevant property of a SubPlan was the parallel safety of the referenced subplan tree. This is wrong: the testexpr or args subtrees might contain parallel-unsafe stuff, as demonstrated by the test case added here. However, just recursing into the subtrees fails in a different way: we'll typically find PARAM_EXEC Params representing the subplan's output columns in the testexpr. The previous coding supposed that any Param must be treated as parallel-restricted, so that a naive attempt at fixing this disabled parallel pushdown of SubPlans altogether. We must instead determine, for any visited Param, whether it is one that would be computed by a surrounding SubPlan node; if so, it's safe to push down along with the SubPlan node. We might later be able to extend this logic to cope with Params used for correlated subplans and other cases; but that's a task for v11 or beyond. Tom Lane and Amit Kapila Discussion: https://postgr.es/m/7064.1492022469@sss.pgh.pa.us	2017-04-18 15:43:56 -04:00
Tom Lane	e240a65c7d	Provide an error cursor for "can't call an SRF here" errors. Since it appears that v10 is going to move the goalposts by some amount in terms of where you can and can't invoke set-returning functions, arrange for the executor's "set-valued function called in context that cannot accept a set" errors to include a syntax position if possible, pointing to the specific SRF that can't be called where it's located. The main bit of infrastructure needed for this is to make the query source text accessible in the executor; but it turns out that commit `4c728f382` already did that. We just need a new function executor_errposition() modeled on parser_errposition(), and we're ready to rock. While experimenting with this, I noted that the error position wasn't properly reported if it occurred in a plpgsql FOR-over-query loop, which turned out to be because SPI_cursor_open_internal wasn't providing an error context callback during PortalStart. Fix that. There's a whole lot more that could be done with this infrastructure now that it's there, but this is not the right time in the development cycle for that sort of work. Hence, resist the temptation to plaster executor_errposition() calls everywhere ... for the moment. Discussion: https://postgr.es/m/5263.1492471571@sss.pgh.pa.us	2017-04-18 13:21:08 -04:00
Fujii Masao	280c53ecfb	A collection of small fixes for logical replication. * Be sure to reset the launcher's pid (LogicalRepCtx->launcher_pid) to 0 even when the launcher emits an error. * Declare ApplyLauncherWakeup() as a static function because it's called only in launcher.c. * Previously IsBackendPId() was used to check whether the launcher's pid was valid. IsBackendPid() was necessary because there was the bug where the launcher's pid was not reset to 0. But now it's fixed, so IsBackendPid() is not necessary and this patch removes it. Author: Masahiko Sawada Reviewed-by: Kyotaro Horiguchi Reported-by: Fujii Masao Discussion: http://postgr.es/m/CAHGQGwFDWh_Qr-q_GEMpD+qH=vYPMdVqw=ZOSY3kX_Pna9R9SA@mail.gmail.com	2017-04-19 02:16:34 +09:00
Heikki Linnakangas	c727f120ff	Rename "scram" to "scram-sha-256" in pg_hba.conf and password_encryption. Per discussion, plain "scram" is confusing because we actually implement SCRAM-SHA-256 rather than the original SCRAM that uses SHA-1 as the hash algorithm. If we add support for SCRAM-SHA-512 or some other mechanism in the SCRAM family in the future, that would become even more confusing. Most of the internal files and functions still use just "scram" as a shorthand for SCRMA-SHA-256, but I did change PASSWORD_TYPE_SCRAM to PASSWORD_TYPE_SCRAM_SHA_256, as that could potentially be used by 3rd party extensions that hook into the password-check hook. Michael Paquier did this in an earlier version of the SCRAM patch set already, but I didn't include that in the version that was committed. Discussion: https://www.postgresql.org/message-id/fde71ff1-5858-90c8-99a9-1c2427e7bafb@iki.fi	2017-04-18 14:50:50 +03:00
Alvaro Herrera	ee6922112e	Rename columns in new pg_statistic_ext catalog The new catalog reused a column prefix "sta" from pg_statistic, but this is undesirable, so change the catalog to use prefix "stx" instead. Also, rename the column that lists enabled statistic kinds as "stxkind" rather than "enabled". Discussion: https://postgr.es/m/CAKJS1f_2t5jhSN7huYRFH3w3rrHfG2QU7hiUHsu-Vdjd1rYT3w@mail.gmail.com	2017-04-17 18:34:29 -03:00
Tom Lane	bfba563bc5	Fix erroneous cross-reference in comment. Seems to have been introduced in commit `c219d9b0a`. I think there indeed was a "tupbasics.h" in some early drafts of that refactoring, but it didn't survive into the committed version. Amit Kapila	2017-04-15 14:22:26 -04:00
Tom Lane	32470825d3	Avoid passing function pointers across process boundaries. We'd already recognized that we can't pass function pointers across process boundaries for functions in loadable modules, since a shared library could get loaded at different addresses in different processes. But actually the practice doesn't work for functions in the core backend either, if we're using EXEC_BACKEND. This is the cause of recent failures on buildfarm member culicidae. Switch to passing a string function name in all cases. Something like this needs to be back-patched into 9.6, but let's see if the buildfarm likes it first. Petr Jelinek, with a bunch of basically-cosmetic adjustments by me Discussion: https://postgr.es/m/548f9c1d-eafa-e3fa-9da8-f0cc2f654e60@2ndquadrant.com	2017-04-14 23:50:16 -04:00
Tom Lane	2040bb4a0b	Clean up manipulations of hash indexes' hasho_flag field. Standardize on testing a hash index page's type by doing (opaque->hasho_flag & LH_PAGE_TYPE) == LH_xxx_PAGE Various places were taking shortcuts like opaque->hasho_flag & LH_BUCKET_PAGE which while not actually wrong, is still bad practice because it encourages use of opaque->hasho_flag & LH_UNUSED_PAGE which is wrong (LH_UNUSED_PAGE == 0, so the above is constant false). hash_xlog.c's hash_mask() contained such an incorrect test. This also ensures that we mask out the additional flag bits that hasho_flag has accreted since 9.6. pgstattuple's pgstat_hash_page(), for one, was failing to do that and was thus actively broken. Also fix assorted comments that hadn't been updated to reflect the extended usage of hasho_flag, and fix some macros that were testing just "(hasho_flag & bit)" to use the less dangerous, project-approved form "((hasho_flag & bit) != 0)". Coverity found the bug in hash_mask(); I noted the one in pgstat_hash_page() through code reading.	2017-04-14 17:04:25 -04:00
Peter Eisentraut	67c2def11d	Catversion bump for commit `887227a1cc`	2017-04-14 14:24:01 -04:00
Peter Eisentraut	887227a1cc	Add option to modify sync commit per subscription This also changes default behaviour of subscription workers to synchronous_commit = off. Author: Petr Jelinek <petr.jelinek@2ndquadrant.com>	2017-04-14 13:58:46 -04:00
Peter Eisentraut	d04eac1148	Make header self-contained Add necessary include files for things used in the header. (signal.h needed for sig_atomic_t.)	2017-04-13 21:47:24 -04:00
Heikki Linnakangas	4f3b87ab78	Improve the SASL authentication protocol. This contains some protocol changes to SASL authentiation (which is new in v10): * For future-proofing, in the AuthenticationSASL message that begins SASL authentication, provide a list of SASL mechanisms that the server supports, for the client to choose from. Currently, it's always just SCRAM-SHA-256. * Add a separate authentication message type for the final server->client SASL message, which the client doesn't need to respond to. This makes it unambiguous whether the client is supposed to send a response or not. The SASL mechanism should know that anyway, but better to be explicit. Also, in the server, support clients that don't send an Initial Client response in the first SASLInitialResponse message. The server is supposed to first send an empty request in that case, to which the client will respond with the data that usually comes in the Initial Client Response. libpq uses the Initial Client Response field and doesn't need this, and I would assume any other sensible implementation to use Initial Client Response, too, but let's follow the SASL spec. Improve the documentation on SASL authentication in protocol. Add a section describing the SASL message flow, and some details on our SCRAM-SHA-256 implementation. Document the different kinds of PasswordMessages that the frontend sends in different phases of SASL authentication, as well as GSS/SSPI authentication as separate message formats. Even though they're all 'p' messages, and the exact format depends on the context, describing them as separate message formats makes the documentation more clear. Reviewed by Michael Paquier and Álvaro Hernández Tortosa. Discussion: https://www.postgresql.org/message-id/CAB7nPqS-aFg0iM3AQOJwKDv_0WkAedRjs1W2X8EixSz+sKBXCQ@mail.gmail.com	2017-04-13 19:34:16 +03:00
Alvaro Herrera	27bcc372b1	Catversion bump forgotten in previous commit	2017-04-13 11:54:28 -03:00
Tom Lane	16ebab6886	Avoid transferring parallel-unsafe subplans to parallel workers. Commit `5e6d8d2bb` allowed parallel workers to execute parallel-safe subplans, but it transmitted the query's entire list of subplans to the worker(s). Since execMain.c blindly does ExecInitNode and later ExecEndNode on every list element, this resulted in parallel-unsafe plan nodes nonetheless getting started up and shut down in parallel workers. That seems mostly harmless as far as core plan node types go (but maybe not so much for Gather?). But it resulted in postgres_fdw opening and then closing extra remote connections, and it's likely that other non-parallel-safe FDWs or custom scan providers would have worse reactions. To fix, just make ExecSerializePlan replace parallel-unsafe subplans with NULLs in the cut-down plan tree that it transmits to workers. This relies on ExecInitNode and ExecEndNode to do nothing on NULL input, but they do anyway. If anything else is touching the dropped subplans in a parallel worker, that would be a bug to be fixed. (This thus provides a strong guarantee that we won't try to do something with a parallel-unsafe subplan in a worker.) This is, I think, the last fix directly occasioned by Andreas Seltenreich's bug report of a few days ago. Tom Lane and Amit Kapila Discussion: https://postgr.es/m/87tw5x4vcu.fsf@credativ.de	2017-04-12 16:07:00 -04:00
Tom Lane	003d80f3df	Mark finished Plan nodes with parallel_safe flags. We'd managed to avoid doing this so far, but it seems pretty obvious that it would be forced on us some day, and this is much the cleanest way of approaching the open problem that parallel-unsafe subplans are being transmitted to parallel workers. Anyway there's no space cost due to alignment considerations, and the time cost is pretty minimal since we're just copying the flag from the corresponding Path node. (At least in most cases ... some of the klugier spots in createplan.c have to work a bit harder.) In principle we could perhaps get rid of SubPlan.parallel_safe, but I thought it better to keep that in case there are reasons to consider a SubPlan unsafe even when its child plan is parallel-safe. This patch doesn't actually do anything with the new flags, but I thought I'd commit it separately anyway. Note: although this touches outfuncs/readfuncs, there's no need for a catversion bump because Plan trees aren't stored on disk. Discussion: https://postgr.es/m/87tw5x4vcu.fsf@credativ.de	2017-04-12 15:13:34 -04:00
Robert Haas	6599c9ac33	Add an Assert() to max_parallel_workers enforcement. To prevent future bugs along the lines of the one corrected by commit `8ff518699f`, or find any that remain in the current code, add an Assert() that the difference between parallel_register_count and parallel_terminate_count is in a sane range. Kuntal Ghosh, with considerable tidying-up by me, per a suggestion from Neha Khatri. Reviewed by Tomas Vondra. Discussion: http://postgr.es/m/CAFO0U+-E8yzchwVnvn5BeRDPgX2z9vZUxQ8dxx9c0XFGBC7N1Q@mail.gmail.com	2017-04-11 13:03:44 -04:00
Fujii Masao	ff7bce1743	Add max_sync_workers_per_subscription to postgresql.conf.sample. This commit also does - add REPLICATION_SUBSCRIBERS into config_group - mark max_logical_replication_workers and max_sync_workers_per_subscription as REPLICATION_SUBSCRIBERS parameters - move those parameters into "Subscribers" section in postgresql.conf.sample Author: Masahiko Sawada, Petr Jelinek and me Reported-by: Masahiko Sawada Discussion: http://postgr.es/m/CAD21AoAonSCoa=v=87ZO3vhfUZA1k_E2XRNHTt=xioWGUa+0ug@mail.gmail.com	2017-04-12 00:10:54 +09:00
Magnus Hagander	a4777f3556	Remove symbol WIN32_ONLY_COMPILER This used to mean "Visual C++ except in those parts where Borland C++ was supported where it meant one of those". Now that we don't support Borland C++ anymore, simplify by using _MSC_VER which is the normal way to detect Visual C++.	2017-04-11 15:22:21 +02:00
Magnus Hagander	6da56f3f84	Remove support for bcc and msvc standalone libpq builds This removes the support for building just libpq using Borland C++ or Visual C++. This has not worked properly for years, and given the number of complaints it's clearly not worth the maintenance burden. Building libpq using the standard MSVC build system is of course still supported, along with mingw.	2017-04-11 15:22:21 +02:00
Tom Lane	8f0530f580	Improve castNode notation by introducing list-extraction-specific variants. This extends the castNode() notation introduced by commit `5bcab1114` to provide, in one step, extraction of a list cell's pointer and coercion to a concrete node type. For example, "lfirst_node(Foo, lc)" is the same as "castNode(Foo, lfirst(lc))". Almost half of the uses of castNode that have appeared so far include a list extraction call, so this is pretty widely useful, and it saves a few more keystrokes compared to the old way. As with the previous patch, back-patch the addition of these macros to pg_list.h, so that the notation will be available when back-patching. Patch by me, after an idea of Andrew Gierth's. Discussion: https://postgr.es/m/14197.1491841216@sss.pgh.pa.us	2017-04-10 13:51:53 -04:00
Peter Eisentraut	26ad194cb0	Support configuration reload in logical replication workers Author: Michael Paquier <michael.paquier@gmail.com> Reviewed-by: Petr Jelinek <petr.jelinek@2ndquadrant.com> Reported-by: Fujii Masao <masao.fujii@gmail.com>	2017-04-10 13:42:21 -04:00
Robert Haas	c0a8ae7be3	Fix reporting of violations in ExecConstraints, again. We decided in `f1b4c771ea` to pass the original slot to ExecConstraints(), but that breaks when there are BEFORE ROW triggers involved. So we need to do reverse-map the tuples back to the original descriptor instead, as Amit originally proposed. Amit Langote, reviewed by Ashutosh Bapat. One overlooked comment fixed by me. Discussion: http://postgr.es/m/b3a17254-6849-e542-2353-bde4e880b6a4@lab.ntt.co.jp	2017-04-10 12:20:08 -04:00
Tom Lane	511540dadf	Move isolationtester's is-blocked query into C code for speed. Commit `4deb41381` modified isolationtester's query to see whether a session is blocked to also check for waits occurring in GetSafeSnapshot. However, it did that in a way that enormously increased the query's runtime under CLOBBER_CACHE_ALWAYS, causing the buildfarm members that use that to run about four times slower than before, and in some cases fail entirely. To fix, push the entire logic into a dedicated backend function. This should actually reduce the CLOBBER_CACHE_ALWAYS runtime from what it was previously, though I've not checked that. In passing, expose a SQL function to check for safe-snapshot blockage, comparable to pg_blocking_pids. This is more or less free given the infrastructure built to solve the other problem, so we might as well. Thomas Munro Discussion: https://postgr.es/m/20170407165749.pstcakbc637opkax@alap3.anarazel.de	2017-04-10 10:26:54 -04:00
Kevin Grittner	c63172d60f	Add GUCs for predicate lock promotion thresholds. Defaults match the fixed behavior of prior releases, but now DBAs have better options to tune serializable workloads. It might be nice to be able to set this per relation, but that part will need to wait for another release. Author: Dagfinn Ilmari Mannsåker	2017-04-07 21:38:05 -05:00
Tom Lane	9c7f5229ad	Optimize joins when the inner relation can be proven unique. If there can certainly be no more than one matching inner row for a given outer row, then the executor can move on to the next outer row as soon as it's found one match; there's no need to continue scanning the inner relation for this outer row. This saves useless scanning in nestloop and hash joins. In merge joins, it offers the opportunity to skip mark/restore processing, because we know we have not advanced past the first possible match for the next outer row. Of course, the devil is in the details: the proof of uniqueness must depend only on joinquals (not otherquals), and if we want to skip mergejoin mark/restore then it must depend only on merge clauses. To avoid adding more planning overhead than absolutely necessary, the present patch errs in the conservative direction: there are cases where inner_unique or skip_mark_restore processing could be used, but it will not do so because it's not sure that the uniqueness proof depended only on "safe" clauses. This could be improved later. David Rowley, reviewed and rather heavily editorialized on by me Discussion: https://postgr.es/m/CAApHDvqF6Sw-TK98bW48TdtFJ+3a7D2mFyZ7++=D-RyPsL76gw@mail.gmail.com	2017-04-07 22:20:13 -04:00
Andres Freund	f13a9121f9	Fix issues in `e8fdbd58fe`. When the 64bit atomics simulation is in use, we can't necessarily guarantee the correct alignment of the atomics due to lack of compiler support for doing so- that's fine from a safety perspective, because everything is protected by a lock, but we asserted the alignment in all cases. Weaken them. Per complaint from Alvaro Herrera. My #ifdefery for PG_HAVE_8BYTE_SINGLE_COPY_ATOMICITY wasn't sufficient. Fix that. Per complaint from Alexander Korotkov.	2017-04-07 17:09:03 -07:00
Alvaro Herrera	8bf74967da	Reduce the number of pallocs() in BRIN Instead of allocating memory in brin_deform_tuple and brin_copy_tuple over and over during a scan, allow reuse of previously allocated memory. This is said to make for a measurable performance improvement. Author: Jinyu Zhang, Álvaro Herrera Reviewed by: Tomas Vondra Discussion: https://postgr.es/m/495deb78.4186.1500dacaa63.Coremail.beijing_pg@163.com	2017-04-07 19:08:43 -03:00
Andres Freund	e8fdbd58fe	Improve 64bit atomics support. When adding atomics back in `b64d92f1a`, I added 64bit support as optional; there wasn't yet a direct user in sight. That turned out to be a bit short-sighted, it'd already have been useful a number of times. Add a fallback implementation of 64bit atomics, just like the one we have for 32bit atomics. Additionally optimize reads/writes to 64bit on a number of platforms where aligned writes of that size are atomic. This can now be tested with PG_HAVE_8BYTE_SINGLE_COPY_ATOMICITY. Author: Andres Freund Reviewed-By: Amit Kapila Discussion: https://postgr.es/m/20160330230914.GH13305@awork2.anarazel.de	2017-04-07 14:48:11 -07:00
Peter Eisentraut	0cb2e51992	Avoid using a C++ keyword in header file per cpluspluscheck	2017-04-07 16:32:02 -04:00
Alvaro Herrera	817cb10013	Fix new BRIN desummarize WAL record The WAL-writing piece was forgetting to set the pages-per-range value. Also, fix the declared type of struct member heapBlk, which I mistakenly set as OffsetNumber rather than BlockNumber. Problem was introduced by commit `c655899ba9` (April 1st). Any system that tries to replay the new WAL record written before this fix is likely to die on replay and require pg_resetwal. Reported by Tom Lane. Discussion: https://postgr.es/m/20191.1491524824@sss.pgh.pa.us	2017-04-07 17:11:56 -03:00
Robert Haas	d4116a7719	Add ProcArrayGroupUpdate wait event. Discussion: http://postgr.es/m/CA+TgmobgWHcXDcChX2+BqJDk2dkPVF85ZrJFhUyHHQmw8diTpA@mail.gmail.com	2017-04-07 13:41:47 -04:00
Heikki Linnakangas	60f11b87a2	Use SASLprep to normalize passwords for SCRAM authentication. An important step of SASLprep normalization, is to convert the string to Unicode normalization form NFKC. Unicode normalization requires a fairly large table of character decompositions, which is generated from data published by the Unicode consortium. The script to generate the table is put in src/common/unicode, as well test code for the normalization. A pre-generated version of the tables is included in src/include/common, so you don't need the code in src/common/unicode to build PostgreSQL, only if you wish to modify the normalization tables. The SASLprep implementation depends on the UTF-8 functions from src/backend/utils/mb/wchar.c. So to use it, you must also compile and link that. That doesn't change anything for the current users of these functions, the backend and libpq, as they both already link with wchar.o. It would be good to move those functions into a separate file in src/commmon, but I'll leave that for another day. No documentation changes included, because there is no details on the SCRAM mechanism in the docs anyway. An overview on that in the protocol specification would probably be good, even though SCRAM is documented in detail in RFC5802. I'll write that as a separate patch. An important thing to mention there is that we apply SASLprep even on invalid UTF-8 strings, to support other encodings. Patch by Michael Paquier and me. Discussion: https://www.postgresql.org/message-id/CAB7nPqSByyEmAVLtEf1KxTRh=PWNKiWKEKQR=e1yGehz=wbymQ@mail.gmail.com	2017-04-07 14:56:05 +03:00
Simon Riggs	ac2b095088	Reset API of clause_selectivity() Discussion: https://postgr.es/m/CAKJS1f9yurJQW9pdnzL+rmOtsp2vOytkpXKGnMFJEO-qz5O5eA@mail.gmail.com	2017-04-06 19:10:51 -04:00
Andres Freund	fa117ee403	Allow avoiding tuple copy within tuplesort_gettupleslot(). Add a "copy" argument to make it optional to receive a copy of caller tuple that is safe to use following a subsequent manipulating of tuplesort's state. This is a performance optimization. Most existing tuplesort_gettupleslot() callers are made to opt out of copying. Existing callers that happen to rely on the validity of tuple memory beyond subsequent manipulations of the tuplesort request their own copy. This brings tuplesort_gettupleslot() in line with tuplestore_gettupleslot(). In the future, a "copy" tuplesort_getdatum() argument may be added, that similarly allows callers to opt out of receiving their own copy of tuple. In passing, clarify assumptions that callers of other tuplesort fetch routines may make about tuple memory validity, per gripe from Tom Lane. Author: Peter Geoghegan Discussion: CAM3SWZQWZZ_N=DmmL7tKy_OUjGH_5mN=N=A6h7kHyyDvEhg2DA@mail.gmail.com	2017-04-06 14:48:59 -07:00
Alvaro Herrera	7e534adcdc	Fix BRIN cost estimation The original code was overly optimistic about the cost of scanning a BRIN index, leading to BRIN indexes being selected when they'd be a worse choice than some other index. This complete rewrite should be more accurate. Author: David Rowley, based on an earlier patch by Emre Hasegeli Reviewed-by: Emre Hasegeli Discussion: https://postgr.es/m/CAKJS1f9n-Wapop5Xz1dtGdpdqmzeGqQK4sV2MK-zZugfC14Xtw@mail.gmail.com	2017-04-06 17:51:53 -03:00
Peter Eisentraut	5f21f5292c	Mark immutable functions in information schema as parallel safe Also add opr_sanity check that all preloaded immutable functions are parallel safe. (Per discussion, this does not necessarily have to be true for all possible such functions, but deviations would be unlikely enough that maintaining such a test is reasonable.) Reported-by: David Rowley <david.rowley@2ndquadrant.com> Reviewed-by: Robert Haas <robertmhaas@gmail.com> Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>	2017-04-06 14:30:13 -04:00
Peter Eisentraut	e6c9a5a9bc	Fix mixup of bool and ternary value Not currently a problem, but could be with stricter bool behavior under stdbool or C++. Reviewed-by: Andres Freund <andres@anarazel.de>	2017-04-06 13:09:42 -04:00
Alvaro Herrera	b1fc51a36e	Comment fixes for extended statistics Clean up some code comments in new extended statistics code, from `7b504eb282`.	2017-04-06 12:28:50 -03:00
Heikki Linnakangas	07044efe00	Remove bogus SCRAM_ITERATION_LEN constant. It was not used for what the comment claimed, at all. It was actually used as the 'base' argument to strtol(), when reading the iteration count. We don't need a constant for base-10, so remove it.	2017-04-06 17:41:48 +03:00
Simon Riggs	cd0cebaf7d	Always SnapshotResetXmin() during ClearTransaction() Avoid corner cases during 2PC with `6bad580d9e`	2017-04-06 10:30:22 -04:00
Peter Eisentraut	3217327053	Identity columns This is the SQL standard-conforming variant of PostgreSQL's serial columns. It fixes a few usability issues that serial columns have: - CREATE TABLE / LIKE copies default but refers to same sequence - cannot add/drop serialness with ALTER TABLE - dropping default does not drop sequence - need to grant separate privileges to sequence - other slight weirdnesses because serial is some kind of special macro Reviewed-by: Vitaly Burovoy <vitaly.burovoy@gmail.com>	2017-04-06 08:41:37 -04:00
Simon Riggs	6bad580d9e	Avoid SnapshotResetXmin() during AtEOXact_Snapshot() For normal commits and aborts we already reset PgXact->xmin, so we can simply avoid running SnapshotResetXmin() twice. During performance tests by Alexander Korotkov, diagnosis by Andres Freund showed PgXact array as a bottleneck. After manual analysis by me of the code paths that touch those memory locations, I was able to identify extraneous code in the main transaction commit path. Avoiding touching highly contented shmem improves concurrent performance slightly on all workloads, confirmed by tests run by Ashutosh Sharma and Alexander Korotkov. Simon Riggs Discussion: CANP8+jJdXE9b+b9F8CQT-LuxxO0PBCB-SZFfMVAdp+akqo4zfg@mail.gmail.com	2017-04-06 08:31:52 -04:00
Heikki Linnakangas	fd01983594	Remove dead code and fix comments in fast-path function handling. HandleFunctionRequest() is no longer responsible for reading the protocol message from the client, since commit `2b3a8b20c2`. Fix the outdated comments. HandleFunctionRequest() now always returns 0, because the code that used to return EOF was moved in `2b3a8b20c2`. Therefore, the caller no longer needs to check the return value. Reported by Andres Freund. Backpatch to all supported versions, even though this doesn't have any user-visible effect, to make backporting future patches in this area easier. Discussion: https://www.postgresql.org/message-id/20170405010525.rt5azbya5fkbhvrx@alap3.anarazel.de	2017-04-06 09:09:39 +03:00
Tom Lane	df1a699e5b	Fix integer-overflow problems in interval comparison. When using integer timestamps, the interval-comparison functions tried to compute the overall magnitude of an interval as an int64 number of microseconds. As reported by Frazer McLean, this overflows for intervals exceeding about 296000 years, which is bad since we nominally allow intervals many times larger than that. That results in wrong comparison results, and possibly in corrupted btree indexes for columns containing such large interval values. To fix, compute the magnitude as int128 instead. Although some compilers have native support for int128 calculations, many don't, so create our own support functions that can do 128-bit addition and multiplication if the compiler support isn't there. These support functions are designed with an eye to allowing the int128 code paths in numeric.c to be rewritten for use on all platforms, although this patch doesn't do that, or even provide all the int128 primitives that will be needed for it. Back-patch as far as 9.4. Earlier releases did not guard against overflow of interval values at all (commit `146604ec4` fixed that), so it seems not very exciting to worry about overly-large intervals for them. Before 9.6, we did not assume that unreferenced "static inline" functions would not draw compiler warnings, so omit functions not directly referenced by timestamp.c, the only present consumer of int128.h. (We could have omitted these functions in HEAD too, but since they were written and debugged on the way to the present patch, and they look likely to be needed by numeric.c, let's keep them in HEAD.) I did not bother to try to prevent such warnings in a --disable-integer-datetimes build, though. Before 9.5, configure will never define HAVE_INT128, so the part of int128.h that exploits a native int128 implementation is dead code in the 9.4 branch. I didn't bother to remove it, thinking that keeping the file looking similar in different branches is more useful. In HEAD only, add a simple test harness for int128.h in src/tools/. In back branches, this does not change the float-timestamps code path. That's not subject to the same kind of overflow risk, since it computes the interval magnitude as float8. (No doubt, when this code was originally written, overflow was disregarded for exactly that reason.) There is a precision hazard instead :-(, but we'll avert our eyes from that question, since no complaints have been reported and that code's deprecated anyway. Kyotaro Horiguchi and Tom Lane Discussion: https://postgr.es/m/1490104629.422698.918452336.26FA96B7@webmail.messagingengine.com	2017-04-05 23:51:27 -04:00
Simon Riggs	2686ee1b7c	Collect and use multi-column dependency stats Follow on patch in the multi-variate statistics patch series. CREATE STATISTICS s1 WITH (dependencies) ON (a, b) FROM t; ANALYZE; will collect dependency stats on (a, b) and then use the measured dependency in subsequent query planning. Commit `7b504eb282` added CREATE STATISTICS with n-distinct coefficients. These are now specified using the mutually exclusive option WITH (ndistinct). Author: Tomas Vondra, David Rowley Reviewed-by: Kyotaro HORIGUCHI, Álvaro Herrera, Dean Rasheed, Robert Haas and many other comments and contributions Discussion: https://postgr.es/m/56f40b20-c464-fad2-ff39-06b668fac47c@2ndquadrant.com	2017-04-05 18:00:42 -04:00
Peter Eisentraut	afd79873a0	Capitalize names of PLs consistently Author: Daniel Gustafsson <daniel@yesql.se>	2017-04-05 00:38:25 -04:00
Peter Eisentraut	193f5f9e91	pageinspect: Add bt_page_items function with bytea argument Author: Tomas Vondra <tomas.vondra@2ndquadrant.com> Reviewed-by: Ashutosh Sharma <ashu.coek88@gmail.com>	2017-04-04 23:52:55 -04:00
Kevin Grittner	5ebeb579b9	Follow-on cleanup for the transition table patch. Commit `59702716` added transition table support to PL/pgsql so that SQL queries in trigger functions could access those transient tables. In order to provide the same level of support for PL/perl, PL/python and PL/tcl, refactor the relevant code into a new function SPI_register_trigger_data. Call the new function in the trigger handler of all four PLs, and document it as a public SPI function so that authors of out-of-tree PLs can do the same. Also get rid of a second QueryEnvironment object that was maintained by PL/pgsql. That was previously used to deal with cursors, but the same approach wasn't appropriate for PLs that are less tangled up with core code. Instead, have SPI_cursor_open install the connection's current QueryEnvironment, as already happens for SPI_execute_plan. While in the docs, remove the note that transition tables were only supported in C and PL/pgSQL triggers, and correct some ommissions. Thomas Munro with some work by Kevin Grittner (mostly docs)	2017-04-04 18:36:39 -05:00
Simon Riggs	9a3215026b	Make min_wal_size/max_wal_size use MB internally Previously they were defined using multiples of XLogSegSize. Remove GUC_UNIT_XSEGS. Introduce GUC_UNIT_MB Extracted from patch series on XLogSegSize infrastructure. Beena Emerson	2017-04-04 18:00:01 -04:00
Simon Riggs	728bd991c3	Speedup 2PC recovery by skipping two phase state files in normal path 2PC state info held in shmem at PREPARE, then cleaned at COMMIT PREPARED/ABORT PREPARED, avoiding writing/fsyncing any state information to disk in the normal path, greatly enhancing replay speed. Prepared transactions that live past one checkpoint redo horizon will be written to disk as now. Similar conceptually to `978b2f65aa` and building upon the infrastructure created by that commit. Authors, in equal measure: Stas Kelvich, Nikhil Sontakke and Michael Paquier Discussion: https://postgr.es/m/CAMGcDxf8Bn9ZPBBJZba9wiyQq-Qk5uqq=VjoMnRnW5s+fKST3w@mail.gmail.com	2017-04-04 15:56:56 -04:00
Robert Haas	ea69a0dead	Expand hash indexes more gradually. Since hash indexes typically have very few overflow pages, adding a new splitpoint essentially doubles the on-disk size of the index, which can lead to large and abrupt increases in disk usage (and perhaps long delays on occasion). To mitigate this problem to some degree, divide larger splitpoints into four equal phases. This means that, for example, instead of growing from 4GB to 8GB all at once, a hash index will now grow from 4GB to 5GB to 6GB to 7GB to 8GB, which is perhaps still not as smooth as we'd like but certainly an improvement. This changes the on-disk format of the metapage, so bump HASH_VERSION from 2 to 3. This will force a REINDEX of all existing hash indexes, but that's probably a good idea anyway. First, hash indexes from pre-10 versions of PostgreSQL could easily be corrupted, and we don't want to confuse corruption carried over from an older release with any corruption caused despite the new write-ahead logging in v10. Second, it will let us remove some backward-compatibility code added by commit `293e24e507`. Mithun Cy, reviewed by Amit Kapila, Jesper Pedersen and me. Regression test outputs updated by me. Discussion: http://postgr.es/m/CAD__OuhG6F1gQLCgMQNnMNgoCvOLQZz9zKYJQNYvYmmJoM42gA@mail.gmail.com Discussion: http://postgr.es/m/CA+TgmoYty0jCf-pa+m+vYUJ716+AxM7nv_syvyanyf5O-L_i2A@mail.gmail.com	2017-04-03 23:46:33 -04:00
Robert Haas	7a39b5e4d1	Abstract logic to allow for multiple kinds of child rels. Currently, the only type of child relation is an "other member rel", which is the child of a baserel, but in the future joins and even upper relations may have child rels. To facilitate that, introduce macros that test to test for particular RelOptKind values, and use them in various places where they help to clarify the sense of a test. (For example, a test may allow RELOPT_OTHER_MEMBER_REL either because it intends to allow child rels, or because it intends to allow simple rels.) Also, remove find_childrel_top_parent, which will not work for a child rel that is not a baserel. Instead, add a new RelOptInfo member top_parent_relids to track the same kind of information in a more generic manner. Ashutosh Bapat, slightly tweaked by me. Review and testing of the patch set from which this was taken by Rajkumar Raghuwanshi and Rafia Sabih. Discussion: http://postgr.es/m/CA+TgmoagTnF2yqR3PT2rv=om=wJiZ4-A+ATwdnriTGku1CLYxA@mail.gmail.com	2017-04-03 22:41:31 -04:00
Peter Eisentraut	9fa6e08d4a	Make header self-contained Add necessary include files for things used in the header.	2017-04-03 16:17:45 -04:00
Tom Lane	f833c847b8	Allow psql variable substitution to occur in backtick command strings. Previously, text between backquotes in a psql metacommand's arguments was always passed to the shell literally. That considerably hobbles the usefulness of the feature for scripting, so we'd foreseen for a long time that we'd someday want to allow substitution of psql variables into the shell command. IMO the addition of \if metacommands has brought us to that point, since \if can greatly benefit from some sort of client-side expression evaluation capability, and psql itself is not going to grow any such thing in time for v10. Hence, this patch. It allows :VARIABLE to be replaced by the exact contents of the named variable, while :'VARIABLE' is replaced by the variable's contents suitably quoted to become a single shell-command argument. (The quoting rules for that are different from those for SQL literals, so this is a bit of an abuse of the :'VARIABLE' notation, but I doubt anyone will be confused.) As with other situations in psql, no substitution occurs if the word following a colon is not a known variable name. That limits the risk of compatibility problems for existing psql scripts; but the risk isn't zero, so this needs to be called out in the v10 release notes. Discussion: https://postgr.es/m/9561.1490895211@sss.pgh.pa.us	2017-04-01 21:44:54 -04:00
Kevin Grittner	41bd155dd6	Fix two undocumented parameters to functions from ENR patch. On ProcessUtility document the parameter, to match others. On CreateCachedPlan drop the queryEnv parameter. It was not referenced within the function, and had been added on the assumption that with some unknown future usage of QueryEnvironment it might be useful to do something there. We have avoided other "just in case" implementation of unused paramters, so drop it here. Per gripe from Tom Lane	2017-04-01 15:21:05 -05:00
Alvaro Herrera	c655899ba9	BRIN de-summarization When the BRIN summary tuple for a page range becomes too "wide" for the values actually stored in the table (because the tuples that were present originally are no longer present due to updates or deletes), it can be useful to remove the outdated summary tuple, so that a future summarization can install a tighter summary. This commit introduces a SQL-callable interface to do so. Author: Álvaro Herrera Reviewed-by: Eiji Seki Discussion: https://postgr.es/m/20170228045643.n2ri74ara4fhhfxf@alvherre.pgsql	2017-04-01 16:10:04 -03:00
Alvaro Herrera	7526e10224	BRIN auto-summarization Previously, only VACUUM would cause a page range to get initially summarized by BRIN indexes, which for some use cases takes too much time since the inserts occur. To avoid the delay, have brininsert request a summarization run for the previous range as soon as the first tuple is inserted into the first page of the next range. Autovacuum is in charge of processing these requests, after doing all the regular vacuuming/ analyzing work on tables. This doesn't impose any new tasks on autovacuum, because autovacuum was already in charge of doing summarizations. The only actual effect is to change the timing, i.e. that it occurs earlier. For this reason, we don't go any great lengths to record these requests very robustly; if they are lost because of a server crash or restart, they will happen at a later time anyway. Most of the new code here is in autovacuum, which can now be told about "work items" to process. This can be used for other things such as GIN pending list cleaning, perhaps visibility map bit setting, both of which are currently invoked during vacuum, but do not really depend on vacuum taking place. The requests are at the page range level, a granularity for which we did not have SQL-level access; we only had index-level summarization requests via brin_summarize_new_values(). It seems reasonable to add SQL-level access to range-level summarization too, so add a function brin_summarize_range() to do that. Authors: Álvaro Herrera, based on sketch from Simon Riggs. Reviewed-by: Thomas Munro. Discussion: https://postgr.es/m/20170301045823.vneqdqkmsd4as4ds@alvherre.pgsql	2017-04-01 14:00:53 -03:00
Kevin Grittner	18ce3a4ab2	Add infrastructure to support EphemeralNamedRelation references. A QueryEnvironment concept is added, which allows new types of objects to be passed into queries from parsing on through execution. At this point, the only thing implemented is a collection of EphemeralNamedRelation objects -- relations which can be referenced by name in queries, but do not exist in the catalogs. The only type of ENR implemented is NamedTuplestore, but provision is made to add more types fairly easily. An ENR can carry its own TupleDesc or reference a relation in the catalogs by relid. Although these features can be used without SPI, convenience functions are added to SPI so that ENRs can easily be used by code run through SPI. The initial use of all this is going to be transition tables in AFTER triggers, but that will be added to each PL as a separate commit. An incidental effect of this patch is to produce a more informative error message if an attempt is made to modify the contents of a CTE from a referencing DML statement. No tests previously covered that possibility, so one is added. Kevin Grittner and Thomas Munro Reviewed by Heikki Linnakangas, David Fetter, and Thomas Munro with valuable comments and suggestions from many others	2017-03-31 23:17:18 -05:00
Robert Haas	7d8f6986b8	Fix parallel query so it doesn't spoil row estimates above Gather. Commit `45be99f8cd` removed GatherPath's num_workers field, but this is entirely bogus. Normally, a path's parallel_workers flag is supposed to indicate the number of workers that it wants, and should be 0 for a non-partial path. In that commit, I mistakenly thought that GatherPath could also use that field to indicate the number of workers that it would try to start, but that's disastrous, because then it can propagate up to higher nodes in the plan tree, which will then get incorrect rowcounts because the parallel_workers flag is involved in computing those values. Repair by putting the separate field back. Report by Tomas Vondra. Patch by me, reviewed by Amit Kapila. Discussion: http://postgr.es/m/f91b4a44-f739-04bd-c4b6-f135bd643669@2ndquadrant.com	2017-03-31 21:01:20 -04:00
Robert Haas	2113ac4cbb	Don't use bgw_main even to specify in-core bgworker entrypoints. On EXEC_BACKEND builds, this can fail if ASLR is in use. Backpatch to 9.5. On master, completely remove the bgw_main field completely, since there is no situation in which it is safe for an EXEC_BACKEND build. On 9.6 and 9.5, leave the field intact to avoid breaking things for third-party code that doesn't care about working under EXEC_BACKEND. Prior to 9.5, there are no in-core bgworker entrypoints. Petr Jelinek, reviewed by me. Discussion: http://postgr.es/m/09d8ad33-4287-a09b-a77f-77f8761adb5e@2ndquadrant.com	2017-03-31 20:43:32 -04:00
Robert Haas	c94e6942ce	Don't allocate storage for partitioned tables. Also, don't allow setting reloptions on them, since that would have no effect given the lack of storage. The patch does this by introducing a new reloption kind for which there are currently no reloptions -- we might have some in the future -- so it adjusts parseRelOptions to handle that case correctly. Bumped catversion. System catalogs that contained reloptions for partitioned tables are no longer valid; plus, there are now fewer physical files on disk, which is not technically a catalog change but still a good reason to re-initdb. Amit Langote, reviewed by Maksim Milyutin and Kyotaro Horiguchi and revised a bit by me. Discussion: http://postgr.es/m/20170331.173326.212311140.horiguchi.kyotaro@lab.ntt.co.jp	2017-03-31 16:28:51 -04:00
Andrew Dunstan	e306df7f9c	Full Text Search support for json and jsonb The new functions are ts_headline() and to_tsvector. Dmitry Dolgov, edited and documented by me.	2017-03-31 14:26:03 -04:00
Andrew Dunstan	c80b9920fc	Transform or iterate over json(b) string values Dmitry Dolgov, reviewed and lightly edited by me.	2017-03-31 14:25:25 -04:00
Simon Riggs	25fff40798	Default monitoring roles Three nologin roles with non-overlapping privs are created by default * pg_read_all_settings - read all GUCs. * pg_read_all_stats - pg_stat_, pg_database_size(), pg_tablespace_size() pg_stat_scan_tables - may lock/scan tables Top level role - pg_monitor includes all of the above by default, plus others Author: Dave Page Reviewed-by: Stephen Frost, Robert Haas, Peter Eisentraut, Simon Riggs	2017-03-30 14:18:53 -04:00
Andres Freund	5ded4bd214	Remove support for version-0 calling conventions. The V0 convention is failure prone because we've so far assumed that a function is V0 if PG_FUNCTION_INFO_V1 is missing, leading to crashes if a function was coded against the V1 interface. V0 doesn't allow proper NULL, SRF and toast handling. V0 doesn't offer features that V1 doesn't. Thus remove V0 support and obsolete fmgr README contents relating to it. Author: Andres Freund, with contributions by Peter Eisentraut & Craig Ringer Reviewed-By: Peter Eisentraut, Craig Ringer Discussion: https://postgr.es/m/20161208213441.k3mbno4twhg2qf7g@alap3.anarazel.de	2017-03-30 06:25:46 -07:00
Teodor Sigaev	f90d23d0c5	Implement SortSupport for macaddr data type Introduces a scheme to produce abbreviated keys for the macaddr type. Bump catalog version. Author: Brandur Leach Reviewed-by: Julien Rouhaud, Peter Geoghegan https://commitfest.postgresql.org/13/743/	2017-03-29 23:28:56 +03:00
Peter Eisentraut	4fdb8a82e3	Update copyright year in recently added files Author: Masahiko Sawada <sawada.mshk@gmail.com>	2017-03-29 14:54:10 -04:00
Robert Haas	9a09527164	Mark more functions parallel-restricted. Commit `61c2e1a95f` allowed parallel query to be used in more places, revealing via buildfarm member mandrill that several functions intended to be called from triggers were incorrectly marked parallel-safe rather than parallel-restricted. Report by Tom Lane. Patch by Rafia Sabih. Reviewed by me. Discussion: http://postgr.es/m/16061.1490479253@sss.pgh.pa.us	2017-03-29 10:59:27 -04:00
Peter Eisentraut	3582b223d4	Fix hardcoded typeof check result for Windows The test result that I had blindly stipulated didn't work out on the build farm, so disable the feature in Windows MSVC for now.	2017-03-29 08:55:34 -04:00
Peter Eisentraut	4cb824699e	Cast result of copyObject() to correct type copyObject() is declared to return void , which allows easily assigning the result independent of the input, but it loses all type checking. If the compiler supports typeof or something similar, cast the result to the input type. This creates a greater amount of type safety. In some cases, where the result is assigned to a generic type such as Node or Expr *, new casts are now necessary, but in general casts are now unnecessary in the normal case and indicate that something unusual is happening. Reviewed-by: Mark Dilger <hornschnorter@gmail.com>	2017-03-28 21:59:23 -04:00
Alvaro Herrera	ce96ce60ca	Remove direct uses of ItemPointer.{ip_blkid,ip_posid} There are no functional changes here; this simply encapsulates knowledge of the ItemPointerData struct so that a future patch can change things without more breakage. All direct users of ip_blkid and ip_posid are changed to use existing macros ItemPointerGetBlockNumber and ItemPointerGetOffsetNumber respectively. For callers where that's inappropriate (because they Assert that the itempointer is is valid-looking), add ItemPointerGetBlockNumberNoCheck and ItemPointerGetOffsetNumberNoCheck, which lack the assertion but are otherwise identical. Author: Pavan Deolasee Discussion: https://postgr.es/m/CABOikdNnFon4cJiL=h1mZH3bgUeU+sWHuU4Yr8AB=j3A2p1GiA@mail.gmail.com	2017-03-28 19:02:23 -03:00
Teodor Sigaev	ab89e465cb	Altering default privileges on schemas Extend ALTER DEFAULT PRIVILEGES command to schemas. Author: Matheus Oliveira Reviewed-by: Petr Jelínek, Ashutosh Sharma https://commitfest.postgresql.org/13/887/	2017-03-28 18:58:55 +03:00
Simon Riggs	ff539da316	Cleanup slots during drop database Automatically drop all logical replication slots associated with a database when the database is dropped. Previously we threw an ERROR if a slot existed. Now we throw ERROR only if a slot is active in the database being dropped. Craig Ringer	2017-03-28 10:05:21 -04:00
Alvaro Herrera	6462238f0d	Fix uninitialized memory propagation mistakes Valgrind complains that some uninitialized bytes are being passed around by the extended statistics code since commit `7b504eb282`, as reported by Andres Freund. Silence it. Tomas Vondra submitted a patch which he verified to fix the complaints in his machine; however I messed with it a bit before pushing, so any remaining problems are likely my (Álvaro's) fault. Author: Tomas Vondra Discussion: https://postgr.es/m/20170325211031.4xxoptigqxm2emn2@alap3.anarazel.de	2017-03-27 14:52:19 -03:00
Robert Haas	c4c51541e2	Still more code review for single-page hash vacuuming. Most seriously, fix use of incorrect block ID, per a report from Jeff Janes that it causes a crash and a diagnosis from Amit Kapila. Improve consistency between the hash and btree versions of this code by adding back a PANIC that btree has, and by registering data in the xlog record in the same way, per complaints from Jeff Janes and Amit Kapila. Tidy up some minor cosmetic points, per complaints from Amit Kapila. Patch by Ashutosh Sharma, reviewed by Amit Kapila, and tested by Jeff Janes. Discussion: http://postgr.es/m/CAMkU=1w-9Qe=Ff1o6bSaXpNO9wqpo7_9GL8_CVhw4BoVVHasqg@mail.gmail.com	2017-03-27 12:51:10 -04:00
Teodor Sigaev	1b02be21f2	Fsync directory after creating or unlinking file. If file was created/deleted just before powerloss it's possible that file system will miss that. To prevent it, call fsync() where creating/ unlinkg file is critical. Author: Michael Paquier Reviewed-by: Ashutosh Bapat, Takayuki Tsunakawa, me	2017-03-27 19:33:01 +03:00
Andrew Gierth	b5635948ab	Support hashed aggregation with grouping sets. This extends the Aggregate node with two new features: HashAggregate can now run multiple hashtables concurrently, and a new strategy MixedAggregate populates hashtables while doing sorted grouping. The planner will now attempt to save as many sorts as possible when planning grouping sets queries, while not exceeding work_mem for the estimated combined sizes of all hashtables used. No SQL-level changes are required. There should be no user-visible impact other than the new EXPLAIN output and possible changes to result ordering when ORDER BY was not used (which affected a few regression tests). The enable_hashagg option is respected. Author: Andrew Gierth Reviewers: Mark Dilger, Andres Freund Discussion: https://postgr.es/m/87vatszyhj.fsf@news-spur.riddles.org.uk	2017-03-27 04:20:54 +01:00
Robert Haas	fc70a4b0df	Show more processes in pg_stat_activity. Previously, auxiliary processes and background workers not connected to a database (such as the logical replication launcher) weren't shown. Include them, so that we can see the associated wait state information. Add a new column to identify the processes type, so that people can filter them out easily using SQL if they wish. Before this patch was written, there was discussion about whether we should expose this information in a separate view, so as to avoid contaminating pg_stat_activity with things people might not want to see. But putting everything in pg_stat_activity was a more popular choice, so that's what the patch does. Kuntal Ghosh, reviewed by Amit Langote and Michael Paquier. Some revisions and bug fixes by me. Discussion: http://postgr.es/m/CA+TgmoYES5nhkEGw9nZXU8_FhA8XEm8NTm3-SO+3ML1B81Hkww@mail.gmail.com	2017-03-26 22:02:22 -04:00
Tom Lane	2f0903ea19	Improve performance of ExecEvalWholeRowVar. In commit `b8d7f053c`, we needed to fix ExecEvalWholeRowVar to not change the state of the slot it's copying. The initial quick hack at that required two rounds of tuple construction, which is not very nice. To fix, add another primitive to tuptoaster.c that does precisely what we need. (I initially tried to do this by refactoring one of the existing functions into two pieces; but it looked like that might hurt performance for the existing case, and the amount of code that could be shared is not very large, so I gave up on that.) Discussion: https://postgr.es/m/26088.1490315792@sss.pgh.pa.us	2017-03-26 19:14:57 -04:00
Peter Eisentraut	895f93701f	Fix cpluspluscheck warning Structure tag cannot be the same as a typedef that is a pointer to that struct.	2017-03-26 18:31:05 -04:00
Tom Lane	5459cfd3ad	Fix typos in logical replication support for initial data copy. Fix an incorrect assert condition (noted by Coverity), and spell the new name of the function correctly. Typos introduced in commit `7c4f52409`. Michael Paquier	2017-03-26 17:44:35 -04:00
Andres Freund	b8d7f053c5	Faster expression evaluation and targetlist projection. This replaces the old, recursive tree-walk based evaluation, with non-recursive, opcode dispatch based, expression evaluation. Projection is now implemented as part of expression evaluation. This both leads to significant performance improvements, and makes future just-in-time compilation of expressions easier. The speed gains primarily come from: - non-recursive implementation reduces stack usage / overhead - simple sub-expressions are implemented with a single jump, without function calls - sharing some state between different sub-expressions - reduced amount of indirect/hard to predict memory accesses by laying out operation metadata sequentially; including the avoidance of nearly all of the previously used linked lists - more code has been moved to expression initialization, avoiding constant re-checks at evaluation time Future just-in-time compilation (JIT) has become easier, as demonstrated by released patches intended to be merged in a later release, for primarily two reasons: Firstly, due to a stricter split between expression initialization and evaluation, less code has to be handled by the JIT. Secondly, due to the non-recursive nature of the generated "instructions", less performance-critical code-paths can easily be shared between interpreted and compiled evaluation. The new framework allows for significant future optimizations. E.g.: - basic infrastructure for to later reduce the per executor-startup overhead of expression evaluation, by caching state in prepared statements. That'd be helpful in OLTPish scenarios where initialization overhead is measurable. - optimizing the generated "code". A number of proposals for potential work has already been made. - optimizing the interpreter. Similarly a number of proposals have been made here too. The move of logic into the expression initialization step leads to some backward-incompatible changes: - Function permission checks are now done during expression initialization, whereas previously they were done during execution. In edge cases this can lead to errors being raised that previously wouldn't have been, e.g. a NULL array being coerced to a different array type previously didn't perform checks. - The set of domain constraints to be checked, is now evaluated once during expression initialization, previously it was re-built every time a domain check was evaluated. For normal queries this doesn't change much, but e.g. for plpgsql functions, which caches ExprStates, the old set could stick around longer. The behavior around might still change. Author: Andres Freund, with significant changes by Tom Lane, changes by Heikki Linnakangas Reviewed-By: Tom Lane, Heikki Linnakangas Discussion: https://postgr.es/m/20161206034955.bh33paeralxbtluv@alap3.anarazel.de	2017-03-25 14:52:06 -07:00
Simon Riggs	5737c12df0	Report catalog_xmin separately in hot_standby_feedback If the upstream walsender is using a physical replication slot, store the catalog_xmin in the slot's catalog_xmin field. If the upstream doesn't use a slot and has only a PGPROC entry behaviour doesn't change, as we store the combined xmin and catalog_xmin in the PGPROC entry. Author: Craig Ringer	2017-03-25 14:07:27 +00:00
Peter Eisentraut	7678fe1c6d	Make header self-contained Add necessary include files for things used in the header.	2017-03-24 23:36:10 -04:00
Simon Riggs	3428ef7911	Reverting `42b4b0b241` Buildfarm issues and other reported issues	2017-03-24 17:56:17 +00:00
Alvaro Herrera	7b504eb282	Implement multivariate n-distinct coefficients Add support for explicitly declared statistic objects (CREATE STATISTICS), allowing collection of statistics on more complex combinations that individual table columns. Companion commands DROP STATISTICS and ALTER STATISTICS ... OWNER TO / SET SCHEMA / RENAME are added too. All this DDL has been designed so that more statistic types can be added later on, such as multivariate most-common-values and multivariate histograms between columns of a single table, leaving room for permitting columns on multiple tables, too, as well as expressions. This commit only adds support for collection of n-distinct coefficient on user-specified sets of columns in a single table. This is useful to estimate number of distinct groups in GROUP BY and DISTINCT clauses; estimation errors there can cause over-allocation of memory in hashed aggregates, for instance, so it's a worthwhile problem to solve. A new special pseudo-type pg_ndistinct is used. (num-distinct estimation was deemed sufficiently useful by itself that this is worthwhile even if no further statistic types are added immediately; so much so that another version of essentially the same functionality was submitted by Kyotaro Horiguchi: https://postgr.es/m/20150828.173334.114731693.horiguchi.kyotaro@lab.ntt.co.jp though this commit does not use that code.) Author: Tomas Vondra. Some code rework by Álvaro. Reviewed-by: Dean Rasheed, David Rowley, Kyotaro Horiguchi, Jeff Janes, Ideriha Takeshi Discussion: https://postgr.es/m/543AFA15.4080608@fuzzy.cz https://postgr.es/m/20170320190220.ixlaueanxegqd5gr@alvherre.pgsql	2017-03-24 14:06:10 -03:00
Robert Haas	857ee8e391	Add a txid_status function. If your connection to the database server is lost while a COMMIT is in progress, it may be difficult to figure out whether the COMMIT was successful or not. This function will tell you, provided that you don't wait too long to ask. It may be useful in other situations, too. Craig Ringer, reviewed by Simon Riggs and by me Discussion: http://postgr.es/m/CAMsr+YHQiWNEi0daCTboS40T+V5s_+dst3PYv_8v2wNVH+Xx4g@mail.gmail.com	2017-03-24 12:00:53 -04:00
Simon Riggs	42b4b0b241	Avoid SnapshotResetXmin() during AtEOXact_Snapshot() For normal commits and aborts we already reset PgXact->xmin Avoiding touching highly contented shmem improves concurrent performance. Simon Riggs Discussion: CANP8+jJdXE9b+b9F8CQT-LuxxO0PBCB-SZFfMVAdp+akqo4zfg@mail.gmail.com	2017-03-24 14:20:59 +00:00
Heikki Linnakangas	7ac955b347	Allow SCRAM authentication, when pg_hba.conf says 'md5'. If a user has a SCRAM verifier in pg_authid.rolpassword, there's no reason we cannot attempt to perform SCRAM authentication instead of MD5. The worst that can happen is that the client doesn't support SCRAM, and the authentication will fail. But previously, it would fail for sure, because we would not even try. SCRAM is strictly more secure than MD5, so there's no harm in trying it. This allows for a more graceful transition from MD5 passwords to SCRAM, as user passwords can be changed to SCRAM verifiers incrementally, without changing pg_hba.conf. Refactor the code in auth.c to support that better. Notably, we now have to look up the user's pg_authid entry before sending the password challenge, also when performing MD5 authentication. Also simplify the concept of a "doomed" authentication. Previously, if a user had a password, but it had expired, we still performed SCRAM authentication (but always returned error at the end) using the salt and iteration count from the expired password. Now we construct a fake salt, like we do when the user doesn't have a password or doesn't exist at all. That simplifies get_role_password(), and we can don't need to distinguish the "user has expired password", and "user does not exist" cases in auth.c. On second thoughts, also rename uaSASL to uaSCRAM. It refers to the mechanism specified in pg_hba.conf, and while we use SASL for SCRAM authentication at the protocol level, the mechanism should be called SCRAM, not SASL. As a comparison, we have uaLDAP, even though it looks like the plain 'password' authentication at the protocol level. Discussion: https://www.postgresql.org/message-id/6425.1489506016@sss.pgh.pa.us Reviewed-by: Michael Paquier	2017-03-24 13:32:21 +02:00
Teodor Sigaev	78874531ba	Fix backup canceling Assert-enabled build crashes but without asserts it works by wrong way: it may not reset forcing full page write and preventing from starting exclusive backup with the same name as cancelled. Patch replaces pair of booleans nonexclusive_backup_running/exclusive_backup_running to single enum to correctly describe backup state. Backpatch to 9.6 where bug was introduced Reported-by: David Steele Authors: Michael Paquier, David Steele Reviewed-by: Anastasia Lubennikova https://commitfest.postgresql.org/13/1068/	2017-03-24 13:53:40 +03:00
Tom Lane	457a444873	Avoid syntax error on platforms that have neither LOCALE_T nor ICU. Buildfarm member anole sees this union as empty, and doesn't like it.	2017-03-23 23:18:52 -04:00
Robert Haas	c23b186ae9	Fix enum definition. Commit `249cf070e3` assigned to one of the labels in the middle the value that should have been assigned to the first member of the enum. Rushabh's patch didn't have that defect as submitted, but I managed to mess it up while editing. Repair.	2017-03-23 16:10:43 -04:00
Peter Eisentraut	eccfef81e1	ICU support Add a column collprovider to pg_collation that determines which library provides the collation data. The existing choices are default and libc, and this adds an icu choice, which uses the ICU4C library. The pg_locale_t type is changed to a union that contains the provider-specific locale handles. Users of locale information are changed to look into that struct for the appropriate handle to use. Also add a collversion column that records the version of the collation when it is created, and check at run time whether it is still the same. This detects potentially incompatible library upgrades that can corrupt indexes and other structures. This is currently only supported by ICU-provided collations. initdb initializes the default collation set as before from the `locale -a` output but also adds all available ICU locales with a "-x-icu" appended. Currently, ICU-provided collations can only be explicitly named collations. The global database locales are still always libc-provided. ICU support is enabled by configure --with-icu. Reviewed-by: Thomas Munro <thomas.munro@enterprisedb.com> Reviewed-by: Andreas Karlsson <andreas@proxel.se>	2017-03-23 15:28:48 -04:00
Robert Haas	ea42cc18c3	Track the oldest XID that can be safely looked up in CLOG. This provides infrastructure for looking up arbitrary, user-supplied XIDs without a risk of scary-looking failures from within the clog module. Normally, the oldest XID that can be safely looked up in CLOG is the same as the oldest XID that can reused without causing wraparound, and the latter is already tracked. However, while truncation is in progress, the values are different, so we must keep track of them separately. Craig Ringer, reviewed by Simon Riggs and by me. Discussion: http://postgr.es/m/CAMsr+YHQiWNEi0daCTboS40T+V5s_+dst3PYv_8v2wNVH+Xx4g@mail.gmail.com	2017-03-23 14:26:31 -04:00
Robert Haas	691b8d5928	Allow for parallel execution whenever ExecutorRun() is done only once. Previously, it was unsafe to execute a plan in parallel if ExecutorRun() might be called with a non-zero row count. However, it's quite easy to fix things up so that we can support that case, provided that it is known that we will never call ExecutorRun() a second time for the same QueryDesc. Add infrastructure to signal this, and cross-checks to make sure that a caller who claims this is true doesn't later reneg. While that pattern never happens with queries received directly from a client -- there's no way to know whether multiple Execute messages will be sent unless the first one requests all the rows -- it's pretty common for queries originating from procedural languages, which often limit the result to a single tuple or to a user-specified number of tuples. This commit doesn't actually enable parallelism in any additional cases, because currently none of the places that would be able to benefit from this infrastructure pass CURSOR_OPT_PARALLEL_OK in the first place, but it makes it much more palatable to pass CURSOR_OPT_PARALLEL_OK in places where we currently don't, because it eliminates some cases where we'd end up having to run the parallel plan serially. Patch by me, based on some ideas from Rafia Sabih and corrected by Rafia Sabih based on feedback from Dilip Kumar and myself. Discussion: http://postgr.es/m/CA+TgmobXEhvHbJtWDuPZM9bVSLiTj-kShxQJ2uM5GPDze9fRYA@mail.gmail.com	2017-03-23 13:14:36 -04:00
Teodor Sigaev	218f51584d	Reduce page locking in GIN vacuum GIN vacuum during cleaning posting tree can lock this whole tree for a long time with by holding LockBufferForCleanup() on root. Patch changes it with two ways: first, cleanup lock will be taken only if there is an empty page (which should be deleted) and, second, it tries to lock only subtree, not the whole posting tree. Author: Andrey Borodin with minor editorization by me Reviewed-by: Jeff Davis, me https://commitfest.postgresql.org/13/896/	2017-03-23 19:38:47 +03:00
Peter Eisentraut	73561013e5	Remove trailing comma from enum definition Author: Petr Jelinek <petr.jelinek@2ndquadrant.com>	2017-03-23 11:58:11 -04:00
Peter Eisentraut	128e6ee01d	Assorted compilation and test fixes related to `7c4f52409a`, per build farm Author: Petr Jelinek <petr.jelinek@2ndquadrant.com>	2017-03-23 11:44:43 -04:00
Simon Riggs	6912acc04f	Replication lag tracking for walsenders Adds write_lag, flush_lag and replay_lag cols to pg_stat_replication. Implements a lag tracker module that reports the lag times based upon measurements of the time taken for recent WAL to be written, flushed and replayed and for the sender to hear about it. These times represent the commit lag that was (or would have been) introduced by each synchronous commit level, if the remote server was configured as a synchronous standby. For an asynchronous standby, the replay_lag column approximates the delay before recent transactions became visible to queries. If the standby server has entirely caught up with the sending server and there is no more WAL activity, the most recently measured lag times will continue to be displayed for a short time and then show NULL. Physical replication lag tracking is automatic. Logical replication tracking is possible but is the responsibility of the logical decoding plugin. Tracking is a private module operating within each walsender individually, with values reported to shared memory. Module not used outside of walsender. Design and code is good enough now to commit - kudos to the author. In many ways a difficult topic, with important and subtle behaviour so this shoudl be expected to generate discussion and multiple open items: Test now! Author: Thomas Munro, following designs by Fujii Masao and Simon Riggs Review: Simon Riggs, Ian Barwick and Craig Ringer	2017-03-23 14:05:28 +00:00
Peter Eisentraut	7c4f52409a	Logical replication support for initial data copy Add functionality for a new subscription to copy the initial data in the tables and then sync with the ongoing apply process. For the copying, add a new internal COPY option to have the COPY source data provided by a callback function. The initial data copy works on the subscriber by receiving COPY data from the publisher and then providing it locally into a COPY that writes to the destination table. A WAL receiver can now execute full SQL commands. This is used here to obtain information about tables and publications. Several new options were added to CREATE and ALTER SUBSCRIPTION to control whether and when initial table syncing happens. Change pg_dump option --no-create-subscription-slots to --no-subscription-connect and use the new CREATE SUBSCRIPTION ... NOCONNECT option for that. Author: Petr Jelinek <petr.jelinek@2ndquadrant.com> Tested-by: Erik Rijkers <er@xs4all.nl>	2017-03-23 08:55:37 -04:00
Stephen Frost	017e4f2588	Expose waitforarchive option through pg_stop_backup() Internally, we have supported the option to either wait for all of the WAL associated with a backup to be archived, or to return immediately. This option is useful to users of pg_stop_backup() as well, when they are reading the stop backup record position and checking that the WAL they need has been archived independently. This patch adds an additional, optional, argument to pg_stop_backup() which allows the user to indicate if they wish to wait for the WAL to be archived or not. The default matches current behavior, which is to wait. Author: David Steele, with some minor changes, doc updates by me. Reviewed by: Takayuki Tsunakawa, Fujii Masao Discussion: https://postgr.es/m/758e3fd1-45b4-5e28-75cd-e9e7f93a4c02@pgmasters.net	2017-03-22 23:44:58 -04:00
Magnus Hagander	6b76f1bb58	Support multiple RADIUS servers This changes all the RADIUS related parameters (radiusserver, radiussecret, radiusport, radiusidentifier) to be plural and to accept a comma separated list of servers, which will be tried in order. Reviewed by Adam Brightwell	2017-03-22 18:11:08 +01:00
Simon Riggs	af4b1a0869	Refactor GetOldestXmin() to use flags Replace ignoreVacuum parameter with more flexible flags. Author: Eiji Seki Review: Haribabu Kommi	2017-03-22 16:51:01 +00:00
Andrew Dunstan	96a7128b7b	Sync pg_dump and pg_dumpall output Before exiting any files are fsync'ed. A --no-sync option is also provided for a faster exit if desired. Michael Paquier. Reviewed by Albe Laurenz Discussion: https://postgr.es/m/CAB7nPqS1uZ=Ov+UruW6jr3vB-S_DLVMPc0dQpV-fTDjmm0ZQMg@mail.gmail.com	2017-03-22 10:20:13 -04:00
Simon Riggs	9b013dc238	Improve performance of replay of AccessExclusiveLocks A hot standby replica keeps a list of Access Exclusive locks for a top level transaction. These locks are released when the top level transaction ends. Searching of this list is O(N^2), and each transaction had to pay the price of searching this list for locks, even if it didn't take any AE locks itself. This patch optimizes this case by having the master server track which transactions took AE locks, and passes that along to the standby server in the commit/abort record. This allows the standby to only try to release locks for transactions which actually took any, avoiding the majority of the performance issue. Refactor MyXactAccessedTempRel into MyXactFlags to allow minimal additional cruft with this. Analysis and initial patch by David Rowley Author: David Rowley and Simon Riggs	2017-03-22 13:09:36 +00:00
Simon Riggs	1148e22a82	Teach xlogreader to follow timeline switches Uses page-based mechanism to ensure we’re using the correct timeline. Tests are included to exercise the functionality using a cold disk-level copy of the master that's started up as a replica with slots intact, but the intended use of the functionality is with later features. Craig Ringer, reviewed by Simon Riggs and Andres Freund	2017-03-22 07:05:12 +00:00
Robert Haas	d3cc37f1d8	Don't scan partitioned tables. Partitioned tables do not contain any data; only their unpartitioned descendents need to be scanned. However, the partitioned tables still need to be locked, even though they're not scanned. To make that work, Append and MergeAppend relations now need to carry a list of (unscanned) partitioned relations that must be locked, and InitPlan must lock all partitioned result relations. Aside from the obvious advantage of avoiding some work at execution time, this has two other advantages. First, it may improve the planner's decision-making in some cases since the empty relation might throw things off. Second, it paves the way to getting rid of the storage for partitioned tables altogether. Amit Langote, reviewed by me. Discussion: http://postgr.es/m/6837c359-45c4-8044-34d1-736756335a15@lab.ntt.co.jp	2017-03-21 09:48:04 -04:00
Andrew Dunstan	29bf501683	Add a direct function call mechanism using the caller's context. The current DirectFunctionCall functions use NULL as the flinfo in initializing the FunctionCallInfoData for the call. That means the called function has no fn_mcxt or fn_extra to work with, and attempting to do so will result in an access violation. These functions instead use the provided flinfo, which will usually be the caller's own flinfo. The caller needs to ensure that it doesn't use the fn_extra in way that is incompatible with the way the called function will use it. The called function should not rely on anything else in the provided context, as it will be relevant to the caller, not the callee. Original code from Tom Lane. Discussion: https://postgr.es/m/db2b70a4-78d7-294a-a315-8e7f506c5978@2ndQuadrant.com	2017-03-21 08:57:46 -04:00
Andrew Dunstan	b6fb534f10	Add IF NOT EXISTS for CREATE SERVER and CREATE USER MAPPING There is still some inconsistency with the error messages surrounding foreign servers. Some use the word "foreign" and some don't. My inclination is to remove all such uses of "foreign" on the basis that the CREATE/ALTER/DROP SERVER commands don't use the word. However, that is left for another day. In this patch I have kept to the existing usage in the affected commands, which omits "foreign". Anastasia Lubennikova, reviewed by Arthur Zakirov and Ashtosh Bapat. Discussion: http://postgr.es/m/7c2ab9b8-388a-1ce0-23a3-7acf2a0ed3c6@postgrespro.ru	2017-03-20 16:40:45 -04:00
Robert Haas	953477ca35	Fixes for single-page hash index vacuum. Clear LH_PAGE_HAS_DEAD_TUPLES during replay, similar to what gets done for btree. Update hashdesc.c for xl_hash_vacuum_one_page. Oversights in commit `6977b8b7f4` spotted by Amit Kapila. Patch by Ashutosh Sharma. Bump WAL version. The original patch to make hash indexes write-ahead logged probably should have done this, and the single page vacuuming patch probably should have done it again, but better late than never. Discussion: http://postgr.es/m/CAA4eK1Kd=mJ9xreovcsh0qMiAj-QqCphHVQ_Lfau1DR9oVjASQ@mail.gmail.com	2017-03-20 15:49:09 -04:00
Tom Lane	bc18126a6b	Add configure test to see if the C compiler has gcc-style computed gotos. We'll need this for the upcoming patch to speed up expression evaluation. Might as well push it now to see if it behaves sanely in the buildfarm. Andres Freund Discussion: https://postgr.es/m/20170320062511.hp5qeurtxrwsvfxr@alap3.anarazel.de	2017-03-20 13:35:26 -04:00
Tom Lane	17f8ffa1e3	Fix REFRESH MATERIALIZED VIEW to report activity to the stats collector. The non-concurrent code path for REFRESH MATERIALIZED VIEW failed to report its updates to the stats collector. This is bad since it means auto-analyze doesn't know there's any work to be done. Adjust it to report the refresh as a table truncate followed by insertion of an appropriate number of rows. Since a matview could contain more than INT_MAX rows, change the signature of pgstat_count_heap_insert() to accept an int64 rowcount. (The accumulator it's adding into is already int64, but existing callers could not insert more than a small number of rows at once, so the argument had been declared just "int n".) This is surely a bug fix, but changing pgstat_count_heap_insert()'s API seems too risky for the back branches. Given the lack of previous complaints, I'm not sure it's a big enough problem to justify a kluge solution that would avoid that. So, no back-patch, at least for now. Jim Mlodgenski, adjusted a bit by me Discussion: https://postgr.es/m/CAB_5SRchSz7-WmdO5szdiknG8Oj_GGqJytrk1KRd11yhcMs1KQ@mail.gmail.com	2017-03-18 17:49:39 -04:00
Robert Haas	249cf070e3	Create and use wait events for read, write, and fsync operations. Previous commits, notably `53be0b1add` and `6f3bd98ebf`, made it possible to see from pg_stat_activity when a backend was stuck waiting for another backend, but it's also fairly common for a backend to be stuck waiting for an I/O. Add wait events for those operations, too. Rushabh Lathia, with further hacking by me. Reviewed and tested by Michael Paquier, Amit Kapila, Rajkumar Raghuwanshi, and Rahila Syed. Discussion: http://postgr.es/m/CAGPqQf0LsYHXREPAZqYGVkDqHSyjf=KsD=k0GTVPAuzyThh-VQ@mail.gmail.com	2017-03-18 07:43:01 -04:00
Robert Haas	88e66d193f	Rename "pg_clog" directory to "pg_xact". Names containing the letters "log" sometimes confuse users into believing that only non-critical data is present. It is hoped this renaming will discourage ill-considered removals of transaction status data. Michael Paquier Discussion: http://postgr.es/m/CA+Tgmoa9xFQyjRZupbdEFuwUerFTvC6HjZq1ud6GYragGDFFgA@mail.gmail.com	2017-03-17 09:48:38 -04:00
Heikki Linnakangas	c6305a9c57	Allow plaintext 'password' authentication when user has a SCRAM verifier. Oversight in the main SCRAM patch.	2017-03-17 11:33:27 +02:00

1 2 3 4 5 ...

7870 commits