Commit graph

7870 commits

Author SHA1 Message Date
Tom Lane
50ee1c7462 Avoid searching for the target catcache in CatalogCacheIdInvalidate.
A test case provided by Mathieu Fenniak shows that the initial search for
the target catcache in CatalogCacheIdInvalidate consumes a very significant
amount of overhead in cases where cache invalidation is triggered but has
little useful work to do.  There is no good reason for that search to exist
at all, as the index array maintained by syscache.c allows direct lookup of
the catcache from its ID.  We just need a frontend function in syscache.c,
matching the division of labor for most other cache-accessing operations.

While there's more that can be done in this area, this patch alone reduces
the runtime of Mathieu's example by 2X.  We can hope that it offers some
useful benefit in other cases too, although usually cache invalidation
overhead is not such a striking fraction of the total runtime.

Back-patch to 9.4 where logical decoding was introduced.  It might be
worth going further back, but presently the only case we know of where
cache invalidation is really a significant burden is in logical decoding.
Also, older branches have fewer catcaches, reducing the possible benefit.

(Note: although this nominally changes catcache's API, we have always
documented CatalogCacheIdInvalidate as a private function, so I would
have little sympathy for an external module calling it directly.  So
backpatching should be fine.)

Discussion: https://postgr.es/m/CAHoiPjzea6N0zuCi=+f9v_j94nfsy6y8SU7-=bp4=7qw6_i=Rg@mail.gmail.com
2017-05-12 18:17:29 -04:00
Tom Lane
928c4de309 Fix dependencies for extended statistics objects.
A stats object ought to have a dependency on each individual column
it reads, not the entire table.  Doing this honestly lets us get rid
of the hard-wired logic in RemoveStatisticsExt, which seems to have
been misguidedly modeled on RemoveStatistics; and it will be far easier
to extend to multiple tables later.

Also, add overlooked dependency on owner, and make the dependency on
schema be NORMAL like every other such dependency.

There remains some unfinished work here, which is to allow statistics
objects to be extension members.  That takes more effort than just
adding the dependency call, though, so I left it out for now.

initdb forced because this changes the set of pg_depend records that
should exist for a statistics object.

Discussion: https://postgr.es/m/22676.1494557205@sss.pgh.pa.us
2017-05-12 16:26:31 -04:00
Alvaro Herrera
bc085205c8 Change CREATE STATISTICS syntax
Previously, we had the WITH clause in the middle of the command, where
you'd specify both generic options as well as statistic types.  Few
people liked this, so this commit changes it to remove the WITH keyword
from that clause and makes it accept statistic types only.  (We
currently don't have any generic options, but if we invent in the
future, we will gain a new WITH clause, probably at the end of the
command).

Also, the column list is now specified without parens, which makes the
whole command look more similar to a SELECT command.  This change will
let us expand the command to supporting expressions (not just columns
names) as well as multiple tables and their join conditions.

Tom added lots of code comments and fixed some parts of the CREATE
STATISTICS reference page, too; more changes in this area are
forthcoming.  He also fixed a potential problem in the alter_generic
regression test, reducing verbosity on a cascaded drop to avoid
dependency on message ordering, as we do in other tests.

Tom also closed a security bug: we documented that table ownership was
required in order to create a statistics object on it, but didn't
actually implement it.

Implement tab-completion for statistics objects.  This can stand some
more improvement.

Authors: Alvaro Herrera, with lots of cleanup by Tom Lane
Discussion: https://postgr.es/m/20170420212426.ltvgyhnefvhixm6i@alvherre.pgsql
2017-05-12 14:59:35 -03:00
Peter Eisentraut
d496a65790 Standardize "WAL location" terminology
Other previously used terms were "WAL position" or "log position".
2017-05-12 13:51:27 -04:00
Peter Eisentraut
c1a7f64b4a Replace "transaction log" with "write-ahead log"
This makes documentation and error messages match the renaming of "xlog"
to "wal" in APIs and file naming.
2017-05-12 11:52:43 -04:00
Peter Eisentraut
b807f59828 Rework the options syntax for logical replication commands
For CREATE/ALTER PUBLICATION/SUBSCRIPTION, use similar option style as
other statements that use a WITH clause for options.

Author: Petr Jelinek <petr.jelinek@2ndquadrant.com>
2017-05-12 08:57:49 -04:00
Simon Riggs
024711bb54 Lag tracking for logical replication
Lag tracking is called for each commit, but we introduce
a pacing delay to ensure we don't swamp the lag tracker.

Author: Petr Jelinek, with minor pacing delay code from me
2017-05-12 10:50:56 +01:00
Tom Lane
d10c626de4 Rename WAL-related functions and views to use "lsn" not "location".
Per discussion, "location" is a rather vague term that could refer to
multiple concepts.  "LSN" is an unambiguous term for WAL locations and
should be preferred.  Some function names, view column names, and function
output argument names used "lsn" already, but others used "location",
as well as yet other terms such as "wal_position".  Since we've already
renamed a lot of things in this area from "xlog" to "wal" for v10,
we may as well incur a bit more compatibility pain and make these names
all consistent.

David Rowley, minor additional docs hacking by me

Discussion: https://postgr.es/m/CAKJS1f8O0njDKe8ePFQ-LK5-EjwThsDws6ohJ-+c6nWK+oUxtg@mail.gmail.com
2017-05-11 11:49:59 -04:00
Alvaro Herrera
b66adb7b0c Revert "Permit dump/reload of not-too-large >1GB tuples"
This reverts commits fa2fa99552 and 42f50cb8fa.

While the functionality that was intended to be provided by these
commits is desired, the patch didn't actually solve as many of the
problematic situations as we hoped, and it created a bunch of its own
problems.  Since we're going to require more extensive changes soon for
other reasons and users have been working around these problems for a
long time already, there is no point in spending effort in fixing this
halfway measure.

Per complaint from Tom Lane.
Discussion: https://postgr.es/m/21407.1484606922@sss.pgh.pa.us

(Commit fa2fa99552 had already been reverted in branches 9.5 as
f858524ee4 and 9.6 as e9e44a0953, so this touches master only.
Commit 42f50cb8fa was not present in the older branches.)
2017-05-10 18:41:27 -03:00
Peter Eisentraut
489b96e80b Improve memory use in logical replication apply
Previously, the memory used by the logical replication apply worker for
processing messages would never be freed, so that could end up using a
lot of memory.  To improve that, change the existing ApplyContext memory
context to ApplyMessageContext and reset that after every
message (similar to MessageContext used elsewhere).  For consistency of
naming, rename the ApplyCacheContext to ApplyContext.

Author: Stas Kelvich <s.kelvich@postgrespro.ru>
2017-05-09 14:51:49 -04:00
Peter Eisentraut
013c1178fd Remove the NODROP SLOT option from DROP SUBSCRIPTION
It turned out this approach had problems, because a DROP command should
not have any options other than CASCADE and RESTRICT.  Instead, always
attempt to drop the slot if there is one configured, but also add an
ALTER SUBSCRIPTION action to set the slot to NONE.

Author: Petr Jelinek <petr.jelinek@2ndquadrant.com>
Reported-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://www.postgresql.org/message-id/29431.1493730652@sss.pgh.pa.us
2017-05-09 10:20:42 -04:00
Tom Lane
da07596006 Further patch rangetypes_selfuncs.c's statistics slot management.
Values in a STATISTIC_KIND_RANGE_LENGTH_HISTOGRAM slot are float8,
not of the type of the column the statistics are for.

This bug is at least partly the fault of sloppy specification comments
for get_attstatsslot()/free_attstatsslot(): the type OID they want is that
of the stavalues entries, not of the underlying column.  (I double-checked
other callers and they seem to get this right.)  Adjust the comments to be
more correct.

Per buildfarm.

Security: CVE-2017-7484
2017-05-08 15:03:14 -04:00
Peter Eisentraut
e2d4ef8de8 Add security checks to selectivity estimation functions
Some selectivity estimation functions run user-supplied operators over
data obtained from pg_statistic without security checks, which allows
those operators to leak pg_statistic data without having privileges on
the underlying tables.  Fix by checking that one of the following is
satisfied: (1) the user has table or column privileges on the table
underlying the pg_statistic data, or (2) the function implementing the
user-supplied operator is leak-proof.  If neither is satisfied, planning
will proceed as if there are no statistics available.

At least one of these is satisfied in most cases in practice.  The only
situations that are negatively impacted are user-defined or
not-leak-proof operators on a security-barrier view.

Reported-by: Robert Haas <robertmhaas@gmail.com>
Author: Peter Eisentraut <peter_e@gmx.net>
Author: Tom Lane <tgl@sss.pgh.pa.us>

Security: CVE-2017-7484
2017-05-08 09:26:32 -04:00
Heikki Linnakangas
eb61136dc7 Remove support for password_encryption='off' / 'plain'.
Storing passwords in plaintext hasn't been a good idea for a very long
time, if ever. Now seems like a good time to finally forbid it, since we're
messing with this in PostgreSQL 10 anyway.

Remove the CREATE/ALTER USER UNENCRYPTED PASSSWORD 'foo' syntax, since
storing passwords unencrypted is no longer supported. ENCRYPTED PASSWORD
'foo' is still accepted, but ENCRYPTED is now just a noise-word, it does
the same as just PASSWORD 'foo'.

Likewise, remove the --unencrypted option from createuser, but accept
--encrypted as a no-op for backward compatibility. AFAICS, --encrypted was
a no-op even before this patch, because createuser encrypted the password
before sending it to the server even if --encrypted was not specified. It
added the ENCRYPTED keyword to the SQL command, but since the password was
already in encrypted form, it didn't make any difference. The documentation
was not clear on whether that was intended or not, but it's moot now.

Also, while password_encryption='on' is still accepted as an alias for
'md5', it is now marked as hidden, so that it is not listed as an accepted
value in error hints, for example. That's not directly related to removing
'plain', but it seems better this way.

Reviewed by Michael Paquier

Discussion: https://www.postgresql.org/message-id/16e9b768-fd78-0b12-cfc1-7b6b7f238fde@iki.fi
2017-05-08 11:26:07 +03:00
Peter Eisentraut
086221cf6b Prevent panic during shutdown checkpoint
When the checkpointer writes the shutdown checkpoint, it checks
afterwards whether any WAL has been written since it started and throws
a PANIC if so.  At that point, only walsenders are still active, so one
might think this could not happen, but walsenders can also generate WAL,
for instance in BASE_BACKUP and certain variants of
CREATE_REPLICATION_SLOT.  So they can trigger this panic if such a
command is run while the shutdown checkpoint is being written.

To fix this, divide the walsender shutdown into two phases.  First, the
postmaster sends a SIGUSR2 signal to all walsenders.  The walsenders
then put themselves into the "stopping" state.  In this state, they
reject any new commands.  (For simplicity, we reject all new commands,
so that in the future we do not have to track meticulously which
commands might generate WAL.)  The checkpointer waits for all walsenders
to reach this state before proceeding with the shutdown checkpoint.
After the shutdown checkpoint is done, the postmaster sends
SIGINT (previously unused) to the walsenders.  This triggers the
existing shutdown behavior of sending out the shutdown checkpoint record
and then terminating.

Author: Michael Paquier <michael.paquier@gmail.com>
Reported-by: Fujii Masao <masao.fujii@gmail.com>
2017-05-05 10:31:42 -04:00
Heikki Linnakangas
0557a5dc2c Make SCRAM salts and nonces longer.
The salt is stored base64-encoded. With the old 10 bytes raw length, it was
always padded to 16 bytes after encoding. We might as well use 12 raw bytes
for the salt, and it's still encoded into 16 bytes.

Similarly for the random nonces, use a raw length that's divisible by 3, so
that there's no padding after base64 encoding. Make the nonces longer while
we're at it. 10 bytes was probably enough to prevent replay attacks, but
there's no reason to be skimpy here.

Per suggestion from Álvaro Hernández Tortosa.

Discussion: https://www.postgresql.org/message-id/df8c6e27-4d8e-5281-96e5-131a4e638fc8@8kdata.com
2017-05-05 10:02:13 +03:00
Heikki Linnakangas
e6e9c4da3a Misc cleanup of SCRAM code.
* Remove is_scram_verifier() function. It was unused.
* Fix sanitize_char() function, used in error messages on protocol
  violations, to print bytes >= 0x7F correctly.
* Change spelling of scram_MockSalt() function to be more consistent with
  the surroundings.
* Change a few more references to "server proof" to "server signature" that
  I missed in commit d981074c24.
2017-05-05 10:01:44 +03:00
Heikki Linnakangas
8f8b9be51f Add PQencryptPasswordConn function to libpq, use it in psql and createuser.
The new function supports creating SCRAM verifiers, in addition to md5
hashes. The algorithm is chosen based on password_encryption, by default.

This fixes the issue reported by Jeff Janes, that there was previously
no way to create a SCRAM verifier with "\password".

Michael Paquier and me

Discussion: https://www.postgresql.org/message-id/CAMkU%3D1wfBgFPbfAMYZQE78p%3DVhZX7nN86aWkp0QcCp%3D%2BKxZ%3Dbg%40mail.gmail.com
2017-05-03 11:19:07 +03:00
Tom Lane
23c6eb0336 Remove create_singleton_array(), hard-coding the case in its sole caller.
create_singleton_array() was not really as useful as we perhaps thought
when we added it.  It had never accreted more than one call site, and is
only saving a dozen lines of code at that one, which is considerably less
bulk than the function itself.  Moreover, because of its insistence on
using the caller's fn_extra cache space, it's arguably a coding hazard.
text_to_array_internal() does not currently use fn_extra in any other way,
but if it did it would be subtly broken, since the conflicting fn_extra
uses could be needed within a single query, in the seldom-tested case that
the field separator varies during the query.  The same objection seems
likely to apply to any other potential caller.

The replacement code is a bit uglier, because it hardwires knowledge of
the storage parameters of type TEXT, but it's not like we haven't got
dozens or hundreds of other places that do the same.  Uglier seems like
a good tradeoff for smaller, faster, and safer.

Per discussion with Neha Khatri.

Discussion: https://postgr.es/m/CAFO0U+_fS5SRhzq6uPG+4fbERhoA9N2+nPrtvaC9mmeWivxbsA@mail.gmail.com
2017-05-02 20:41:37 -04:00
Tom Lane
92a43e4857 Reduce semijoins with unique inner relations to plain inner joins.
If the inner relation can be proven unique, that is it can have no more
than one matching row for any row of the outer query, then we might as
well implement the semijoin as a plain inner join, allowing substantially
more freedom to the planner.  This is a form of outer join strength
reduction, but it can't be implemented in reduce_outer_joins() because
we don't have enough info about the individual relations at that stage.
Instead do it much like remove_useless_joins(): once we've built base
relations, we can make another pass over the SpecialJoinInfo list and
get rid of any entries representing reducible semijoins.

This is essentially a followon to the inner-unique patch (commit 9c7f5229a)
and makes use of the proof machinery that that patch created.  We need only
minor refactoring of innerrel_is_unique's API to support this usage.

Per performance complaint from Teodor Sigaev.

Discussion: https://postgr.es/m/f994fc98-389f-4a46-d1bc-c42e05cb43ed@sigaev.ru
2017-05-01 14:53:42 -04:00
Peter Eisentraut
9414e41ea7 Fix logical replication launcher wake up and reset
After the logical replication launcher was told to wake up at
commit (for example, by a CREATE SUBSCRIPTION command), the flag to wake
up was not reset, so it would be woken up at every following commit as
well.  So fix that by resetting the flag.

Also, we don't need to wake up anything if the transaction was rolled
back.  Just reset the flag in that case.

Author: Masahiko Sawada <sawada.mshk@gmail.com>
Reported-by: Fujii Masao <masao.fujii@gmail.com>
2017-05-01 10:18:09 -04:00
Robert Haas
e180c8aa8c Fire per-statement triggers on partitioned tables.
Even though no actual tuples are ever inserted into a partitioned
table (the actual tuples are in the partitions, not the partitioned
table itself), we still need to have a ResultRelInfo for the
partitioned table, or per-statement triggers won't get fired.

Amit Langote, per a report from Rajkumar Raghuwanshi.  Reviewed by me.

Discussion: http://postgr.es/m/CAKcux6%3DwYospCRY2J4XEFuVy0L41S%3Dfic7rmkbsU-GXhhSbmBg%40mail.gmail.com
2017-05-01 08:23:01 -04:00
Robert Haas
504c2205ab Fix crash when partitioned column specified twice.
Amit Langote, reviewed by Beena Emerson

Discussion: http://postgr.es/m/6ed23d3d-c09d-4cbc-3628-0a8a32f750f4@lab.ntt.co.jp
2017-04-28 13:52:17 -04:00
Heikki Linnakangas
d981074c24 Misc SCRAM code cleanups.
* Move computation of SaltedPassword to a separate function from
  scram_ClientOrServerKey(). This saves a lot of cycles in libpq, by
  computing SaltedPassword only once per authentication. (Computing
  SaltedPassword is expensive by design.)

* Split scram_ClientOrServerKey() into two functions. Improves
  readability, by making the calling code less verbose.

* Rename "server proof" to "server signature", to better match the
  nomenclature used in RFC 5802.

* Rename SCRAM_SALT_LEN to SCRAM_DEFAULT_SALT_LEN, to make it more clear
  that the salt can be of any length, and the constant only specifies how
  long a salt we use when we generate a new verifier. Also rename
  SCRAM_ITERATIONS_DEFAULT to SCRAM_DEFAULT_ITERATIONS, for consistency.

These things caught my eye while working on other upcoming changes.
2017-04-28 15:22:38 +03:00
Andres Freund
56e19d938d Don't use on-disk snapshots for exported logical decoding snapshot.
Logical decoding stores historical snapshots on disk, so that logical
decoding can restart without having to reconstruct a snapshot from
scratch (for which the resources are not guaranteed to be present
anymore).  These serialized snapshots were also used when creating a
new slot via the walsender interface, which can export a "full"
snapshot (i.e. one that can read all tables, not just catalog ones).

The problem is that the serialized snapshots are only useful for
catalogs and not for normal user tables.  Thus the use of such a
serialized snapshot could result in an inconsistent snapshot being
exported, which could lead to queries returning wrong data.  This
would only happen if logical slots are created while another logical
slot already exists.

Author: Petr Jelinek
Reviewed-By: Andres Freund
Discussion: https://postgr.es/m/f37e975c-908f-858e-707f-058d3b1eb214@2ndquadrant.com
Backport: 9.4, where logical decoding was introduced.
2017-04-27 15:29:15 -07:00
Andres Freund
2bef06d516 Preserve required !catalog tuples while computing initial decoding snapshot.
The logical decoding machinery already preserved all the required
catalog tuples, which is sufficient in the course of normal logical
decoding, but did not guarantee that non-catalog tuples were preserved
during computation of the initial snapshot when creating a slot over
the replication protocol.

This could cause a corrupted initial snapshot being exported.  The
time window for issues is usually not terribly large, but on a busy
server it's perfectly possible to it hit it.  Ongoing decoding is not
affected by this bug.

To avoid increased overhead for the SQL API, only retain additional
tuples when a logical slot is being created over the replication
protocol.  To do so this commit changes the signature of
CreateInitDecodingContext(), but it seems unlikely that it's being
used in an extension, so that's probably ok.

In a drive-by fix, fix handling of
ReplicationSlotsComputeRequiredXmin's already_locked argument, which
should only apply to ProcArrayLock, not ReplicationSlotControlLock.

Reported-By: Erik Rijkers
Analyzed-By: Petr Jelinek
Author: Petr Jelinek, heavily editorialized by Andres Freund
Reviewed-By: Andres Freund
Discussion: https://postgr.es/m/9a897b86-46e1-9915-ee4c-da02e4ff6a95@2ndquadrant.com
Backport: 9.4, where logical decoding was introduced.
2017-04-27 13:13:36 -07:00
Simon Riggs
49e9281549 Rework handling of subtransactions in 2PC recovery
The bug fixed by 0874d4f3e1
caused us to question and rework the handling of
subtransactions in 2PC during and at end of recovery.
Patch adds checks and tests to ensure no further bugs.

This effectively removes the temporary measure put in place
by 546c13e11b.

Author: Simon Riggs
Reviewed-by: Tom Lane, Michael Paquier
Discussion: http://postgr.es/m/CANP8+j+vvXmruL_i2buvdhMeVv5TQu0Hm2+C5N+kdVwHJuor8w@mail.gmail.com
2017-04-27 14:41:22 +02:00
Peter Eisentraut
de43897122 Fix various concurrency issues in logical replication worker launching
The code was originally written with assumption that launcher is the
only process starting the worker.  However that hasn't been true since
commit 7c4f52409 which failed to modify the worker management code
adequately.

This patch adds an in_use field to the LogicalRepWorker struct to
indicate whether the worker slot is being used and uses proper locking
everywhere this flag is set or read.

However if the parent process dies while the new worker is starting and
the new worker fails to attach to shared memory, this flag would never
get cleared.  We solve this rare corner case by adding a sort of garbage
collector for in_use slots.  This uses another field in the
LogicalRepWorker struct named launch_time that contains the time when
the worker was started.  If any request to start a new worker does not
find free slot, we'll check for workers that were supposed to start but
took too long to actually do so, and reuse their slot.

In passing also fix possible race conditions when stopping a worker that
hasn't finished starting yet.

Author: Petr Jelinek <petr.jelinek@2ndquadrant.com>
Reported-by: Fujii Masao <masao.fujii@gmail.com>
2017-04-26 10:45:59 -04:00
Tom Lane
64925603c9 Revert "Use pselect(2) not select(2), if available, to wait in postmaster's loop."
This reverts commit 81069a9efc.

Buildfarm results suggest that some platforms have versions of pselect(2)
that are not merely non-atomic, but flat out non-functional.  Revert the
use-pselect patch to confirm this diagnosis (and exclude the no-SA_RESTART
patch as the source of trouble).  If it's so, we should probably look into
blacklisting specific platforms that have broken pselect.

Discussion: https://postgr.es/m/9696.1493072081@sss.pgh.pa.us
2017-04-24 18:29:03 -04:00
Tom Lane
81069a9efc Use pselect(2) not select(2), if available, to wait in postmaster's loop.
Traditionally we've unblocked signals, called select(2), and then blocked
signals again.  The code expects that the select() will be cancelled with
EINTR if an interrupt occurs; but there's a race condition, which is that
an already-pending signal will be delivered as soon as we unblock, and then
when we reach select() there will be nothing preventing it from waiting.
This can result in a long delay before we perform any action that
ServerLoop was supposed to have taken in response to the signal.  As with
the somewhat-similar symptoms fixed by commit 893902085, the main practical
problem is slow launching of parallel workers.  The window for trouble is
usually pretty short, corresponding to one iteration of ServerLoop; but
it's not negligible.

To fix, use pselect(2) in place of select(2) where available, as that's
designed to solve exactly this problem.  Where not available, we continue
to use the old way, and are no worse off than before.

pselect(2) has been required by POSIX since about 2001, so most modern
platforms should have it.  A bigger portability issue is that some
implementations are said to be non-atomic, ie pselect() isn't really
any different from unblock/select/reblock.  Still, we're no worse off
than before on such a platform.

There is talk of rewriting the postmaster to use a WaitEventSet and
not do signal response work in signal handlers, at which point this
could be reverted, since we'd be using a self-pipe to solve the race
condition.  But that's not happening before v11 at the earliest.

Back-patch to 9.6.  The problem exists much further back, but the
worst symptom arises only in connection with parallel query, so it
does not seem worth taking any portability risks in older branches.

Discussion: https://postgr.es/m/9205.1492833041@sss.pgh.pa.us
2017-04-24 14:03:14 -04:00
Tom Lane
8939020853 Run the postmaster's signal handlers without SA_RESTART.
The postmaster keeps signals blocked everywhere except while waiting
for something to happen in ServerLoop().  The code expects that the
select(2) will be cancelled with EINTR if an interrupt occurs; without
that, followup actions that should be performed by ServerLoop() itself
will be delayed.  However, some platforms interpret the SA_RESTART
signal flag as meaning that they should restart rather than cancel
the select(2).  Worse yet, some of them restart it with the original
timeout delay, meaning that a steady stream of signal interrupts can
prevent ServerLoop() from iterating at all if there are no incoming
connection requests.

Observable symptoms of this, on an affected platform such as HPUX 10,
include extremely slow parallel query startup (possibly as much as
30 seconds) and failure to update timestamps on the postmaster's sockets
and lockfiles when no new connections arrive for a long time.

We can fix this by running the postmaster's signal handlers without
SA_RESTART.  That would be quite a scary change if the range of code
where signals are accepted weren't so tiny, but as it is, it seems
safe enough.  (Note that postmaster children do, and must, reset all
the handlers before unblocking signals; so this change should not
affect any child process.)

There is talk of rewriting the postmaster to use a WaitEventSet and
not do signal response work in signal handlers, at which point it might
be appropriate to revert this patch.  But that's not happening before
v11 at the earliest.

Back-patch to 9.6.  The problem exists much further back, but the
worst symptom arises only in connection with parallel query, so it
does not seem worth taking any portability risks in older branches.

Discussion: https://postgr.es/m/9205.1492833041@sss.pgh.pa.us
2017-04-24 13:00:30 -04:00
Fujii Masao
cbc2270e3f Get rid of extern declarations of non-existent functions.
Those extern declartions were mistakenly added by commit 7c4f52409.

Author: Petr Jelinek
2017-04-25 01:31:42 +09:00
Andres Freund
b182a4ae2f Don't include sys/poll.h anymore.
poll.h is mandated by Single Unix Spec v2, the usual baseline for
postgres on unix.  None of the unixoid buildfarms animals has
sys/poll.h but not poll.h.  Therefore there's not much point to test
for sys/poll.h's existence and include it optionally.

Author: Andres Freund, per suggestion from Tom Lane
Discussion: https://postgr.es/m/20505.1492723662@sss.pgh.pa.us
2017-04-23 16:11:35 -07:00
Heikki Linnakangas
68e61ee72e Change the on-disk format of SCRAM verifiers to conform to RFC 5803.
It doesn't make any immediate difference to PostgreSQL, but might as well
follow the standard, since one exists. (I looked at RFC 5803 earlier, but
didn't fully understand it back then.)

The new format uses Base64 instead of hex to encode StoredKey and
ServerKey, which makes the verifiers slightly smaller. Using the same
encoding for the salt and the keys also means that you only need one
encoder/decoder instead of two. Although we have code in the backend to
do both, we are talking about teaching libpq how to create SCRAM verifiers
for PQencodePassword(), and libpq doesn't currently have any code for hex
encoding.

Bump catversion, because this renders any existing SCRAM verifiers in
pg_authid invalid.

Discussion: https://www.postgresql.org/message-id/351ba574-85ea-d9b8-9689-8c928dd0955d@iki.fi
2017-04-21 22:51:57 +03:00
Fujii Masao
88b0a31926 Mark some columns in pg_subscription as NOT NULL.
In pg_subscription, subconninfo, subslotname, subsynccommit and
subpublications are expected not to be NULL. Therefore this patch
adds BKI_FORCE_NOT_NULL markings to them.

This patch is basically unnecessary unless the code has a bug which
wrongly sets either of those columns to NULL. But it's good to have
this as a safeguard.

Author: Masahiko Sawada
Reviewed-by: Kyotaro Horiguchi
Reported-by: Fujii Masao
Discussion: http://postgr.es/m/CAHGQGwFDWh_Qr-q_GEMpD+qH=vYPMdVqw=ZOSY3kX_Pna9R9SA@mail.gmail.com
2017-04-20 23:35:30 +09:00
Tom Lane
39151781c8 Fix testing of parallel-safety of SubPlans.
is_parallel_safe() supposed that the only relevant property of a SubPlan
was the parallel safety of the referenced subplan tree.  This is wrong:
the testexpr or args subtrees might contain parallel-unsafe stuff, as
demonstrated by the test case added here.  However, just recursing into the
subtrees fails in a different way: we'll typically find PARAM_EXEC Params
representing the subplan's output columns in the testexpr.  The previous
coding supposed that any Param must be treated as parallel-restricted, so
that a naive attempt at fixing this disabled parallel pushdown of SubPlans
altogether.  We must instead determine, for any visited Param, whether it
is one that would be computed by a surrounding SubPlan node; if so, it's
safe to push down along with the SubPlan node.

We might later be able to extend this logic to cope with Params used for
correlated subplans and other cases; but that's a task for v11 or beyond.

Tom Lane and Amit Kapila

Discussion: https://postgr.es/m/7064.1492022469@sss.pgh.pa.us
2017-04-18 15:43:56 -04:00
Tom Lane
e240a65c7d Provide an error cursor for "can't call an SRF here" errors.
Since it appears that v10 is going to move the goalposts by some amount
in terms of where you can and can't invoke set-returning functions,
arrange for the executor's "set-valued function called in context that
cannot accept a set" errors to include a syntax position if possible,
pointing to the specific SRF that can't be called where it's located.

The main bit of infrastructure needed for this is to make the query source
text accessible in the executor; but it turns out that commit 4c728f382
already did that.  We just need a new function executor_errposition()
modeled on parser_errposition(), and we're ready to rock.

While experimenting with this, I noted that the error position wasn't
properly reported if it occurred in a plpgsql FOR-over-query loop,
which turned out to be because SPI_cursor_open_internal wasn't providing
an error context callback during PortalStart.  Fix that.

There's a whole lot more that could be done with this infrastructure
now that it's there, but this is not the right time in the development
cycle for that sort of work.  Hence, resist the temptation to plaster
executor_errposition() calls everywhere ... for the moment.

Discussion: https://postgr.es/m/5263.1492471571@sss.pgh.pa.us
2017-04-18 13:21:08 -04:00
Fujii Masao
280c53ecfb A collection of small fixes for logical replication.
* Be sure to reset the launcher's pid (LogicalRepCtx->launcher_pid) to 0
  even when the launcher emits an error.

* Declare ApplyLauncherWakeup() as a static function because it's called
  only in launcher.c.

* Previously IsBackendPId() was used to check whether the launcher's pid
  was valid. IsBackendPid() was necessary because there was the bug where
  the launcher's pid was not reset to 0. But now it's fixed, so IsBackendPid()
  is not necessary and this patch removes it.

Author: Masahiko Sawada
Reviewed-by: Kyotaro Horiguchi
Reported-by: Fujii Masao
Discussion: http://postgr.es/m/CAHGQGwFDWh_Qr-q_GEMpD+qH=vYPMdVqw=ZOSY3kX_Pna9R9SA@mail.gmail.com
2017-04-19 02:16:34 +09:00
Heikki Linnakangas
c727f120ff Rename "scram" to "scram-sha-256" in pg_hba.conf and password_encryption.
Per discussion, plain "scram" is confusing because we actually implement
SCRAM-SHA-256 rather than the original SCRAM that uses SHA-1 as the hash
algorithm. If we add support for SCRAM-SHA-512 or some other mechanism in
the SCRAM family in the future, that would become even more confusing.

Most of the internal files and functions still use just "scram" as a
shorthand for SCRMA-SHA-256, but I did change PASSWORD_TYPE_SCRAM to
PASSWORD_TYPE_SCRAM_SHA_256, as that could potentially be used by 3rd
party extensions that hook into the password-check hook.

Michael Paquier did this in an earlier version of the SCRAM patch set
already, but I didn't include that in the version that was committed.

Discussion: https://www.postgresql.org/message-id/fde71ff1-5858-90c8-99a9-1c2427e7bafb@iki.fi
2017-04-18 14:50:50 +03:00
Alvaro Herrera
ee6922112e Rename columns in new pg_statistic_ext catalog
The new catalog reused a column prefix "sta" from pg_statistic, but this
is undesirable, so change the catalog to use prefix "stx" instead.
Also, rename the column that lists enabled statistic kinds as "stxkind"
rather than "enabled".

Discussion: https://postgr.es/m/CAKJS1f_2t5jhSN7huYRFH3w3rrHfG2QU7hiUHsu-Vdjd1rYT3w@mail.gmail.com
2017-04-17 18:34:29 -03:00
Tom Lane
bfba563bc5 Fix erroneous cross-reference in comment.
Seems to have been introduced in commit c219d9b0a.  I think there indeed
was a "tupbasics.h" in some early drafts of that refactoring, but it
didn't survive into the committed version.

Amit Kapila
2017-04-15 14:22:26 -04:00
Tom Lane
32470825d3 Avoid passing function pointers across process boundaries.
We'd already recognized that we can't pass function pointers across process
boundaries for functions in loadable modules, since a shared library could
get loaded at different addresses in different processes.  But actually the
practice doesn't work for functions in the core backend either, if we're
using EXEC_BACKEND.  This is the cause of recent failures on buildfarm
member culicidae.  Switch to passing a string function name in all cases.

Something like this needs to be back-patched into 9.6, but let's see
if the buildfarm likes it first.

Petr Jelinek, with a bunch of basically-cosmetic adjustments by me

Discussion: https://postgr.es/m/548f9c1d-eafa-e3fa-9da8-f0cc2f654e60@2ndquadrant.com
2017-04-14 23:50:16 -04:00
Tom Lane
2040bb4a0b Clean up manipulations of hash indexes' hasho_flag field.
Standardize on testing a hash index page's type by doing
	(opaque->hasho_flag & LH_PAGE_TYPE) == LH_xxx_PAGE
Various places were taking shortcuts like
	opaque->hasho_flag & LH_BUCKET_PAGE
which while not actually wrong, is still bad practice because
it encourages use of
	opaque->hasho_flag & LH_UNUSED_PAGE
which *is* wrong (LH_UNUSED_PAGE == 0, so the above is constant false).
hash_xlog.c's hash_mask() contained such an incorrect test.

This also ensures that we mask out the additional flag bits that
hasho_flag has accreted since 9.6.  pgstattuple's pgstat_hash_page(),
for one, was failing to do that and was thus actively broken.

Also fix assorted comments that hadn't been updated to reflect the
extended usage of hasho_flag, and fix some macros that were testing
just "(hasho_flag & bit)" to use the less dangerous, project-approved
form "((hasho_flag & bit) != 0)".

Coverity found the bug in hash_mask(); I noted the one in
pgstat_hash_page() through code reading.
2017-04-14 17:04:25 -04:00
Peter Eisentraut
67c2def11d Catversion bump
for commit 887227a1cc
2017-04-14 14:24:01 -04:00
Peter Eisentraut
887227a1cc Add option to modify sync commit per subscription
This also changes default behaviour of subscription workers to
synchronous_commit = off.

Author: Petr Jelinek <petr.jelinek@2ndquadrant.com>
2017-04-14 13:58:46 -04:00
Peter Eisentraut
d04eac1148 Make header self-contained
Add necessary include files for things used in the header.  (signal.h
needed for sig_atomic_t.)
2017-04-13 21:47:24 -04:00
Heikki Linnakangas
4f3b87ab78 Improve the SASL authentication protocol.
This contains some protocol changes to SASL authentiation (which is new
in v10):

* For future-proofing, in the AuthenticationSASL message that begins SASL
  authentication, provide a list of SASL mechanisms that the server
  supports, for the client to choose from. Currently, it's always just
  SCRAM-SHA-256.

* Add a separate authentication message type for the final server->client
  SASL message, which the client doesn't need to respond to. This makes
  it unambiguous whether the client is supposed to send a response or not.
  The SASL mechanism should know that anyway, but better to be explicit.

Also, in the server, support clients that don't send an Initial Client
response in the first SASLInitialResponse message. The server is supposed
to first send an empty request in that case, to which the client will
respond with the data that usually comes in the Initial Client Response.
libpq uses the Initial Client Response field and doesn't need this, and I
would assume any other sensible implementation to use Initial Client
Response, too, but let's follow the SASL spec.

Improve the documentation on SASL authentication in protocol. Add a
section describing the SASL message flow, and some details on our
SCRAM-SHA-256 implementation.

Document the different kinds of PasswordMessages that the frontend sends
in different phases of SASL authentication, as well as GSS/SSPI
authentication as separate message formats. Even though they're all 'p'
messages, and the exact format depends on the context, describing them as
separate message formats makes the documentation more clear.

Reviewed by Michael Paquier and Álvaro Hernández Tortosa.

Discussion: https://www.postgresql.org/message-id/CAB7nPqS-aFg0iM3AQOJwKDv_0WkAedRjs1W2X8EixSz+sKBXCQ@mail.gmail.com
2017-04-13 19:34:16 +03:00
Alvaro Herrera
27bcc372b1 Catversion bump forgotten in previous commit 2017-04-13 11:54:28 -03:00
Tom Lane
16ebab6886 Avoid transferring parallel-unsafe subplans to parallel workers.
Commit 5e6d8d2bb allowed parallel workers to execute parallel-safe
subplans, but it transmitted the query's entire list of subplans to
the worker(s).  Since execMain.c blindly does ExecInitNode and later
ExecEndNode on every list element, this resulted in parallel-unsafe plan
nodes nonetheless getting started up and shut down in parallel workers.
That seems mostly harmless as far as core plan node types go (but
maybe not so much for Gather?).  But it resulted in postgres_fdw
opening and then closing extra remote connections, and it's likely
that other non-parallel-safe FDWs or custom scan providers would have
worse reactions.

To fix, just make ExecSerializePlan replace parallel-unsafe subplans
with NULLs in the cut-down plan tree that it transmits to workers.
This relies on ExecInitNode and ExecEndNode to do nothing on NULL
input, but they do anyway.  If anything else is touching the dropped
subplans in a parallel worker, that would be a bug to be fixed.
(This thus provides a strong guarantee that we won't try to do
something with a parallel-unsafe subplan in a worker.)

This is, I think, the last fix directly occasioned by Andreas Seltenreich's
bug report of a few days ago.

Tom Lane and Amit Kapila

Discussion: https://postgr.es/m/87tw5x4vcu.fsf@credativ.de
2017-04-12 16:07:00 -04:00
Tom Lane
003d80f3df Mark finished Plan nodes with parallel_safe flags.
We'd managed to avoid doing this so far, but it seems pretty obvious
that it would be forced on us some day, and this is much the cleanest
way of approaching the open problem that parallel-unsafe subplans are
being transmitted to parallel workers.  Anyway there's no space cost
due to alignment considerations, and the time cost is pretty minimal
since we're just copying the flag from the corresponding Path node.
(At least in most cases ... some of the klugier spots in createplan.c
have to work a bit harder.)

In principle we could perhaps get rid of SubPlan.parallel_safe,
but I thought it better to keep that in case there are reasons to
consider a SubPlan unsafe even when its child plan is parallel-safe.

This patch doesn't actually do anything with the new flags, but
I thought I'd commit it separately anyway.

Note: although this touches outfuncs/readfuncs, there's no need for
a catversion bump because Plan trees aren't stored on disk.

Discussion: https://postgr.es/m/87tw5x4vcu.fsf@credativ.de
2017-04-12 15:13:34 -04:00
Robert Haas
6599c9ac33 Add an Assert() to max_parallel_workers enforcement.
To prevent future bugs along the lines of the one corrected by commit
8ff518699f, or find any that remain
in the current code, add an Assert() that the difference between
parallel_register_count and parallel_terminate_count is in a sane
range.

Kuntal Ghosh, with considerable tidying-up by me, per a suggestion
from Neha Khatri.  Reviewed by Tomas Vondra.

Discussion: http://postgr.es/m/CAFO0U+-E8yzchwVnvn5BeRDPgX2z9vZUxQ8dxx9c0XFGBC7N1Q@mail.gmail.com
2017-04-11 13:03:44 -04:00
Fujii Masao
ff7bce1743 Add max_sync_workers_per_subscription to postgresql.conf.sample.
This commit also does

- add REPLICATION_SUBSCRIBERS into config_group
- mark max_logical_replication_workers and max_sync_workers_per_subscription
  as REPLICATION_SUBSCRIBERS parameters
- move those parameters into "Subscribers" section in postgresql.conf.sample

Author: Masahiko Sawada, Petr Jelinek and me
Reported-by: Masahiko Sawada
Discussion: http://postgr.es/m/CAD21AoAonSCoa=v=87ZO3vhfUZA1k_E2XRNHTt=xioWGUa+0ug@mail.gmail.com
2017-04-12 00:10:54 +09:00
Magnus Hagander
a4777f3556 Remove symbol WIN32_ONLY_COMPILER
This used to mean "Visual C++ except in those parts where Borland C++
was supported where it meant one of those". Now that we don't support
Borland C++ anymore, simplify by using _MSC_VER which is the normal way
to detect Visual C++.
2017-04-11 15:22:21 +02:00
Magnus Hagander
6da56f3f84 Remove support for bcc and msvc standalone libpq builds
This removes the support for building just libpq using Borland C++ or
Visual C++. This has not worked properly for years, and given the number
of complaints it's clearly not worth the maintenance burden.

Building libpq using the standard MSVC build system is of course still
supported, along with mingw.
2017-04-11 15:22:21 +02:00
Tom Lane
8f0530f580 Improve castNode notation by introducing list-extraction-specific variants.
This extends the castNode() notation introduced by commit 5bcab1114 to
provide, in one step, extraction of a list cell's pointer and coercion to
a concrete node type.  For example, "lfirst_node(Foo, lc)" is the same
as "castNode(Foo, lfirst(lc))".  Almost half of the uses of castNode
that have appeared so far include a list extraction call, so this is
pretty widely useful, and it saves a few more keystrokes compared to the
old way.

As with the previous patch, back-patch the addition of these macros to
pg_list.h, so that the notation will be available when back-patching.

Patch by me, after an idea of Andrew Gierth's.

Discussion: https://postgr.es/m/14197.1491841216@sss.pgh.pa.us
2017-04-10 13:51:53 -04:00
Peter Eisentraut
26ad194cb0 Support configuration reload in logical replication workers
Author: Michael Paquier <michael.paquier@gmail.com>
Reviewed-by: Petr Jelinek <petr.jelinek@2ndquadrant.com>
Reported-by: Fujii Masao <masao.fujii@gmail.com>
2017-04-10 13:42:21 -04:00
Robert Haas
c0a8ae7be3 Fix reporting of violations in ExecConstraints, again.
We decided in f1b4c771ea to pass the
original slot to ExecConstraints(), but that breaks when there are
BEFORE ROW triggers involved.  So we need to do reverse-map the tuples
back to the original descriptor instead, as Amit originally proposed.

Amit Langote, reviewed by Ashutosh Bapat.  One overlooked comment
fixed by me.

Discussion: http://postgr.es/m/b3a17254-6849-e542-2353-bde4e880b6a4@lab.ntt.co.jp
2017-04-10 12:20:08 -04:00
Tom Lane
511540dadf Move isolationtester's is-blocked query into C code for speed.
Commit 4deb41381 modified isolationtester's query to see whether a
session is blocked to also check for waits occurring in GetSafeSnapshot.
However, it did that in a way that enormously increased the query's
runtime under CLOBBER_CACHE_ALWAYS, causing the buildfarm members
that use that to run about four times slower than before, and in some
cases fail entirely.  To fix, push the entire logic into a dedicated
backend function.  This should actually reduce the CLOBBER_CACHE_ALWAYS
runtime from what it was previously, though I've not checked that.

In passing, expose a SQL function to check for safe-snapshot blockage,
comparable to pg_blocking_pids.  This is more or less free given the
infrastructure built to solve the other problem, so we might as well.

Thomas Munro

Discussion: https://postgr.es/m/20170407165749.pstcakbc637opkax@alap3.anarazel.de
2017-04-10 10:26:54 -04:00
Kevin Grittner
c63172d60f Add GUCs for predicate lock promotion thresholds.
Defaults match the fixed behavior of prior releases, but now DBAs
have better options to tune serializable workloads.

It might be nice to be able to set this per relation, but that part
will need to wait for another release.

Author: Dagfinn Ilmari Mannsåker
2017-04-07 21:38:05 -05:00
Tom Lane
9c7f5229ad Optimize joins when the inner relation can be proven unique.
If there can certainly be no more than one matching inner row for a given
outer row, then the executor can move on to the next outer row as soon as
it's found one match; there's no need to continue scanning the inner
relation for this outer row.  This saves useless scanning in nestloop
and hash joins.  In merge joins, it offers the opportunity to skip
mark/restore processing, because we know we have not advanced past the
first possible match for the next outer row.

Of course, the devil is in the details: the proof of uniqueness must
depend only on joinquals (not otherquals), and if we want to skip
mergejoin mark/restore then it must depend only on merge clauses.
To avoid adding more planning overhead than absolutely necessary,
the present patch errs in the conservative direction: there are cases
where inner_unique or skip_mark_restore processing could be used, but
it will not do so because it's not sure that the uniqueness proof
depended only on "safe" clauses.  This could be improved later.

David Rowley, reviewed and rather heavily editorialized on by me

Discussion: https://postgr.es/m/CAApHDvqF6Sw-TK98bW48TdtFJ+3a7D2mFyZ7++=D-RyPsL76gw@mail.gmail.com
2017-04-07 22:20:13 -04:00
Andres Freund
f13a9121f9 Fix issues in e8fdbd58fe.
When the 64bit atomics simulation is in use, we can't necessarily
guarantee the correct alignment of the atomics due to lack of compiler
support for doing so- that's fine from a safety perspective, because
everything is protected by a lock, but we asserted the alignment in
all cases.  Weaken them.  Per complaint from Alvaro Herrera.

My #ifdefery for PG_HAVE_8BYTE_SINGLE_COPY_ATOMICITY wasn't
sufficient. Fix that.  Per complaint from Alexander Korotkov.
2017-04-07 17:09:03 -07:00
Alvaro Herrera
8bf74967da Reduce the number of pallocs() in BRIN
Instead of allocating memory in brin_deform_tuple and brin_copy_tuple
over and over during a scan, allow reuse of previously allocated memory.
This is said to make for a measurable performance improvement.

Author: Jinyu Zhang, Álvaro Herrera
Reviewed by: Tomas Vondra
Discussion: https://postgr.es/m/495deb78.4186.1500dacaa63.Coremail.beijing_pg@163.com
2017-04-07 19:08:43 -03:00
Andres Freund
e8fdbd58fe Improve 64bit atomics support.
When adding atomics back in b64d92f1a, I added 64bit support as
optional; there wasn't yet a direct user in sight.  That turned out to
be a bit short-sighted, it'd already have been useful a number of times.

Add a fallback implementation of 64bit atomics, just like the one we
have for 32bit atomics.

Additionally optimize reads/writes to 64bit on a number of platforms
where aligned writes of that size are atomic. This can now be tested
with PG_HAVE_8BYTE_SINGLE_COPY_ATOMICITY.

Author: Andres Freund
Reviewed-By: Amit Kapila
Discussion: https://postgr.es/m/20160330230914.GH13305@awork2.anarazel.de
2017-04-07 14:48:11 -07:00
Peter Eisentraut
0cb2e51992 Avoid using a C++ keyword in header file
per cpluspluscheck
2017-04-07 16:32:02 -04:00
Alvaro Herrera
817cb10013 Fix new BRIN desummarize WAL record
The WAL-writing piece was forgetting to set the pages-per-range value.
Also, fix the declared type of struct member heapBlk, which I mistakenly
set as OffsetNumber rather than BlockNumber.

Problem was introduced by commit c655899ba9 (April 1st).  Any system
that tries to replay the new WAL record written before this fix is
likely to die on replay and require pg_resetwal.

Reported by Tom Lane.
Discussion: https://postgr.es/m/20191.1491524824@sss.pgh.pa.us
2017-04-07 17:11:56 -03:00
Robert Haas
d4116a7719 Add ProcArrayGroupUpdate wait event.
Discussion: http://postgr.es/m/CA+TgmobgWHcXDcChX2+BqJDk2dkPVF85ZrJFhUyHHQmw8diTpA@mail.gmail.com
2017-04-07 13:41:47 -04:00
Heikki Linnakangas
60f11b87a2 Use SASLprep to normalize passwords for SCRAM authentication.
An important step of SASLprep normalization, is to convert the string to
Unicode normalization form NFKC. Unicode normalization requires a fairly
large table of character decompositions, which is generated from data
published by the Unicode consortium. The script to generate the table is
put in src/common/unicode, as well test code for the normalization.
A pre-generated version of the tables is included in src/include/common,
so you don't need the code in src/common/unicode to build PostgreSQL, only
if you wish to modify the normalization tables.

The SASLprep implementation depends on the UTF-8 functions from
src/backend/utils/mb/wchar.c. So to use it, you must also compile and link
that. That doesn't change anything for the current users of these
functions, the backend and libpq, as they both already link with wchar.o.
It would be good to move those functions into a separate file in
src/commmon, but I'll leave that for another day.

No documentation changes included, because there is no details on the
SCRAM mechanism in the docs anyway. An overview on that in the protocol
specification would probably be good, even though SCRAM is documented in
detail in RFC5802. I'll write that as a separate patch. An important thing
to mention there is that we apply SASLprep even on invalid UTF-8 strings,
to support other encodings.

Patch by Michael Paquier and me.

Discussion: https://www.postgresql.org/message-id/CAB7nPqSByyEmAVLtEf1KxTRh=PWNKiWKEKQR=e1yGehz=wbymQ@mail.gmail.com
2017-04-07 14:56:05 +03:00
Simon Riggs
ac2b095088 Reset API of clause_selectivity()
Discussion: https://postgr.es/m/CAKJS1f9yurJQW9pdnzL+rmOtsp2vOytkpXKGnMFJEO-qz5O5eA@mail.gmail.com
2017-04-06 19:10:51 -04:00
Andres Freund
fa117ee403 Allow avoiding tuple copy within tuplesort_gettupleslot().
Add a "copy" argument to make it optional to receive a copy of caller
tuple that is safe to use following a subsequent manipulating of
tuplesort's state.  This is a performance optimization.  Most existing
tuplesort_gettupleslot() callers are made to opt out of copying.
Existing callers that happen to rely on the validity of tuple memory
beyond subsequent manipulations of the tuplesort request their own
copy.

This brings tuplesort_gettupleslot() in line with
tuplestore_gettupleslot().  In the future, a "copy"
tuplesort_getdatum() argument may be added, that similarly allows
callers to opt out of receiving their own copy of tuple.

In passing, clarify assumptions that callers of other tuplesort fetch
routines may make about tuple memory validity, per gripe from Tom
Lane.

Author: Peter Geoghegan
Discussion: CAM3SWZQWZZ_N=DmmL7tKy_OUjGH_5mN=N=A6h7kHyyDvEhg2DA@mail.gmail.com
2017-04-06 14:48:59 -07:00
Alvaro Herrera
7e534adcdc Fix BRIN cost estimation
The original code was overly optimistic about the cost of scanning a
BRIN index, leading to BRIN indexes being selected when they'd be a
worse choice than some other index.  This complete rewrite should be
more accurate.

Author: David Rowley, based on an earlier patch by Emre Hasegeli
Reviewed-by: Emre Hasegeli
Discussion: https://postgr.es/m/CAKJS1f9n-Wapop5Xz1dtGdpdqmzeGqQK4sV2MK-zZugfC14Xtw@mail.gmail.com
2017-04-06 17:51:53 -03:00
Peter Eisentraut
5f21f5292c Mark immutable functions in information schema as parallel safe
Also add opr_sanity check that all preloaded immutable functions are
parallel safe.  (Per discussion, this does not necessarily have to be
true for all possible such functions, but deviations would be unlikely
enough that maintaining such a test is reasonable.)

Reported-by: David Rowley <david.rowley@2ndquadrant.com>
Reviewed-by: Robert Haas <robertmhaas@gmail.com>
Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
2017-04-06 14:30:13 -04:00
Peter Eisentraut
e6c9a5a9bc Fix mixup of bool and ternary value
Not currently a problem, but could be with stricter bool behavior under
stdbool or C++.

Reviewed-by: Andres Freund <andres@anarazel.de>
2017-04-06 13:09:42 -04:00
Alvaro Herrera
b1fc51a36e Comment fixes for extended statistics
Clean up some code comments in new extended statistics code, from
7b504eb282.
2017-04-06 12:28:50 -03:00
Heikki Linnakangas
07044efe00 Remove bogus SCRAM_ITERATION_LEN constant.
It was not used for what the comment claimed, at all. It was actually used
as the 'base' argument to strtol(), when reading the iteration count. We
don't need a constant for base-10, so remove it.
2017-04-06 17:41:48 +03:00
Simon Riggs
cd0cebaf7d Always SnapshotResetXmin() during ClearTransaction()
Avoid corner cases during 2PC with 6bad580d9e
2017-04-06 10:30:22 -04:00
Peter Eisentraut
3217327053 Identity columns
This is the SQL standard-conforming variant of PostgreSQL's serial
columns.  It fixes a few usability issues that serial columns have:

- CREATE TABLE / LIKE copies default but refers to same sequence
- cannot add/drop serialness with ALTER TABLE
- dropping default does not drop sequence
- need to grant separate privileges to sequence
- other slight weirdnesses because serial is some kind of special macro

Reviewed-by: Vitaly Burovoy <vitaly.burovoy@gmail.com>
2017-04-06 08:41:37 -04:00
Simon Riggs
6bad580d9e Avoid SnapshotResetXmin() during AtEOXact_Snapshot()
For normal commits and aborts we already reset PgXact->xmin,
so we can simply avoid running SnapshotResetXmin() twice.

During performance tests by Alexander Korotkov, diagnosis
by Andres Freund showed PgXact array as a bottleneck. After
manual analysis by me of the code paths that touch those
memory locations, I was able to identify extraneous code
in the main transaction commit path.

Avoiding touching highly contented shmem improves concurrent
performance slightly on all workloads, confirmed by tests
run by Ashutosh Sharma and Alexander Korotkov.

Simon Riggs

Discussion: CANP8+jJdXE9b+b9F8CQT-LuxxO0PBCB-SZFfMVAdp+akqo4zfg@mail.gmail.com
2017-04-06 08:31:52 -04:00
Heikki Linnakangas
fd01983594 Remove dead code and fix comments in fast-path function handling.
HandleFunctionRequest() is no longer responsible for reading the protocol
message from the client, since commit 2b3a8b20c2. Fix the outdated
comments.

HandleFunctionRequest() now always returns 0, because the code that used
to return EOF was moved in 2b3a8b20c2. Therefore, the caller no longer
needs to check the return value.

Reported by Andres Freund. Backpatch to all supported versions, even though
this doesn't have any user-visible effect, to make backporting future
patches in this area easier.

Discussion: https://www.postgresql.org/message-id/20170405010525.rt5azbya5fkbhvrx@alap3.anarazel.de
2017-04-06 09:09:39 +03:00
Tom Lane
df1a699e5b Fix integer-overflow problems in interval comparison.
When using integer timestamps, the interval-comparison functions tried
to compute the overall magnitude of an interval as an int64 number of
microseconds.  As reported by Frazer McLean, this overflows for intervals
exceeding about 296000 years, which is bad since we nominally allow
intervals many times larger than that.  That results in wrong comparison
results, and possibly in corrupted btree indexes for columns containing
such large interval values.

To fix, compute the magnitude as int128 instead.  Although some compilers
have native support for int128 calculations, many don't, so create our
own support functions that can do 128-bit addition and multiplication
if the compiler support isn't there.  These support functions are designed
with an eye to allowing the int128 code paths in numeric.c to be rewritten
for use on all platforms, although this patch doesn't do that, or even
provide all the int128 primitives that will be needed for it.

Back-patch as far as 9.4.  Earlier releases did not guard against overflow
of interval values at all (commit 146604ec4 fixed that), so it seems not
very exciting to worry about overly-large intervals for them.

Before 9.6, we did not assume that unreferenced "static inline" functions
would not draw compiler warnings, so omit functions not directly referenced
by timestamp.c, the only present consumer of int128.h.  (We could have
omitted these functions in HEAD too, but since they were written and
debugged on the way to the present patch, and they look likely to be needed
by numeric.c, let's keep them in HEAD.)  I did not bother to try to prevent
such warnings in a --disable-integer-datetimes build, though.

Before 9.5, configure will never define HAVE_INT128, so the part of
int128.h that exploits a native int128 implementation is dead code in the
9.4 branch.  I didn't bother to remove it, thinking that keeping the file
looking similar in different branches is more useful.

In HEAD only, add a simple test harness for int128.h in src/tools/.

In back branches, this does not change the float-timestamps code path.
That's not subject to the same kind of overflow risk, since it computes
the interval magnitude as float8.  (No doubt, when this code was originally
written, overflow was disregarded for exactly that reason.)  There is a
precision hazard instead :-(, but we'll avert our eyes from that question,
since no complaints have been reported and that code's deprecated anyway.

Kyotaro Horiguchi and Tom Lane

Discussion: https://postgr.es/m/1490104629.422698.918452336.26FA96B7@webmail.messagingengine.com
2017-04-05 23:51:27 -04:00
Simon Riggs
2686ee1b7c Collect and use multi-column dependency stats
Follow on patch in the multi-variate statistics patch series.

CREATE STATISTICS s1 WITH (dependencies) ON (a, b) FROM t;
ANALYZE;
will collect dependency stats on (a, b) and then use the measured
dependency in subsequent query planning.

Commit 7b504eb282 added
CREATE STATISTICS with n-distinct coefficients. These are now
specified using the mutually exclusive option WITH (ndistinct).

Author: Tomas Vondra, David Rowley
Reviewed-by: Kyotaro HORIGUCHI, Álvaro Herrera, Dean Rasheed, Robert Haas
and many other comments and contributions
Discussion: https://postgr.es/m/56f40b20-c464-fad2-ff39-06b668fac47c@2ndquadrant.com
2017-04-05 18:00:42 -04:00
Peter Eisentraut
afd79873a0 Capitalize names of PLs consistently
Author: Daniel Gustafsson <daniel@yesql.se>
2017-04-05 00:38:25 -04:00
Peter Eisentraut
193f5f9e91 pageinspect: Add bt_page_items function with bytea argument
Author: Tomas Vondra <tomas.vondra@2ndquadrant.com>
Reviewed-by: Ashutosh Sharma <ashu.coek88@gmail.com>
2017-04-04 23:52:55 -04:00
Kevin Grittner
5ebeb579b9 Follow-on cleanup for the transition table patch.
Commit 59702716 added transition table support to PL/pgsql so that
SQL queries in trigger functions could access those transient
tables.  In order to provide the same level of support for PL/perl,
PL/python and PL/tcl, refactor the relevant code into a new
function SPI_register_trigger_data.  Call the new function in the
trigger handler of all four PLs, and document it as a public SPI
function so that authors of out-of-tree PLs can do the same.

Also get rid of a second QueryEnvironment object that was
maintained by PL/pgsql.  That was previously used to deal with
cursors, but the same approach wasn't appropriate for PLs that are
less tangled up with core code.  Instead, have SPI_cursor_open
install the connection's current QueryEnvironment, as already
happens for SPI_execute_plan.

While in the docs, remove the note that transition tables were only
supported in C and PL/pgSQL triggers, and correct some ommissions.

Thomas Munro with some work by Kevin Grittner (mostly docs)
2017-04-04 18:36:39 -05:00
Simon Riggs
9a3215026b Make min_wal_size/max_wal_size use MB internally
Previously they were defined using multiples of XLogSegSize.
Remove GUC_UNIT_XSEGS. Introduce GUC_UNIT_MB

Extracted from patch series on XLogSegSize infrastructure.

Beena Emerson
2017-04-04 18:00:01 -04:00
Simon Riggs
728bd991c3 Speedup 2PC recovery by skipping two phase state files in normal path
2PC state info held in shmem at PREPARE, then cleaned at COMMIT PREPARED/ABORT PREPARED,
avoiding writing/fsyncing any state information to disk in the normal path, greatly enhancing replay speed.
Prepared transactions that live past one checkpoint redo horizon will be written to disk as now.
Similar conceptually to 978b2f65aa and building upon
the infrastructure created by that commit.

Authors, in equal measure: Stas Kelvich, Nikhil Sontakke and Michael Paquier
Discussion: https://postgr.es/m/CAMGcDxf8Bn9ZPBBJZba9wiyQq-Qk5uqq=VjoMnRnW5s+fKST3w@mail.gmail.com
2017-04-04 15:56:56 -04:00
Robert Haas
ea69a0dead Expand hash indexes more gradually.
Since hash indexes typically have very few overflow pages, adding a
new splitpoint essentially doubles the on-disk size of the index,
which can lead to large and abrupt increases in disk usage (and
perhaps long delays on occasion).  To mitigate this problem to some
degree, divide larger splitpoints into four equal phases.  This means
that, for example, instead of growing from 4GB to 8GB all at once, a
hash index will now grow from 4GB to 5GB to 6GB to 7GB to 8GB, which
is perhaps still not as smooth as we'd like but certainly an
improvement.

This changes the on-disk format of the metapage, so bump HASH_VERSION
from 2 to 3.  This will force a REINDEX of all existing hash indexes,
but that's probably a good idea anyway.  First, hash indexes from
pre-10 versions of PostgreSQL could easily be corrupted, and we don't
want to confuse corruption carried over from an older release with any
corruption caused despite the new write-ahead logging in v10.  Second,
it will let us remove some backward-compatibility code added by commit
293e24e507.

Mithun Cy, reviewed by Amit Kapila, Jesper Pedersen and me.  Regression
test outputs updated by me.

Discussion: http://postgr.es/m/CAD__OuhG6F1gQLCgMQNnMNgoCvOLQZz9zKYJQNYvYmmJoM42gA@mail.gmail.com
Discussion: http://postgr.es/m/CA+TgmoYty0jCf-pa+m+vYUJ716+AxM7nv_syvyanyf5O-L_i2A@mail.gmail.com
2017-04-03 23:46:33 -04:00
Robert Haas
7a39b5e4d1 Abstract logic to allow for multiple kinds of child rels.
Currently, the only type of child relation is an "other member rel",
which is the child of a baserel, but in the future joins and even
upper relations may have child rels.  To facilitate that, introduce
macros that test to test for particular RelOptKind values, and use
them in various places where they help to clarify the sense of a test.
(For example, a test may allow RELOPT_OTHER_MEMBER_REL either because
it intends to allow child rels, or because it intends to allow simple
rels.)

Also, remove find_childrel_top_parent, which will not work for a
child rel that is not a baserel.  Instead, add a new RelOptInfo
member top_parent_relids to track the same kind of information in a
more generic manner.

Ashutosh Bapat, slightly tweaked by me.  Review and testing of the
patch set from which this was taken by Rajkumar Raghuwanshi and Rafia
Sabih.

Discussion: http://postgr.es/m/CA+TgmoagTnF2yqR3PT2rv=om=wJiZ4-A+ATwdnriTGku1CLYxA@mail.gmail.com
2017-04-03 22:41:31 -04:00
Peter Eisentraut
9fa6e08d4a Make header self-contained
Add necessary include files for things used in the header.
2017-04-03 16:17:45 -04:00
Tom Lane
f833c847b8 Allow psql variable substitution to occur in backtick command strings.
Previously, text between backquotes in a psql metacommand's arguments
was always passed to the shell literally.  That considerably hobbles
the usefulness of the feature for scripting, so we'd foreseen for a long
time that we'd someday want to allow substitution of psql variables into
the shell command.  IMO the addition of \if metacommands has brought us to
that point, since \if can greatly benefit from some sort of client-side
expression evaluation capability, and psql itself is not going to grow any
such thing in time for v10.  Hence, this patch.  It allows :VARIABLE to be
replaced by the exact contents of the named variable, while :'VARIABLE'
is replaced by the variable's contents suitably quoted to become a single
shell-command argument.  (The quoting rules for that are different from
those for SQL literals, so this is a bit of an abuse of the :'VARIABLE'
notation, but I doubt anyone will be confused.)

As with other situations in psql, no substitution occurs if the word
following a colon is not a known variable name.  That limits the risk of
compatibility problems for existing psql scripts; but the risk isn't zero,
so this needs to be called out in the v10 release notes.

Discussion: https://postgr.es/m/9561.1490895211@sss.pgh.pa.us
2017-04-01 21:44:54 -04:00
Kevin Grittner
41bd155dd6 Fix two undocumented parameters to functions from ENR patch.
On ProcessUtility document the parameter, to match others.

On CreateCachedPlan drop the queryEnv parameter.  It was not
referenced within the function, and had been added on the
assumption that with some unknown future usage of QueryEnvironment
it might be useful to do something there.  We have avoided other
"just in case" implementation of unused paramters, so drop it here.

Per gripe from Tom Lane
2017-04-01 15:21:05 -05:00
Alvaro Herrera
c655899ba9 BRIN de-summarization
When the BRIN summary tuple for a page range becomes too "wide" for the
values actually stored in the table (because the tuples that were
present originally are no longer present due to updates or deletes), it
can be useful to remove the outdated summary tuple, so that a future
summarization can install a tighter summary.

This commit introduces a SQL-callable interface to do so.

Author: Álvaro Herrera
Reviewed-by: Eiji Seki
Discussion: https://postgr.es/m/20170228045643.n2ri74ara4fhhfxf@alvherre.pgsql
2017-04-01 16:10:04 -03:00
Alvaro Herrera
7526e10224 BRIN auto-summarization
Previously, only VACUUM would cause a page range to get initially
summarized by BRIN indexes, which for some use cases takes too much time
since the inserts occur.  To avoid the delay, have brininsert request a
summarization run for the previous range as soon as the first tuple is
inserted into the first page of the next range.  Autovacuum is in charge
of processing these requests, after doing all the regular vacuuming/
analyzing work on tables.

This doesn't impose any new tasks on autovacuum, because autovacuum was
already in charge of doing summarizations.  The only actual effect is to
change the timing, i.e. that it occurs earlier.  For this reason, we
don't go any great lengths to record these requests very robustly; if
they are lost because of a server crash or restart, they will happen at
a later time anyway.

Most of the new code here is in autovacuum, which can now be told about
"work items" to process.  This can be used for other things such as GIN
pending list cleaning, perhaps visibility map bit setting, both of which
are currently invoked during vacuum, but do not really depend on vacuum
taking place.

The requests are at the page range level, a granularity for which we did
not have SQL-level access; we only had index-level summarization
requests via brin_summarize_new_values().  It seems reasonable to add
SQL-level access to range-level summarization too, so add a function
brin_summarize_range() to do that.

Authors: Álvaro Herrera, based on sketch from Simon Riggs.
Reviewed-by: Thomas Munro.
Discussion: https://postgr.es/m/20170301045823.vneqdqkmsd4as4ds@alvherre.pgsql
2017-04-01 14:00:53 -03:00
Kevin Grittner
18ce3a4ab2 Add infrastructure to support EphemeralNamedRelation references.
A QueryEnvironment concept is added, which allows new types of
objects to be passed into queries from parsing on through
execution.  At this point, the only thing implemented is a
collection of EphemeralNamedRelation objects -- relations which
can be referenced by name in queries, but do not exist in the
catalogs.  The only type of ENR implemented is NamedTuplestore, but
provision is made to add more types fairly easily.

An ENR can carry its own TupleDesc or reference a relation in the
catalogs by relid.

Although these features can be used without SPI, convenience
functions are added to SPI so that ENRs can easily be used by code
run through SPI.

The initial use of all this is going to be transition tables in
AFTER triggers, but that will be added to each PL as a separate
commit.

An incidental effect of this patch is to produce a more informative
error message if an attempt is made to modify the contents of a CTE
from a referencing DML statement.  No tests previously covered that
possibility, so one is added.

Kevin Grittner and Thomas Munro
Reviewed by Heikki Linnakangas, David Fetter, and Thomas Munro
with valuable comments and suggestions from many others
2017-03-31 23:17:18 -05:00
Robert Haas
7d8f6986b8 Fix parallel query so it doesn't spoil row estimates above Gather.
Commit 45be99f8cd removed GatherPath's
num_workers field, but this is entirely bogus.  Normally, a path's
parallel_workers flag is supposed to indicate the number of workers
that it wants, and should be 0 for a non-partial path.  In that
commit, I mistakenly thought that GatherPath could also use that field
to indicate the number of workers that it would try to start, but
that's disastrous, because then it can propagate up to higher nodes in
the plan tree, which will then get incorrect rowcounts because the
parallel_workers flag is involved in computing those values.  Repair
by putting the separate field back.

Report by Tomas Vondra.  Patch by me, reviewed by Amit Kapila.

Discussion: http://postgr.es/m/f91b4a44-f739-04bd-c4b6-f135bd643669@2ndquadrant.com
2017-03-31 21:01:20 -04:00
Robert Haas
2113ac4cbb Don't use bgw_main even to specify in-core bgworker entrypoints.
On EXEC_BACKEND builds, this can fail if ASLR is in use.

Backpatch to 9.5.  On master, completely remove the bgw_main field
completely, since there is no situation in which it is safe for an
EXEC_BACKEND build.  On 9.6 and 9.5, leave the field intact to avoid
breaking things for third-party code that doesn't care about working
under EXEC_BACKEND.  Prior to 9.5, there are no in-core bgworker
entrypoints.

Petr Jelinek, reviewed by me.

Discussion: http://postgr.es/m/09d8ad33-4287-a09b-a77f-77f8761adb5e@2ndquadrant.com
2017-03-31 20:43:32 -04:00
Robert Haas
c94e6942ce Don't allocate storage for partitioned tables.
Also, don't allow setting reloptions on them, since that would have no
effect given the lack of storage.  The patch does this by introducing
a new reloption kind for which there are currently no reloptions -- we
might have some in the future -- so it adjusts parseRelOptions to
handle that case correctly.

Bumped catversion.  System catalogs that contained reloptions for
partitioned tables are no longer valid; plus, there are now fewer
physical files on disk, which is not technically a catalog change but
still a good reason to re-initdb.

Amit Langote, reviewed by Maksim Milyutin and Kyotaro Horiguchi and
revised a bit by me.

Discussion: http://postgr.es/m/20170331.173326.212311140.horiguchi.kyotaro@lab.ntt.co.jp
2017-03-31 16:28:51 -04:00
Andrew Dunstan
e306df7f9c Full Text Search support for json and jsonb
The new functions are ts_headline() and to_tsvector.

Dmitry Dolgov, edited and documented by me.
2017-03-31 14:26:03 -04:00
Andrew Dunstan
c80b9920fc Transform or iterate over json(b) string values
Dmitry Dolgov, reviewed and lightly edited by me.
2017-03-31 14:25:25 -04:00
Simon Riggs
25fff40798 Default monitoring roles
Three nologin roles with non-overlapping privs are created by default
* pg_read_all_settings - read all GUCs.
* pg_read_all_stats - pg_stat_*, pg_database_size(), pg_tablespace_size()
* pg_stat_scan_tables - may lock/scan tables

Top level role - pg_monitor includes all of the above by default, plus others

Author: Dave Page
Reviewed-by: Stephen Frost, Robert Haas, Peter Eisentraut, Simon Riggs
2017-03-30 14:18:53 -04:00
Andres Freund
5ded4bd214 Remove support for version-0 calling conventions.
The V0 convention is failure prone because we've so far assumed that a
function is V0 if PG_FUNCTION_INFO_V1 is missing, leading to crashes
if a function was coded against the V1 interface.  V0 doesn't allow
proper NULL, SRF and toast handling.  V0 doesn't offer features that
V1 doesn't.

Thus remove V0 support and obsolete fmgr README contents relating to
it.

Author: Andres Freund, with contributions by Peter Eisentraut & Craig Ringer
Reviewed-By: Peter Eisentraut, Craig Ringer
Discussion: https://postgr.es/m/20161208213441.k3mbno4twhg2qf7g@alap3.anarazel.de
2017-03-30 06:25:46 -07:00
Teodor Sigaev
f90d23d0c5 Implement SortSupport for macaddr data type
Introduces a scheme to produce abbreviated keys for the macaddr type.
Bump catalog version.

Author: Brandur Leach
Reviewed-by: Julien Rouhaud, Peter Geoghegan

https://commitfest.postgresql.org/13/743/
2017-03-29 23:28:56 +03:00
Peter Eisentraut
4fdb8a82e3 Update copyright year in recently added files
Author: Masahiko Sawada <sawada.mshk@gmail.com>
2017-03-29 14:54:10 -04:00
Robert Haas
9a09527164 Mark more functions parallel-restricted.
Commit 61c2e1a95f allowed parallel
query to be used in more places, revealing via buildfarm member
mandrill that several functions intended to be called from triggers
were incorrectly marked parallel-safe rather than parallel-restricted.

Report by Tom Lane.  Patch by Rafia Sabih.  Reviewed by me.

Discussion: http://postgr.es/m/16061.1490479253@sss.pgh.pa.us
2017-03-29 10:59:27 -04:00
Peter Eisentraut
3582b223d4 Fix hardcoded typeof check result for Windows
The test result that I had blindly stipulated didn't work out on the
build farm, so disable the feature in Windows MSVC for now.
2017-03-29 08:55:34 -04:00
Peter Eisentraut
4cb824699e Cast result of copyObject() to correct type
copyObject() is declared to return void *, which allows easily assigning
the result independent of the input, but it loses all type checking.

If the compiler supports typeof or something similar, cast the result to
the input type.  This creates a greater amount of type safety.  In some
cases, where the result is assigned to a generic type such as Node * or
Expr *, new casts are now necessary, but in general casts are now
unnecessary in the normal case and indicate that something unusual is
happening.

Reviewed-by: Mark Dilger <hornschnorter@gmail.com>
2017-03-28 21:59:23 -04:00
Alvaro Herrera
ce96ce60ca Remove direct uses of ItemPointer.{ip_blkid,ip_posid}
There are no functional changes here; this simply encapsulates knowledge
of the ItemPointerData struct so that a future patch can change things
without more breakage.

All direct users of ip_blkid and ip_posid are changed to use existing
macros ItemPointerGetBlockNumber and ItemPointerGetOffsetNumber
respectively.  For callers where that's inappropriate (because they
Assert that the itempointer is is valid-looking), add
ItemPointerGetBlockNumberNoCheck and ItemPointerGetOffsetNumberNoCheck,
which lack the assertion but are otherwise identical.

Author: Pavan Deolasee
Discussion: https://postgr.es/m/CABOikdNnFon4cJiL=h1mZH3bgUeU+sWHuU4Yr8AB=j3A2p1GiA@mail.gmail.com
2017-03-28 19:02:23 -03:00
Teodor Sigaev
ab89e465cb Altering default privileges on schemas
Extend ALTER DEFAULT PRIVILEGES command to schemas.

Author: Matheus Oliveira
Reviewed-by: Petr Jelínek, Ashutosh Sharma

https://commitfest.postgresql.org/13/887/
2017-03-28 18:58:55 +03:00
Simon Riggs
ff539da316 Cleanup slots during drop database
Automatically drop all logical replication slots associated with a
database when the database is dropped. Previously we threw an ERROR
if a slot existed. Now we throw ERROR only if a slot is active in
the database being dropped.

Craig Ringer
2017-03-28 10:05:21 -04:00
Alvaro Herrera
6462238f0d Fix uninitialized memory propagation mistakes
Valgrind complains that some uninitialized bytes are being passed around
by the extended statistics code since commit 7b504eb282, as reported
by Andres Freund.  Silence it.

Tomas Vondra submitted a patch which he verified to fix the complaints
in his machine; however I messed with it a bit before pushing, so any
remaining problems are likely my (Álvaro's) fault.

Author: Tomas Vondra
Discussion: https://postgr.es/m/20170325211031.4xxoptigqxm2emn2@alap3.anarazel.de
2017-03-27 14:52:19 -03:00
Robert Haas
c4c51541e2 Still more code review for single-page hash vacuuming.
Most seriously, fix use of incorrect block ID, per a report from
Jeff Janes that it causes a crash and a diagnosis from Amit Kapila.

Improve consistency between the hash and btree versions of this
code by adding back a PANIC that btree has, and by registering
data in the xlog record in the same way, per complaints from
Jeff Janes and Amit Kapila.

Tidy up some minor cosmetic points, per complaints from Amit
Kapila.

Patch by Ashutosh Sharma, reviewed by Amit Kapila, and tested by
Jeff Janes.

Discussion: http://postgr.es/m/CAMkU=1w-9Qe=Ff1o6bSaXpNO9wqpo7_9GL8_CVhw4BoVVHasqg@mail.gmail.com
2017-03-27 12:51:10 -04:00
Teodor Sigaev
1b02be21f2 Fsync directory after creating or unlinking file.
If file was created/deleted just before powerloss it's possible that
file system will miss that. To prevent it, call fsync() where creating/
unlinkg file is critical.

Author: Michael Paquier
Reviewed-by: Ashutosh Bapat, Takayuki Tsunakawa, me
2017-03-27 19:33:01 +03:00
Andrew Gierth
b5635948ab Support hashed aggregation with grouping sets.
This extends the Aggregate node with two new features: HashAggregate
can now run multiple hashtables concurrently, and a new strategy
MixedAggregate populates hashtables while doing sorted grouping.

The planner will now attempt to save as many sorts as possible when
planning grouping sets queries, while not exceeding work_mem for the
estimated combined sizes of all hashtables used.  No SQL-level changes
are required.  There should be no user-visible impact other than the
new EXPLAIN output and possible changes to result ordering when ORDER
BY was not used (which affected a few regression tests).  The
enable_hashagg option is respected.

Author: Andrew Gierth
Reviewers: Mark Dilger, Andres Freund
Discussion: https://postgr.es/m/87vatszyhj.fsf@news-spur.riddles.org.uk
2017-03-27 04:20:54 +01:00
Robert Haas
fc70a4b0df Show more processes in pg_stat_activity.
Previously, auxiliary processes and background workers not connected
to a database (such as the logical replication launcher) weren't
shown.  Include them, so that we can see the associated wait state
information.  Add a new column to identify the processes type, so that
people can filter them out easily using SQL if they wish.

Before this patch was written, there was discussion about whether we
should expose this information in a separate view, so as to avoid
contaminating pg_stat_activity with things people might not want to
see.  But putting everything in pg_stat_activity was a more popular
choice, so that's what the patch does.

Kuntal Ghosh, reviewed by Amit Langote and Michael Paquier.  Some
revisions and bug fixes by me.

Discussion: http://postgr.es/m/CA+TgmoYES5nhkEGw9nZXU8_FhA8XEm8NTm3-SO+3ML1B81Hkww@mail.gmail.com
2017-03-26 22:02:22 -04:00
Tom Lane
2f0903ea19 Improve performance of ExecEvalWholeRowVar.
In commit b8d7f053c, we needed to fix ExecEvalWholeRowVar to not change
the state of the slot it's copying.  The initial quick hack at that
required two rounds of tuple construction, which is not very nice.
To fix, add another primitive to tuptoaster.c that does precisely what
we need.  (I initially tried to do this by refactoring one of the
existing functions into two pieces; but it looked like that might hurt
performance for the existing case, and the amount of code that could
be shared is not very large, so I gave up on that.)

Discussion: https://postgr.es/m/26088.1490315792@sss.pgh.pa.us
2017-03-26 19:14:57 -04:00
Peter Eisentraut
895f93701f Fix cpluspluscheck warning
Structure tag cannot be the same as a typedef that is a pointer to that
struct.
2017-03-26 18:31:05 -04:00
Tom Lane
5459cfd3ad Fix typos in logical replication support for initial data copy.
Fix an incorrect assert condition (noted by Coverity), and spell the new
name of the function correctly.  Typos introduced in commit 7c4f52409.

Michael Paquier
2017-03-26 17:44:35 -04:00
Andres Freund
b8d7f053c5 Faster expression evaluation and targetlist projection.
This replaces the old, recursive tree-walk based evaluation, with
non-recursive, opcode dispatch based, expression evaluation.
Projection is now implemented as part of expression evaluation.

This both leads to significant performance improvements, and makes
future just-in-time compilation of expressions easier.

The speed gains primarily come from:
- non-recursive implementation reduces stack usage / overhead
- simple sub-expressions are implemented with a single jump, without
  function calls
- sharing some state between different sub-expressions
- reduced amount of indirect/hard to predict memory accesses by laying
  out operation metadata sequentially; including the avoidance of
  nearly all of the previously used linked lists
- more code has been moved to expression initialization, avoiding
  constant re-checks at evaluation time

Future just-in-time compilation (JIT) has become easier, as
demonstrated by released patches intended to be merged in a later
release, for primarily two reasons: Firstly, due to a stricter split
between expression initialization and evaluation, less code has to be
handled by the JIT. Secondly, due to the non-recursive nature of the
generated "instructions", less performance-critical code-paths can
easily be shared between interpreted and compiled evaluation.

The new framework allows for significant future optimizations. E.g.:
- basic infrastructure for to later reduce the per executor-startup
  overhead of expression evaluation, by caching state in prepared
  statements.  That'd be helpful in OLTPish scenarios where
  initialization overhead is measurable.
- optimizing the generated "code". A number of proposals for potential
  work has already been made.
- optimizing the interpreter. Similarly a number of proposals have
  been made here too.

The move of logic into the expression initialization step leads to some
backward-incompatible changes:
- Function permission checks are now done during expression
  initialization, whereas previously they were done during
  execution. In edge cases this can lead to errors being raised that
  previously wouldn't have been, e.g. a NULL array being coerced to a
  different array type previously didn't perform checks.
- The set of domain constraints to be checked, is now evaluated once
  during expression initialization, previously it was re-built
  every time a domain check was evaluated. For normal queries this
  doesn't change much, but e.g. for plpgsql functions, which caches
  ExprStates, the old set could stick around longer.  The behavior
  around might still change.

Author: Andres Freund, with significant changes by Tom Lane,
	changes by Heikki Linnakangas
Reviewed-By: Tom Lane, Heikki Linnakangas
Discussion: https://postgr.es/m/20161206034955.bh33paeralxbtluv@alap3.anarazel.de
2017-03-25 14:52:06 -07:00
Simon Riggs
5737c12df0 Report catalog_xmin separately in hot_standby_feedback
If the upstream walsender is using a physical replication slot, store the
catalog_xmin in the slot's catalog_xmin field. If the upstream doesn't use a
slot and has only a PGPROC entry behaviour doesn't change, as we store the
combined xmin and catalog_xmin in the PGPROC entry.

Author: Craig Ringer
2017-03-25 14:07:27 +00:00
Peter Eisentraut
7678fe1c6d Make header self-contained
Add necessary include files for things used in the header.
2017-03-24 23:36:10 -04:00
Simon Riggs
3428ef7911 Reverting 42b4b0b241
Buildfarm issues and other reported issues
2017-03-24 17:56:17 +00:00
Alvaro Herrera
7b504eb282 Implement multivariate n-distinct coefficients
Add support for explicitly declared statistic objects (CREATE
STATISTICS), allowing collection of statistics on more complex
combinations that individual table columns.  Companion commands DROP
STATISTICS and ALTER STATISTICS ... OWNER TO / SET SCHEMA / RENAME are
added too.  All this DDL has been designed so that more statistic types
can be added later on, such as multivariate most-common-values and
multivariate histograms between columns of a single table, leaving room
for permitting columns on multiple tables, too, as well as expressions.

This commit only adds support for collection of n-distinct coefficient
on user-specified sets of columns in a single table.  This is useful to
estimate number of distinct groups in GROUP BY and DISTINCT clauses;
estimation errors there can cause over-allocation of memory in hashed
aggregates, for instance, so it's a worthwhile problem to solve.  A new
special pseudo-type pg_ndistinct is used.

(num-distinct estimation was deemed sufficiently useful by itself that
this is worthwhile even if no further statistic types are added
immediately; so much so that another version of essentially the same
functionality was submitted by Kyotaro Horiguchi:
https://postgr.es/m/20150828.173334.114731693.horiguchi.kyotaro@lab.ntt.co.jp
though this commit does not use that code.)

Author: Tomas Vondra.  Some code rework by Álvaro.
Reviewed-by: Dean Rasheed, David Rowley, Kyotaro Horiguchi, Jeff Janes,
    Ideriha Takeshi
Discussion: https://postgr.es/m/543AFA15.4080608@fuzzy.cz
    https://postgr.es/m/20170320190220.ixlaueanxegqd5gr@alvherre.pgsql
2017-03-24 14:06:10 -03:00
Robert Haas
857ee8e391 Add a txid_status function.
If your connection to the database server is lost while a COMMIT is
in progress, it may be difficult to figure out whether the COMMIT was
successful or not.  This function will tell you, provided that you
don't wait too long to ask.  It may be useful in other situations,
too.

Craig Ringer, reviewed by Simon Riggs and by me

Discussion: http://postgr.es/m/CAMsr+YHQiWNEi0daCTboS40T+V5s_+dst3PYv_8v2wNVH+Xx4g@mail.gmail.com
2017-03-24 12:00:53 -04:00
Simon Riggs
42b4b0b241 Avoid SnapshotResetXmin() during AtEOXact_Snapshot()
For normal commits and aborts we already reset PgXact->xmin
Avoiding touching highly contented shmem improves concurrent
performance.

Simon Riggs

Discussion: CANP8+jJdXE9b+b9F8CQT-LuxxO0PBCB-SZFfMVAdp+akqo4zfg@mail.gmail.com
2017-03-24 14:20:59 +00:00
Heikki Linnakangas
7ac955b347 Allow SCRAM authentication, when pg_hba.conf says 'md5'.
If a user has a SCRAM verifier in pg_authid.rolpassword, there's no reason
we cannot attempt to perform SCRAM authentication instead of MD5. The worst
that can happen is that the client doesn't support SCRAM, and the
authentication will fail. But previously, it would fail for sure, because
we would not even try. SCRAM is strictly more secure than MD5, so there's
no harm in trying it. This allows for a more graceful transition from MD5
passwords to SCRAM, as user passwords can be changed to SCRAM verifiers
incrementally, without changing pg_hba.conf.

Refactor the code in auth.c to support that better. Notably, we now have to
look up the user's pg_authid entry before sending the password challenge,
also when performing MD5 authentication. Also simplify the concept of a
"doomed" authentication. Previously, if a user had a password, but it had
expired, we still performed SCRAM authentication (but always returned error
at the end) using the salt and iteration count from the expired password.
Now we construct a fake salt, like we do when the user doesn't have a
password or doesn't exist at all. That simplifies get_role_password(), and
we can don't need to distinguish the  "user has expired password", and
"user does not exist" cases in auth.c.

On second thoughts, also rename uaSASL to uaSCRAM. It refers to the
mechanism specified in pg_hba.conf, and while we use SASL for SCRAM
authentication at the protocol level, the mechanism should be called SCRAM,
not SASL. As a comparison, we have uaLDAP, even though it looks like the
plain 'password' authentication at the protocol level.

Discussion: https://www.postgresql.org/message-id/6425.1489506016@sss.pgh.pa.us
Reviewed-by: Michael Paquier
2017-03-24 13:32:21 +02:00
Teodor Sigaev
78874531ba Fix backup canceling
Assert-enabled build crashes but without asserts it works by wrong way:
it may not reset forcing full page write and preventing from starting
exclusive backup with the same name as cancelled.
Patch replaces pair of booleans
nonexclusive_backup_running/exclusive_backup_running to single enum to
correctly describe backup state.

Backpatch to 9.6 where bug was introduced

Reported-by: David Steele
Authors: Michael Paquier, David Steele
Reviewed-by: Anastasia Lubennikova

https://commitfest.postgresql.org/13/1068/
2017-03-24 13:53:40 +03:00
Tom Lane
457a444873 Avoid syntax error on platforms that have neither LOCALE_T nor ICU.
Buildfarm member anole sees this union as empty, and doesn't like it.
2017-03-23 23:18:52 -04:00
Robert Haas
c23b186ae9 Fix enum definition.
Commit 249cf070e3 assigned to one of
the labels in the middle the value that should have been assigned
to the first member of the enum.  Rushabh's patch didn't have that
defect as submitted, but I managed to mess it up while editing.
Repair.
2017-03-23 16:10:43 -04:00
Peter Eisentraut
eccfef81e1 ICU support
Add a column collprovider to pg_collation that determines which library
provides the collation data.  The existing choices are default and libc,
and this adds an icu choice, which uses the ICU4C library.

The pg_locale_t type is changed to a union that contains the
provider-specific locale handles.  Users of locale information are
changed to look into that struct for the appropriate handle to use.

Also add a collversion column that records the version of the collation
when it is created, and check at run time whether it is still the same.
This detects potentially incompatible library upgrades that can corrupt
indexes and other structures.  This is currently only supported by
ICU-provided collations.

initdb initializes the default collation set as before from the `locale
-a` output but also adds all available ICU locales with a "-x-icu"
appended.

Currently, ICU-provided collations can only be explicitly named
collations.  The global database locales are still always libc-provided.

ICU support is enabled by configure --with-icu.

Reviewed-by: Thomas Munro <thomas.munro@enterprisedb.com>
Reviewed-by: Andreas Karlsson <andreas@proxel.se>
2017-03-23 15:28:48 -04:00
Robert Haas
ea42cc18c3 Track the oldest XID that can be safely looked up in CLOG.
This provides infrastructure for looking up arbitrary, user-supplied
XIDs without a risk of scary-looking failures from within the clog
module.  Normally, the oldest XID that can be safely looked up in CLOG
is the same as the oldest XID that can reused without causing
wraparound, and the latter is already tracked.  However, while
truncation is in progress, the values are different, so we must
keep track of them separately.

Craig Ringer, reviewed by Simon Riggs and by me.

Discussion: http://postgr.es/m/CAMsr+YHQiWNEi0daCTboS40T+V5s_+dst3PYv_8v2wNVH+Xx4g@mail.gmail.com
2017-03-23 14:26:31 -04:00
Robert Haas
691b8d5928 Allow for parallel execution whenever ExecutorRun() is done only once.
Previously, it was unsafe to execute a plan in parallel if
ExecutorRun() might be called with a non-zero row count.  However,
it's quite easy to fix things up so that we can support that case,
provided that it is known that we will never call ExecutorRun() a
second time for the same QueryDesc.  Add infrastructure to signal
this, and cross-checks to make sure that a caller who claims this is
true doesn't later reneg.

While that pattern never happens with queries received directly from a
client -- there's no way to know whether multiple Execute messages
will be sent unless the first one requests all the rows -- it's pretty
common for queries originating from procedural languages, which often
limit the result to a single tuple or to a user-specified number of
tuples.

This commit doesn't actually enable parallelism in any additional
cases, because currently none of the places that would be able to
benefit from this infrastructure pass CURSOR_OPT_PARALLEL_OK in the
first place, but it makes it much more palatable to pass
CURSOR_OPT_PARALLEL_OK in places where we currently don't, because it
eliminates some cases where we'd end up having to run the parallel
plan serially.

Patch by me, based on some ideas from Rafia Sabih and corrected by
Rafia Sabih based on feedback from Dilip Kumar and myself.

Discussion: http://postgr.es/m/CA+TgmobXEhvHbJtWDuPZM9bVSLiTj-kShxQJ2uM5GPDze9fRYA@mail.gmail.com
2017-03-23 13:14:36 -04:00
Teodor Sigaev
218f51584d Reduce page locking in GIN vacuum
GIN vacuum during cleaning posting tree can lock this whole tree for a long
time with by holding LockBufferForCleanup() on root. Patch changes it with
two ways: first, cleanup lock will be taken only if there is an empty page
(which should be deleted) and, second, it tries to lock only subtree, not the
whole posting tree.

Author: Andrey Borodin with minor editorization by me
Reviewed-by: Jeff Davis, me

https://commitfest.postgresql.org/13/896/
2017-03-23 19:38:47 +03:00
Peter Eisentraut
73561013e5 Remove trailing comma from enum definition
Author: Petr Jelinek <petr.jelinek@2ndquadrant.com>
2017-03-23 11:58:11 -04:00
Peter Eisentraut
128e6ee01d Assorted compilation and test fixes
related to 7c4f52409a, per build farm

Author: Petr Jelinek <petr.jelinek@2ndquadrant.com>
2017-03-23 11:44:43 -04:00
Simon Riggs
6912acc04f Replication lag tracking for walsenders
Adds write_lag, flush_lag and replay_lag cols to pg_stat_replication.

Implements a lag tracker module that reports the lag times based upon
measurements of the time taken for recent WAL to be written, flushed and
replayed and for the sender to hear about it. These times
represent the commit lag that was (or would have been) introduced by each
synchronous commit level, if the remote server was configured as a
synchronous standby.  For an asynchronous standby, the replay_lag column
approximates the delay before recent transactions became visible to queries.
If the standby server has entirely caught up with the sending server and
there is no more WAL activity, the most recently measured lag times will
continue to be displayed for a short time and then show NULL.

Physical replication lag tracking is automatic. Logical replication tracking
is possible but is the responsibility of the logical decoding plugin.
Tracking is a private module operating within each walsender individually,
with values reported to shared memory. Module not used outside of walsender.

Design and code is good enough now to commit - kudos to the author.
In many ways a difficult topic, with important and subtle behaviour so this
shoudl be expected to generate discussion and multiple open items: Test now!

Author: Thomas Munro, following designs by Fujii Masao and Simon Riggs
Review: Simon Riggs, Ian Barwick and Craig Ringer
2017-03-23 14:05:28 +00:00
Peter Eisentraut
7c4f52409a Logical replication support for initial data copy
Add functionality for a new subscription to copy the initial data in the
tables and then sync with the ongoing apply process.

For the copying, add a new internal COPY option to have the COPY source
data provided by a callback function.  The initial data copy works on
the subscriber by receiving COPY data from the publisher and then
providing it locally into a COPY that writes to the destination table.

A WAL receiver can now execute full SQL commands.  This is used here to
obtain information about tables and publications.

Several new options were added to CREATE and ALTER SUBSCRIPTION to
control whether and when initial table syncing happens.

Change pg_dump option --no-create-subscription-slots to
--no-subscription-connect and use the new CREATE SUBSCRIPTION
... NOCONNECT option for that.

Author: Petr Jelinek <petr.jelinek@2ndquadrant.com>
Tested-by: Erik Rijkers <er@xs4all.nl>
2017-03-23 08:55:37 -04:00
Stephen Frost
017e4f2588 Expose waitforarchive option through pg_stop_backup()
Internally, we have supported the option to either wait for all of the
WAL associated with a backup to be archived, or to return immediately.
This option is useful to users of pg_stop_backup() as well, when they
are reading the stop backup record position and checking that the WAL
they need has been archived independently.

This patch adds an additional, optional, argument to pg_stop_backup()
which allows the user to indicate if they wish to wait for the WAL to be
archived or not.  The default matches current behavior, which is to
wait.

Author: David Steele, with some minor changes, doc updates by me.
Reviewed by: Takayuki Tsunakawa, Fujii Masao
Discussion: https://postgr.es/m/758e3fd1-45b4-5e28-75cd-e9e7f93a4c02@pgmasters.net
2017-03-22 23:44:58 -04:00
Magnus Hagander
6b76f1bb58 Support multiple RADIUS servers
This changes all the RADIUS related parameters (radiusserver,
radiussecret, radiusport, radiusidentifier) to be plural and to accept a
comma separated list of servers, which will be tried in order.

Reviewed by Adam Brightwell
2017-03-22 18:11:08 +01:00
Simon Riggs
af4b1a0869 Refactor GetOldestXmin() to use flags
Replace ignoreVacuum parameter with more flexible flags.

Author: Eiji Seki
Review: Haribabu Kommi
2017-03-22 16:51:01 +00:00
Andrew Dunstan
96a7128b7b Sync pg_dump and pg_dumpall output
Before exiting any files are fsync'ed. A --no-sync option is also
provided for a faster exit if desired.

Michael Paquier.

Reviewed by Albe Laurenz

Discussion: https://postgr.es/m/CAB7nPqS1uZ=Ov+UruW6jr3vB-S_DLVMPc0dQpV-fTDjmm0ZQMg@mail.gmail.com
2017-03-22 10:20:13 -04:00
Simon Riggs
9b013dc238 Improve performance of replay of AccessExclusiveLocks
A hot standby replica keeps a list of Access Exclusive locks for a top
level transaction. These locks are released when the top level transaction
ends. Searching of this list is O(N^2), and each transaction had to pay the
price of searching this list for locks, even if it didn't take any AE
locks itself.

This patch optimizes this case by having the master server track which
transactions took AE locks, and passes that along to the standby server in
the commit/abort record. This allows the standby to only try to release
locks for transactions which actually took any, avoiding the majority of
the performance issue.

Refactor MyXactAccessedTempRel into MyXactFlags to allow minimal additional
cruft with this.

Analysis and initial patch by David Rowley
Author: David Rowley and Simon Riggs
2017-03-22 13:09:36 +00:00
Simon Riggs
1148e22a82 Teach xlogreader to follow timeline switches
Uses page-based mechanism to ensure we’re using the correct timeline.

Tests are included to exercise the functionality using a cold disk-level copy
of the master that's started up as a replica with slots intact, but the
intended use of the functionality is with later features.

Craig Ringer, reviewed by Simon Riggs and Andres Freund
2017-03-22 07:05:12 +00:00
Robert Haas
d3cc37f1d8 Don't scan partitioned tables.
Partitioned tables do not contain any data; only their unpartitioned
descendents need to be scanned.  However, the partitioned tables still
need to be locked, even though they're not scanned.  To make that
work, Append and MergeAppend relations now need to carry a list of
(unscanned) partitioned relations that must be locked, and InitPlan
must lock all partitioned result relations.

Aside from the obvious advantage of avoiding some work at execution
time, this has two other advantages.  First, it may improve the
planner's decision-making in some cases since the empty relation
might throw things off.  Second, it paves the way to getting rid of
the storage for partitioned tables altogether.

Amit Langote, reviewed by me.

Discussion: http://postgr.es/m/6837c359-45c4-8044-34d1-736756335a15@lab.ntt.co.jp
2017-03-21 09:48:04 -04:00
Andrew Dunstan
29bf501683 Add a direct function call mechanism using the caller's context.
The current DirectFunctionCall functions use NULL as the flinfo in
initializing the FunctionCallInfoData for the call. That means the
called function has no fn_mcxt or fn_extra to work with, and attempting
to do so will result in an access violation. These functions instead use
the provided flinfo, which will usually be the caller's own flinfo. The
caller needs to ensure that it doesn't use the fn_extra in way that is
incompatible with the way the called function will use it. The called
function should not rely on anything else in the provided context, as it
will be relevant to the caller, not the callee.

Original code from Tom Lane.

Discussion: https://postgr.es/m/db2b70a4-78d7-294a-a315-8e7f506c5978@2ndQuadrant.com
2017-03-21 08:57:46 -04:00
Andrew Dunstan
b6fb534f10 Add IF NOT EXISTS for CREATE SERVER and CREATE USER MAPPING
There is still some inconsistency with the error messages surrounding
foreign servers. Some use the word "foreign" and some don't. My
inclination is to remove all such uses of "foreign" on the basis that
the  CREATE/ALTER/DROP SERVER commands don't use the word. However, that
is left for another day. In this patch I have kept to the existing usage
in the affected commands, which omits "foreign".

Anastasia Lubennikova, reviewed by Arthur Zakirov and Ashtosh Bapat.

Discussion: http://postgr.es/m/7c2ab9b8-388a-1ce0-23a3-7acf2a0ed3c6@postgrespro.ru
2017-03-20 16:40:45 -04:00
Robert Haas
953477ca35 Fixes for single-page hash index vacuum.
Clear LH_PAGE_HAS_DEAD_TUPLES during replay, similar to what gets done
for btree.  Update hashdesc.c for xl_hash_vacuum_one_page.

Oversights in commit 6977b8b7f4 spotted
by Amit Kapila.  Patch by Ashutosh Sharma.

Bump WAL version.  The original patch to make hash indexes write-ahead
logged probably should have done this, and the single page vacuuming
patch probably should have done it again, but better late than never.

Discussion: http://postgr.es/m/CAA4eK1Kd=mJ9xreovcsh0qMiAj-QqCphHVQ_Lfau1DR9oVjASQ@mail.gmail.com
2017-03-20 15:49:09 -04:00
Tom Lane
bc18126a6b Add configure test to see if the C compiler has gcc-style computed gotos.
We'll need this for the upcoming patch to speed up expression evaluation.
Might as well push it now to see if it behaves sanely in the buildfarm.

Andres Freund

Discussion: https://postgr.es/m/20170320062511.hp5qeurtxrwsvfxr@alap3.anarazel.de
2017-03-20 13:35:26 -04:00
Tom Lane
17f8ffa1e3 Fix REFRESH MATERIALIZED VIEW to report activity to the stats collector.
The non-concurrent code path for REFRESH MATERIALIZED VIEW failed to
report its updates to the stats collector.  This is bad since it means
auto-analyze doesn't know there's any work to be done.  Adjust it to
report the refresh as a table truncate followed by insertion of an
appropriate number of rows.

Since a matview could contain more than INT_MAX rows, change the
signature of pgstat_count_heap_insert() to accept an int64 rowcount.
(The accumulator it's adding into is already int64, but existing
callers could not insert more than a small number of rows at once,
so the argument had been declared just "int n".)

This is surely a bug fix, but changing pgstat_count_heap_insert()'s API
seems too risky for the back branches.  Given the lack of previous
complaints, I'm not sure it's a big enough problem to justify a kluge
solution that would avoid that.  So, no back-patch, at least for now.

Jim Mlodgenski, adjusted a bit by me

Discussion: https://postgr.es/m/CAB_5SRchSz7-WmdO5szdiknG8Oj_GGqJytrk1KRd11yhcMs1KQ@mail.gmail.com
2017-03-18 17:49:39 -04:00
Robert Haas
249cf070e3 Create and use wait events for read, write, and fsync operations.
Previous commits, notably 53be0b1add and
6f3bd98ebf, made it possible to see from
pg_stat_activity when a backend was stuck waiting for another backend,
but it's also fairly common for a backend to be stuck waiting for an
I/O.  Add wait events for those operations, too.

Rushabh Lathia, with further hacking by me.  Reviewed and tested by
Michael Paquier, Amit Kapila, Rajkumar Raghuwanshi, and Rahila Syed.

Discussion: http://postgr.es/m/CAGPqQf0LsYHXREPAZqYGVkDqHSyjf=KsD=k0GTVPAuzyThh-VQ@mail.gmail.com
2017-03-18 07:43:01 -04:00
Robert Haas
88e66d193f Rename "pg_clog" directory to "pg_xact".
Names containing the letters "log" sometimes confuse users into
believing that only non-critical data is present.  It is hoped
this renaming will discourage ill-considered removals of transaction
status data.

Michael Paquier

Discussion: http://postgr.es/m/CA+Tgmoa9xFQyjRZupbdEFuwUerFTvC6HjZq1ud6GYragGDFFgA@mail.gmail.com
2017-03-17 09:48:38 -04:00
Heikki Linnakangas
c6305a9c57 Allow plaintext 'password' authentication when user has a SCRAM verifier.
Oversight in the main SCRAM patch.
2017-03-17 11:33:27 +02:00