Commit graph

7685 commits

Author SHA1 Message Date
Robert Haas
ea69a0dead Expand hash indexes more gradually.
Since hash indexes typically have very few overflow pages, adding a
new splitpoint essentially doubles the on-disk size of the index,
which can lead to large and abrupt increases in disk usage (and
perhaps long delays on occasion).  To mitigate this problem to some
degree, divide larger splitpoints into four equal phases.  This means
that, for example, instead of growing from 4GB to 8GB all at once, a
hash index will now grow from 4GB to 5GB to 6GB to 7GB to 8GB, which
is perhaps still not as smooth as we'd like but certainly an
improvement.

This changes the on-disk format of the metapage, so bump HASH_VERSION
from 2 to 3.  This will force a REINDEX of all existing hash indexes,
but that's probably a good idea anyway.  First, hash indexes from
pre-10 versions of PostgreSQL could easily be corrupted, and we don't
want to confuse corruption carried over from an older release with any
corruption caused despite the new write-ahead logging in v10.  Second,
it will let us remove some backward-compatibility code added by commit
293e24e507.

Mithun Cy, reviewed by Amit Kapila, Jesper Pedersen and me.  Regression
test outputs updated by me.

Discussion: http://postgr.es/m/CAD__OuhG6F1gQLCgMQNnMNgoCvOLQZz9zKYJQNYvYmmJoM42gA@mail.gmail.com
Discussion: http://postgr.es/m/CA+TgmoYty0jCf-pa+m+vYUJ716+AxM7nv_syvyanyf5O-L_i2A@mail.gmail.com
2017-04-03 23:46:33 -04:00
Robert Haas
7a39b5e4d1 Abstract logic to allow for multiple kinds of child rels.
Currently, the only type of child relation is an "other member rel",
which is the child of a baserel, but in the future joins and even
upper relations may have child rels.  To facilitate that, introduce
macros that test to test for particular RelOptKind values, and use
them in various places where they help to clarify the sense of a test.
(For example, a test may allow RELOPT_OTHER_MEMBER_REL either because
it intends to allow child rels, or because it intends to allow simple
rels.)

Also, remove find_childrel_top_parent, which will not work for a
child rel that is not a baserel.  Instead, add a new RelOptInfo
member top_parent_relids to track the same kind of information in a
more generic manner.

Ashutosh Bapat, slightly tweaked by me.  Review and testing of the
patch set from which this was taken by Rajkumar Raghuwanshi and Rafia
Sabih.

Discussion: http://postgr.es/m/CA+TgmoagTnF2yqR3PT2rv=om=wJiZ4-A+ATwdnriTGku1CLYxA@mail.gmail.com
2017-04-03 22:41:31 -04:00
Peter Eisentraut
9fa6e08d4a Make header self-contained
Add necessary include files for things used in the header.
2017-04-03 16:17:45 -04:00
Tom Lane
f833c847b8 Allow psql variable substitution to occur in backtick command strings.
Previously, text between backquotes in a psql metacommand's arguments
was always passed to the shell literally.  That considerably hobbles
the usefulness of the feature for scripting, so we'd foreseen for a long
time that we'd someday want to allow substitution of psql variables into
the shell command.  IMO the addition of \if metacommands has brought us to
that point, since \if can greatly benefit from some sort of client-side
expression evaluation capability, and psql itself is not going to grow any
such thing in time for v10.  Hence, this patch.  It allows :VARIABLE to be
replaced by the exact contents of the named variable, while :'VARIABLE'
is replaced by the variable's contents suitably quoted to become a single
shell-command argument.  (The quoting rules for that are different from
those for SQL literals, so this is a bit of an abuse of the :'VARIABLE'
notation, but I doubt anyone will be confused.)

As with other situations in psql, no substitution occurs if the word
following a colon is not a known variable name.  That limits the risk of
compatibility problems for existing psql scripts; but the risk isn't zero,
so this needs to be called out in the v10 release notes.

Discussion: https://postgr.es/m/9561.1490895211@sss.pgh.pa.us
2017-04-01 21:44:54 -04:00
Kevin Grittner
41bd155dd6 Fix two undocumented parameters to functions from ENR patch.
On ProcessUtility document the parameter, to match others.

On CreateCachedPlan drop the queryEnv parameter.  It was not
referenced within the function, and had been added on the
assumption that with some unknown future usage of QueryEnvironment
it might be useful to do something there.  We have avoided other
"just in case" implementation of unused paramters, so drop it here.

Per gripe from Tom Lane
2017-04-01 15:21:05 -05:00
Alvaro Herrera
c655899ba9 BRIN de-summarization
When the BRIN summary tuple for a page range becomes too "wide" for the
values actually stored in the table (because the tuples that were
present originally are no longer present due to updates or deletes), it
can be useful to remove the outdated summary tuple, so that a future
summarization can install a tighter summary.

This commit introduces a SQL-callable interface to do so.

Author: Álvaro Herrera
Reviewed-by: Eiji Seki
Discussion: https://postgr.es/m/20170228045643.n2ri74ara4fhhfxf@alvherre.pgsql
2017-04-01 16:10:04 -03:00
Alvaro Herrera
7526e10224 BRIN auto-summarization
Previously, only VACUUM would cause a page range to get initially
summarized by BRIN indexes, which for some use cases takes too much time
since the inserts occur.  To avoid the delay, have brininsert request a
summarization run for the previous range as soon as the first tuple is
inserted into the first page of the next range.  Autovacuum is in charge
of processing these requests, after doing all the regular vacuuming/
analyzing work on tables.

This doesn't impose any new tasks on autovacuum, because autovacuum was
already in charge of doing summarizations.  The only actual effect is to
change the timing, i.e. that it occurs earlier.  For this reason, we
don't go any great lengths to record these requests very robustly; if
they are lost because of a server crash or restart, they will happen at
a later time anyway.

Most of the new code here is in autovacuum, which can now be told about
"work items" to process.  This can be used for other things such as GIN
pending list cleaning, perhaps visibility map bit setting, both of which
are currently invoked during vacuum, but do not really depend on vacuum
taking place.

The requests are at the page range level, a granularity for which we did
not have SQL-level access; we only had index-level summarization
requests via brin_summarize_new_values().  It seems reasonable to add
SQL-level access to range-level summarization too, so add a function
brin_summarize_range() to do that.

Authors: Álvaro Herrera, based on sketch from Simon Riggs.
Reviewed-by: Thomas Munro.
Discussion: https://postgr.es/m/20170301045823.vneqdqkmsd4as4ds@alvherre.pgsql
2017-04-01 14:00:53 -03:00
Kevin Grittner
18ce3a4ab2 Add infrastructure to support EphemeralNamedRelation references.
A QueryEnvironment concept is added, which allows new types of
objects to be passed into queries from parsing on through
execution.  At this point, the only thing implemented is a
collection of EphemeralNamedRelation objects -- relations which
can be referenced by name in queries, but do not exist in the
catalogs.  The only type of ENR implemented is NamedTuplestore, but
provision is made to add more types fairly easily.

An ENR can carry its own TupleDesc or reference a relation in the
catalogs by relid.

Although these features can be used without SPI, convenience
functions are added to SPI so that ENRs can easily be used by code
run through SPI.

The initial use of all this is going to be transition tables in
AFTER triggers, but that will be added to each PL as a separate
commit.

An incidental effect of this patch is to produce a more informative
error message if an attempt is made to modify the contents of a CTE
from a referencing DML statement.  No tests previously covered that
possibility, so one is added.

Kevin Grittner and Thomas Munro
Reviewed by Heikki Linnakangas, David Fetter, and Thomas Munro
with valuable comments and suggestions from many others
2017-03-31 23:17:18 -05:00
Robert Haas
7d8f6986b8 Fix parallel query so it doesn't spoil row estimates above Gather.
Commit 45be99f8cd removed GatherPath's
num_workers field, but this is entirely bogus.  Normally, a path's
parallel_workers flag is supposed to indicate the number of workers
that it wants, and should be 0 for a non-partial path.  In that
commit, I mistakenly thought that GatherPath could also use that field
to indicate the number of workers that it would try to start, but
that's disastrous, because then it can propagate up to higher nodes in
the plan tree, which will then get incorrect rowcounts because the
parallel_workers flag is involved in computing those values.  Repair
by putting the separate field back.

Report by Tomas Vondra.  Patch by me, reviewed by Amit Kapila.

Discussion: http://postgr.es/m/f91b4a44-f739-04bd-c4b6-f135bd643669@2ndquadrant.com
2017-03-31 21:01:20 -04:00
Robert Haas
2113ac4cbb Don't use bgw_main even to specify in-core bgworker entrypoints.
On EXEC_BACKEND builds, this can fail if ASLR is in use.

Backpatch to 9.5.  On master, completely remove the bgw_main field
completely, since there is no situation in which it is safe for an
EXEC_BACKEND build.  On 9.6 and 9.5, leave the field intact to avoid
breaking things for third-party code that doesn't care about working
under EXEC_BACKEND.  Prior to 9.5, there are no in-core bgworker
entrypoints.

Petr Jelinek, reviewed by me.

Discussion: http://postgr.es/m/09d8ad33-4287-a09b-a77f-77f8761adb5e@2ndquadrant.com
2017-03-31 20:43:32 -04:00
Robert Haas
c94e6942ce Don't allocate storage for partitioned tables.
Also, don't allow setting reloptions on them, since that would have no
effect given the lack of storage.  The patch does this by introducing
a new reloption kind for which there are currently no reloptions -- we
might have some in the future -- so it adjusts parseRelOptions to
handle that case correctly.

Bumped catversion.  System catalogs that contained reloptions for
partitioned tables are no longer valid; plus, there are now fewer
physical files on disk, which is not technically a catalog change but
still a good reason to re-initdb.

Amit Langote, reviewed by Maksim Milyutin and Kyotaro Horiguchi and
revised a bit by me.

Discussion: http://postgr.es/m/20170331.173326.212311140.horiguchi.kyotaro@lab.ntt.co.jp
2017-03-31 16:28:51 -04:00
Andrew Dunstan
e306df7f9c Full Text Search support for json and jsonb
The new functions are ts_headline() and to_tsvector.

Dmitry Dolgov, edited and documented by me.
2017-03-31 14:26:03 -04:00
Andrew Dunstan
c80b9920fc Transform or iterate over json(b) string values
Dmitry Dolgov, reviewed and lightly edited by me.
2017-03-31 14:25:25 -04:00
Simon Riggs
25fff40798 Default monitoring roles
Three nologin roles with non-overlapping privs are created by default
* pg_read_all_settings - read all GUCs.
* pg_read_all_stats - pg_stat_*, pg_database_size(), pg_tablespace_size()
* pg_stat_scan_tables - may lock/scan tables

Top level role - pg_monitor includes all of the above by default, plus others

Author: Dave Page
Reviewed-by: Stephen Frost, Robert Haas, Peter Eisentraut, Simon Riggs
2017-03-30 14:18:53 -04:00
Andres Freund
5ded4bd214 Remove support for version-0 calling conventions.
The V0 convention is failure prone because we've so far assumed that a
function is V0 if PG_FUNCTION_INFO_V1 is missing, leading to crashes
if a function was coded against the V1 interface.  V0 doesn't allow
proper NULL, SRF and toast handling.  V0 doesn't offer features that
V1 doesn't.

Thus remove V0 support and obsolete fmgr README contents relating to
it.

Author: Andres Freund, with contributions by Peter Eisentraut & Craig Ringer
Reviewed-By: Peter Eisentraut, Craig Ringer
Discussion: https://postgr.es/m/20161208213441.k3mbno4twhg2qf7g@alap3.anarazel.de
2017-03-30 06:25:46 -07:00
Teodor Sigaev
f90d23d0c5 Implement SortSupport for macaddr data type
Introduces a scheme to produce abbreviated keys for the macaddr type.
Bump catalog version.

Author: Brandur Leach
Reviewed-by: Julien Rouhaud, Peter Geoghegan

https://commitfest.postgresql.org/13/743/
2017-03-29 23:28:56 +03:00
Peter Eisentraut
4fdb8a82e3 Update copyright year in recently added files
Author: Masahiko Sawada <sawada.mshk@gmail.com>
2017-03-29 14:54:10 -04:00
Robert Haas
9a09527164 Mark more functions parallel-restricted.
Commit 61c2e1a95f allowed parallel
query to be used in more places, revealing via buildfarm member
mandrill that several functions intended to be called from triggers
were incorrectly marked parallel-safe rather than parallel-restricted.

Report by Tom Lane.  Patch by Rafia Sabih.  Reviewed by me.

Discussion: http://postgr.es/m/16061.1490479253@sss.pgh.pa.us
2017-03-29 10:59:27 -04:00
Peter Eisentraut
3582b223d4 Fix hardcoded typeof check result for Windows
The test result that I had blindly stipulated didn't work out on the
build farm, so disable the feature in Windows MSVC for now.
2017-03-29 08:55:34 -04:00
Peter Eisentraut
4cb824699e Cast result of copyObject() to correct type
copyObject() is declared to return void *, which allows easily assigning
the result independent of the input, but it loses all type checking.

If the compiler supports typeof or something similar, cast the result to
the input type.  This creates a greater amount of type safety.  In some
cases, where the result is assigned to a generic type such as Node * or
Expr *, new casts are now necessary, but in general casts are now
unnecessary in the normal case and indicate that something unusual is
happening.

Reviewed-by: Mark Dilger <hornschnorter@gmail.com>
2017-03-28 21:59:23 -04:00
Alvaro Herrera
ce96ce60ca Remove direct uses of ItemPointer.{ip_blkid,ip_posid}
There are no functional changes here; this simply encapsulates knowledge
of the ItemPointerData struct so that a future patch can change things
without more breakage.

All direct users of ip_blkid and ip_posid are changed to use existing
macros ItemPointerGetBlockNumber and ItemPointerGetOffsetNumber
respectively.  For callers where that's inappropriate (because they
Assert that the itempointer is is valid-looking), add
ItemPointerGetBlockNumberNoCheck and ItemPointerGetOffsetNumberNoCheck,
which lack the assertion but are otherwise identical.

Author: Pavan Deolasee
Discussion: https://postgr.es/m/CABOikdNnFon4cJiL=h1mZH3bgUeU+sWHuU4Yr8AB=j3A2p1GiA@mail.gmail.com
2017-03-28 19:02:23 -03:00
Teodor Sigaev
ab89e465cb Altering default privileges on schemas
Extend ALTER DEFAULT PRIVILEGES command to schemas.

Author: Matheus Oliveira
Reviewed-by: Petr Jelínek, Ashutosh Sharma

https://commitfest.postgresql.org/13/887/
2017-03-28 18:58:55 +03:00
Simon Riggs
ff539da316 Cleanup slots during drop database
Automatically drop all logical replication slots associated with a
database when the database is dropped. Previously we threw an ERROR
if a slot existed. Now we throw ERROR only if a slot is active in
the database being dropped.

Craig Ringer
2017-03-28 10:05:21 -04:00
Alvaro Herrera
6462238f0d Fix uninitialized memory propagation mistakes
Valgrind complains that some uninitialized bytes are being passed around
by the extended statistics code since commit 7b504eb282, as reported
by Andres Freund.  Silence it.

Tomas Vondra submitted a patch which he verified to fix the complaints
in his machine; however I messed with it a bit before pushing, so any
remaining problems are likely my (Álvaro's) fault.

Author: Tomas Vondra
Discussion: https://postgr.es/m/20170325211031.4xxoptigqxm2emn2@alap3.anarazel.de
2017-03-27 14:52:19 -03:00
Robert Haas
c4c51541e2 Still more code review for single-page hash vacuuming.
Most seriously, fix use of incorrect block ID, per a report from
Jeff Janes that it causes a crash and a diagnosis from Amit Kapila.

Improve consistency between the hash and btree versions of this
code by adding back a PANIC that btree has, and by registering
data in the xlog record in the same way, per complaints from
Jeff Janes and Amit Kapila.

Tidy up some minor cosmetic points, per complaints from Amit
Kapila.

Patch by Ashutosh Sharma, reviewed by Amit Kapila, and tested by
Jeff Janes.

Discussion: http://postgr.es/m/CAMkU=1w-9Qe=Ff1o6bSaXpNO9wqpo7_9GL8_CVhw4BoVVHasqg@mail.gmail.com
2017-03-27 12:51:10 -04:00
Teodor Sigaev
1b02be21f2 Fsync directory after creating or unlinking file.
If file was created/deleted just before powerloss it's possible that
file system will miss that. To prevent it, call fsync() where creating/
unlinkg file is critical.

Author: Michael Paquier
Reviewed-by: Ashutosh Bapat, Takayuki Tsunakawa, me
2017-03-27 19:33:01 +03:00
Andrew Gierth
b5635948ab Support hashed aggregation with grouping sets.
This extends the Aggregate node with two new features: HashAggregate
can now run multiple hashtables concurrently, and a new strategy
MixedAggregate populates hashtables while doing sorted grouping.

The planner will now attempt to save as many sorts as possible when
planning grouping sets queries, while not exceeding work_mem for the
estimated combined sizes of all hashtables used.  No SQL-level changes
are required.  There should be no user-visible impact other than the
new EXPLAIN output and possible changes to result ordering when ORDER
BY was not used (which affected a few regression tests).  The
enable_hashagg option is respected.

Author: Andrew Gierth
Reviewers: Mark Dilger, Andres Freund
Discussion: https://postgr.es/m/87vatszyhj.fsf@news-spur.riddles.org.uk
2017-03-27 04:20:54 +01:00
Robert Haas
fc70a4b0df Show more processes in pg_stat_activity.
Previously, auxiliary processes and background workers not connected
to a database (such as the logical replication launcher) weren't
shown.  Include them, so that we can see the associated wait state
information.  Add a new column to identify the processes type, so that
people can filter them out easily using SQL if they wish.

Before this patch was written, there was discussion about whether we
should expose this information in a separate view, so as to avoid
contaminating pg_stat_activity with things people might not want to
see.  But putting everything in pg_stat_activity was a more popular
choice, so that's what the patch does.

Kuntal Ghosh, reviewed by Amit Langote and Michael Paquier.  Some
revisions and bug fixes by me.

Discussion: http://postgr.es/m/CA+TgmoYES5nhkEGw9nZXU8_FhA8XEm8NTm3-SO+3ML1B81Hkww@mail.gmail.com
2017-03-26 22:02:22 -04:00
Tom Lane
2f0903ea19 Improve performance of ExecEvalWholeRowVar.
In commit b8d7f053c, we needed to fix ExecEvalWholeRowVar to not change
the state of the slot it's copying.  The initial quick hack at that
required two rounds of tuple construction, which is not very nice.
To fix, add another primitive to tuptoaster.c that does precisely what
we need.  (I initially tried to do this by refactoring one of the
existing functions into two pieces; but it looked like that might hurt
performance for the existing case, and the amount of code that could
be shared is not very large, so I gave up on that.)

Discussion: https://postgr.es/m/26088.1490315792@sss.pgh.pa.us
2017-03-26 19:14:57 -04:00
Peter Eisentraut
895f93701f Fix cpluspluscheck warning
Structure tag cannot be the same as a typedef that is a pointer to that
struct.
2017-03-26 18:31:05 -04:00
Tom Lane
5459cfd3ad Fix typos in logical replication support for initial data copy.
Fix an incorrect assert condition (noted by Coverity), and spell the new
name of the function correctly.  Typos introduced in commit 7c4f52409.

Michael Paquier
2017-03-26 17:44:35 -04:00
Andres Freund
b8d7f053c5 Faster expression evaluation and targetlist projection.
This replaces the old, recursive tree-walk based evaluation, with
non-recursive, opcode dispatch based, expression evaluation.
Projection is now implemented as part of expression evaluation.

This both leads to significant performance improvements, and makes
future just-in-time compilation of expressions easier.

The speed gains primarily come from:
- non-recursive implementation reduces stack usage / overhead
- simple sub-expressions are implemented with a single jump, without
  function calls
- sharing some state between different sub-expressions
- reduced amount of indirect/hard to predict memory accesses by laying
  out operation metadata sequentially; including the avoidance of
  nearly all of the previously used linked lists
- more code has been moved to expression initialization, avoiding
  constant re-checks at evaluation time

Future just-in-time compilation (JIT) has become easier, as
demonstrated by released patches intended to be merged in a later
release, for primarily two reasons: Firstly, due to a stricter split
between expression initialization and evaluation, less code has to be
handled by the JIT. Secondly, due to the non-recursive nature of the
generated "instructions", less performance-critical code-paths can
easily be shared between interpreted and compiled evaluation.

The new framework allows for significant future optimizations. E.g.:
- basic infrastructure for to later reduce the per executor-startup
  overhead of expression evaluation, by caching state in prepared
  statements.  That'd be helpful in OLTPish scenarios where
  initialization overhead is measurable.
- optimizing the generated "code". A number of proposals for potential
  work has already been made.
- optimizing the interpreter. Similarly a number of proposals have
  been made here too.

The move of logic into the expression initialization step leads to some
backward-incompatible changes:
- Function permission checks are now done during expression
  initialization, whereas previously they were done during
  execution. In edge cases this can lead to errors being raised that
  previously wouldn't have been, e.g. a NULL array being coerced to a
  different array type previously didn't perform checks.
- The set of domain constraints to be checked, is now evaluated once
  during expression initialization, previously it was re-built
  every time a domain check was evaluated. For normal queries this
  doesn't change much, but e.g. for plpgsql functions, which caches
  ExprStates, the old set could stick around longer.  The behavior
  around might still change.

Author: Andres Freund, with significant changes by Tom Lane,
	changes by Heikki Linnakangas
Reviewed-By: Tom Lane, Heikki Linnakangas
Discussion: https://postgr.es/m/20161206034955.bh33paeralxbtluv@alap3.anarazel.de
2017-03-25 14:52:06 -07:00
Simon Riggs
5737c12df0 Report catalog_xmin separately in hot_standby_feedback
If the upstream walsender is using a physical replication slot, store the
catalog_xmin in the slot's catalog_xmin field. If the upstream doesn't use a
slot and has only a PGPROC entry behaviour doesn't change, as we store the
combined xmin and catalog_xmin in the PGPROC entry.

Author: Craig Ringer
2017-03-25 14:07:27 +00:00
Peter Eisentraut
7678fe1c6d Make header self-contained
Add necessary include files for things used in the header.
2017-03-24 23:36:10 -04:00
Simon Riggs
3428ef7911 Reverting 42b4b0b241
Buildfarm issues and other reported issues
2017-03-24 17:56:17 +00:00
Alvaro Herrera
7b504eb282 Implement multivariate n-distinct coefficients
Add support for explicitly declared statistic objects (CREATE
STATISTICS), allowing collection of statistics on more complex
combinations that individual table columns.  Companion commands DROP
STATISTICS and ALTER STATISTICS ... OWNER TO / SET SCHEMA / RENAME are
added too.  All this DDL has been designed so that more statistic types
can be added later on, such as multivariate most-common-values and
multivariate histograms between columns of a single table, leaving room
for permitting columns on multiple tables, too, as well as expressions.

This commit only adds support for collection of n-distinct coefficient
on user-specified sets of columns in a single table.  This is useful to
estimate number of distinct groups in GROUP BY and DISTINCT clauses;
estimation errors there can cause over-allocation of memory in hashed
aggregates, for instance, so it's a worthwhile problem to solve.  A new
special pseudo-type pg_ndistinct is used.

(num-distinct estimation was deemed sufficiently useful by itself that
this is worthwhile even if no further statistic types are added
immediately; so much so that another version of essentially the same
functionality was submitted by Kyotaro Horiguchi:
https://postgr.es/m/20150828.173334.114731693.horiguchi.kyotaro@lab.ntt.co.jp
though this commit does not use that code.)

Author: Tomas Vondra.  Some code rework by Álvaro.
Reviewed-by: Dean Rasheed, David Rowley, Kyotaro Horiguchi, Jeff Janes,
    Ideriha Takeshi
Discussion: https://postgr.es/m/543AFA15.4080608@fuzzy.cz
    https://postgr.es/m/20170320190220.ixlaueanxegqd5gr@alvherre.pgsql
2017-03-24 14:06:10 -03:00
Robert Haas
857ee8e391 Add a txid_status function.
If your connection to the database server is lost while a COMMIT is
in progress, it may be difficult to figure out whether the COMMIT was
successful or not.  This function will tell you, provided that you
don't wait too long to ask.  It may be useful in other situations,
too.

Craig Ringer, reviewed by Simon Riggs and by me

Discussion: http://postgr.es/m/CAMsr+YHQiWNEi0daCTboS40T+V5s_+dst3PYv_8v2wNVH+Xx4g@mail.gmail.com
2017-03-24 12:00:53 -04:00
Simon Riggs
42b4b0b241 Avoid SnapshotResetXmin() during AtEOXact_Snapshot()
For normal commits and aborts we already reset PgXact->xmin
Avoiding touching highly contented shmem improves concurrent
performance.

Simon Riggs

Discussion: CANP8+jJdXE9b+b9F8CQT-LuxxO0PBCB-SZFfMVAdp+akqo4zfg@mail.gmail.com
2017-03-24 14:20:59 +00:00
Heikki Linnakangas
7ac955b347 Allow SCRAM authentication, when pg_hba.conf says 'md5'.
If a user has a SCRAM verifier in pg_authid.rolpassword, there's no reason
we cannot attempt to perform SCRAM authentication instead of MD5. The worst
that can happen is that the client doesn't support SCRAM, and the
authentication will fail. But previously, it would fail for sure, because
we would not even try. SCRAM is strictly more secure than MD5, so there's
no harm in trying it. This allows for a more graceful transition from MD5
passwords to SCRAM, as user passwords can be changed to SCRAM verifiers
incrementally, without changing pg_hba.conf.

Refactor the code in auth.c to support that better. Notably, we now have to
look up the user's pg_authid entry before sending the password challenge,
also when performing MD5 authentication. Also simplify the concept of a
"doomed" authentication. Previously, if a user had a password, but it had
expired, we still performed SCRAM authentication (but always returned error
at the end) using the salt and iteration count from the expired password.
Now we construct a fake salt, like we do when the user doesn't have a
password or doesn't exist at all. That simplifies get_role_password(), and
we can don't need to distinguish the  "user has expired password", and
"user does not exist" cases in auth.c.

On second thoughts, also rename uaSASL to uaSCRAM. It refers to the
mechanism specified in pg_hba.conf, and while we use SASL for SCRAM
authentication at the protocol level, the mechanism should be called SCRAM,
not SASL. As a comparison, we have uaLDAP, even though it looks like the
plain 'password' authentication at the protocol level.

Discussion: https://www.postgresql.org/message-id/6425.1489506016@sss.pgh.pa.us
Reviewed-by: Michael Paquier
2017-03-24 13:32:21 +02:00
Teodor Sigaev
78874531ba Fix backup canceling
Assert-enabled build crashes but without asserts it works by wrong way:
it may not reset forcing full page write and preventing from starting
exclusive backup with the same name as cancelled.
Patch replaces pair of booleans
nonexclusive_backup_running/exclusive_backup_running to single enum to
correctly describe backup state.

Backpatch to 9.6 where bug was introduced

Reported-by: David Steele
Authors: Michael Paquier, David Steele
Reviewed-by: Anastasia Lubennikova

https://commitfest.postgresql.org/13/1068/
2017-03-24 13:53:40 +03:00
Tom Lane
457a444873 Avoid syntax error on platforms that have neither LOCALE_T nor ICU.
Buildfarm member anole sees this union as empty, and doesn't like it.
2017-03-23 23:18:52 -04:00
Robert Haas
c23b186ae9 Fix enum definition.
Commit 249cf070e3 assigned to one of
the labels in the middle the value that should have been assigned
to the first member of the enum.  Rushabh's patch didn't have that
defect as submitted, but I managed to mess it up while editing.
Repair.
2017-03-23 16:10:43 -04:00
Peter Eisentraut
eccfef81e1 ICU support
Add a column collprovider to pg_collation that determines which library
provides the collation data.  The existing choices are default and libc,
and this adds an icu choice, which uses the ICU4C library.

The pg_locale_t type is changed to a union that contains the
provider-specific locale handles.  Users of locale information are
changed to look into that struct for the appropriate handle to use.

Also add a collversion column that records the version of the collation
when it is created, and check at run time whether it is still the same.
This detects potentially incompatible library upgrades that can corrupt
indexes and other structures.  This is currently only supported by
ICU-provided collations.

initdb initializes the default collation set as before from the `locale
-a` output but also adds all available ICU locales with a "-x-icu"
appended.

Currently, ICU-provided collations can only be explicitly named
collations.  The global database locales are still always libc-provided.

ICU support is enabled by configure --with-icu.

Reviewed-by: Thomas Munro <thomas.munro@enterprisedb.com>
Reviewed-by: Andreas Karlsson <andreas@proxel.se>
2017-03-23 15:28:48 -04:00
Robert Haas
ea42cc18c3 Track the oldest XID that can be safely looked up in CLOG.
This provides infrastructure for looking up arbitrary, user-supplied
XIDs without a risk of scary-looking failures from within the clog
module.  Normally, the oldest XID that can be safely looked up in CLOG
is the same as the oldest XID that can reused without causing
wraparound, and the latter is already tracked.  However, while
truncation is in progress, the values are different, so we must
keep track of them separately.

Craig Ringer, reviewed by Simon Riggs and by me.

Discussion: http://postgr.es/m/CAMsr+YHQiWNEi0daCTboS40T+V5s_+dst3PYv_8v2wNVH+Xx4g@mail.gmail.com
2017-03-23 14:26:31 -04:00
Robert Haas
691b8d5928 Allow for parallel execution whenever ExecutorRun() is done only once.
Previously, it was unsafe to execute a plan in parallel if
ExecutorRun() might be called with a non-zero row count.  However,
it's quite easy to fix things up so that we can support that case,
provided that it is known that we will never call ExecutorRun() a
second time for the same QueryDesc.  Add infrastructure to signal
this, and cross-checks to make sure that a caller who claims this is
true doesn't later reneg.

While that pattern never happens with queries received directly from a
client -- there's no way to know whether multiple Execute messages
will be sent unless the first one requests all the rows -- it's pretty
common for queries originating from procedural languages, which often
limit the result to a single tuple or to a user-specified number of
tuples.

This commit doesn't actually enable parallelism in any additional
cases, because currently none of the places that would be able to
benefit from this infrastructure pass CURSOR_OPT_PARALLEL_OK in the
first place, but it makes it much more palatable to pass
CURSOR_OPT_PARALLEL_OK in places where we currently don't, because it
eliminates some cases where we'd end up having to run the parallel
plan serially.

Patch by me, based on some ideas from Rafia Sabih and corrected by
Rafia Sabih based on feedback from Dilip Kumar and myself.

Discussion: http://postgr.es/m/CA+TgmobXEhvHbJtWDuPZM9bVSLiTj-kShxQJ2uM5GPDze9fRYA@mail.gmail.com
2017-03-23 13:14:36 -04:00
Teodor Sigaev
218f51584d Reduce page locking in GIN vacuum
GIN vacuum during cleaning posting tree can lock this whole tree for a long
time with by holding LockBufferForCleanup() on root. Patch changes it with
two ways: first, cleanup lock will be taken only if there is an empty page
(which should be deleted) and, second, it tries to lock only subtree, not the
whole posting tree.

Author: Andrey Borodin with minor editorization by me
Reviewed-by: Jeff Davis, me

https://commitfest.postgresql.org/13/896/
2017-03-23 19:38:47 +03:00
Peter Eisentraut
73561013e5 Remove trailing comma from enum definition
Author: Petr Jelinek <petr.jelinek@2ndquadrant.com>
2017-03-23 11:58:11 -04:00
Peter Eisentraut
128e6ee01d Assorted compilation and test fixes
related to 7c4f52409a, per build farm

Author: Petr Jelinek <petr.jelinek@2ndquadrant.com>
2017-03-23 11:44:43 -04:00
Simon Riggs
6912acc04f Replication lag tracking for walsenders
Adds write_lag, flush_lag and replay_lag cols to pg_stat_replication.

Implements a lag tracker module that reports the lag times based upon
measurements of the time taken for recent WAL to be written, flushed and
replayed and for the sender to hear about it. These times
represent the commit lag that was (or would have been) introduced by each
synchronous commit level, if the remote server was configured as a
synchronous standby.  For an asynchronous standby, the replay_lag column
approximates the delay before recent transactions became visible to queries.
If the standby server has entirely caught up with the sending server and
there is no more WAL activity, the most recently measured lag times will
continue to be displayed for a short time and then show NULL.

Physical replication lag tracking is automatic. Logical replication tracking
is possible but is the responsibility of the logical decoding plugin.
Tracking is a private module operating within each walsender individually,
with values reported to shared memory. Module not used outside of walsender.

Design and code is good enough now to commit - kudos to the author.
In many ways a difficult topic, with important and subtle behaviour so this
shoudl be expected to generate discussion and multiple open items: Test now!

Author: Thomas Munro, following designs by Fujii Masao and Simon Riggs
Review: Simon Riggs, Ian Barwick and Craig Ringer
2017-03-23 14:05:28 +00:00
Peter Eisentraut
7c4f52409a Logical replication support for initial data copy
Add functionality for a new subscription to copy the initial data in the
tables and then sync with the ongoing apply process.

For the copying, add a new internal COPY option to have the COPY source
data provided by a callback function.  The initial data copy works on
the subscriber by receiving COPY data from the publisher and then
providing it locally into a COPY that writes to the destination table.

A WAL receiver can now execute full SQL commands.  This is used here to
obtain information about tables and publications.

Several new options were added to CREATE and ALTER SUBSCRIPTION to
control whether and when initial table syncing happens.

Change pg_dump option --no-create-subscription-slots to
--no-subscription-connect and use the new CREATE SUBSCRIPTION
... NOCONNECT option for that.

Author: Petr Jelinek <petr.jelinek@2ndquadrant.com>
Tested-by: Erik Rijkers <er@xs4all.nl>
2017-03-23 08:55:37 -04:00