Commit graph

17473 commits

Author SHA1 Message Date
Tom Lane
fdf87ed451 Fix failure-to-copy bug in commit 6f6b99d13.
The previous coding of get_qual_for_list() was careful to copy everything
it was using from the input data structure.  The new version missed
making a copy of pass-by-ref datum values that it's inserting into Consts.
This is not optional, however, as revealed by buildfarm failures on
machines running -DRELCACHE_FORCE_RELEASE: we're copying from a relcache
entry that could go away before the required lifespan of our output
expression.  I'm pretty sure -DCLOBBER_CACHE_ALWAYS machines won't like
this either, but none of them have reported in yet.
2017-09-08 20:45:31 -04:00
Tom Lane
e56dd7cf50 Fix uninitialized-variable bug.
map_partition_varattnos() failed to set its found_whole_row output
parameter if the given expression list was NIL.  This seems to be
a pre-existing bug that chanced to be exposed by commit 6f6b99d13.
It might be unreachable in v10, but I have little faith in that
proposition, so back-patch.

Per buildfarm.
2017-09-08 19:04:32 -04:00
Robert Haas
6f6b99d133 Allow a partitioned table to have a default partition.
Any tuples that don't route to any other partition will route to the
default partition.

Jeevan Ladhe, Beena Emerson, Ashutosh Bapat, Rahila Syed, and Robert
Haas, with review and testing at various stages by (at least) Rushabh
Lathia, Keith Fiske, Amit Langote, Amul Sul, Rajkumar Raghuanshi, Sven
Kunze, Kyotaro Horiguchi, Thom Brown, Rafia Sabih, and Dilip Kumar.

Discussion: http://postgr.es/m/CAH2L28tbN4SYyhS7YV1YBWcitkqbhSWfQCy0G=apRcC_PEO-bg@mail.gmail.com
Discussion: http://postgr.es/m/CAOG9ApEYj34fWMcvBMBQ-YtqR9fTdXhdN82QEKG0SVZ6zeL1xg@mail.gmail.com
2017-09-08 17:28:04 -04:00
Tom Lane
3cf17c9d47 Remove mention of password_encryption = plain in postgresql.conf.sample.
Evidently missed in commit eb61136dc.

Spotted by Oleg Bartunov.

Discussion: https://postgr.es/m/CAF4Au4wz_iK5r4fnTnnd8XqioAZQs-P7-VsEAfivW34zMVpAmw@mail.gmail.com
2017-09-08 14:38:54 -04:00
Robert Haas
f0a0c17c1b Refactor get_partition_for_tuple a bit.
Pending patches for both default partitioning and hash partitioning
find the current coding pattern to be inconvenient.  Change it so that
we switch on the partitioning method first and then do whatever is
needed.

Amul Sul, reviewed by Jeevan Ladhe, with a few adjustments by me.

Discussion: http://postgr.es/m/CAAJ_b97mTb=dG2pv6+1ougxEVZFVnZJajW+0QHj46mEE7WsoOQ@mail.gmail.com
Discussion: http://postgr.es/m/CAOgcT0M37CAztEinpvjJc18EdHfm23fw0EG9-36Ya=+rEFUqaQ@mail.gmail.com
2017-09-07 21:13:42 -04:00
Tom Lane
3ca930fc39 Improve performance of get_actual_variable_range with recently-dead tuples.
In commit fccebe421, we hacked get_actual_variable_range() to scan the
index with SnapshotDirty, so that if there are many uncommitted tuples
at the end of the index range, it wouldn't laboriously scan through all
of them looking for a live value to return.  However, that didn't fix it
for the case of many recently-dead tuples at the end of the index;
SnapshotDirty recognizes those as committed dead and so we're back to
the same problem.

To improve the situation, invent a "SnapshotNonVacuumable" snapshot type
and use that instead.  The reason this helps is that, if the snapshot
rejects a given index entry, we know that the indexscan will mark that
index entry as killed.  This means the next get_actual_variable_range()
scan will proceed past that entry without visiting the heap, making the
scan a lot faster.  We may end up accepting a recently-dead tuple as
being the estimated extremal value, but that doesn't seem much worse than
the compromise we made before to accept not-yet-committed extremal values.

The cost of the scan is still proportional to the number of dead index
entries at the end of the range, so in the interval after a mass delete
but before VACUUM's cleaned up the mess, it's still possible for
get_actual_variable_range() to take a noticeable amount of time, if you've
got enough such dead entries.  But the constant factor is much much better
than before, since all we need to do with each index entry is test its
"killed" bit.

We chose to back-patch commit fccebe421 at the time, but I'm hesitant to
do so here, because this form of the problem seems to affect many fewer
people.  Also, even when it happens, it's less bad than the case fixed
by commit fccebe421 because we don't get the contention effects from
expensive TransactionIdIsInProgress tests.

Dmitriy Sarafannikov, reviewed by Andrey Borodin

Discussion: https://postgr.es/m/05C72CF7-B5F6-4DB9-8A09-5AC897653113@yandex.ru
2017-09-07 19:41:51 -04:00
Peter Eisentraut
1356f78ea9 Reduce excessive dereferencing of function pointers
It is equivalent in ANSI C to write (*funcptr) () and funcptr().  These
two styles have been applied inconsistently.  After discussion, we'll
use the more verbose style for plain function pointer variables, to make
it clear that it's a variable, and the shorter style when the function
pointer is in a struct (s.func() or s->func()), because then it's clear
that it's not a plain function name, and otherwise the excessive
punctuation makes some of those invocations hard to read.

Discussion: https://www.postgresql.org/message-id/f52c16db-14ed-757d-4b48-7ef360b1631d@2ndquadrant.com
2017-09-07 13:56:09 -04:00
Robert Haas
9d71323dac Even if some partitions are foreign, allow tuple routing.
This doesn't allow routing tuple to the foreign partitions themselves,
but it permits tuples to be routed to regular partitions despite the
presence of foreign partitions in the same inheritance hierarchy.

Etsuro Fujita, reviewed by Amit Langote and by me.

Discussion: http://postgr.es/m/bc3db4c1-1693-3b8a-559f-33ad2b50b7ad@lab.ntt.co.jp
2017-09-07 10:58:21 -04:00
Tom Lane
6eb52da394 Fix handling of savepoint commands within multi-statement Query strings.
Issuing a savepoint-related command in a Query message that contains
multiple SQL statements led to a FATAL exit with a complaint about
"unexpected state STARTED".  This is a shortcoming of commit 4f896dac1,
which attempted to prevent such misbehaviors in multi-statement strings;
its quick hack of marking the individual statements as "not top-level"
does the wrong thing in this case, and isn't a very accurate description
of the situation anyway.

To fix, let's introduce into xact.c an explicit model of what happens for
multi-statement Query strings.  This is an "implicit transaction block
in progress" state, which for many purposes works like the normal
TBLOCK_INPROGRESS state --- in particular, IsTransactionBlock returns true,
causing the desired result that PreventTransactionChain will throw error.
But in case of error abort it works like TBLOCK_STARTED, allowing the
transaction to be cancelled without need for an explicit ROLLBACK command.

Commit 4f896dac1 is reverted in toto, so that we go back to treating the
individual statements as "top level".  We could have left it as-is, but
this allows sharpening the error message for PreventTransactionChain
calls inside functions.

Except for getting a normal error instead of a FATAL exit for savepoint
commands, this patch should result in no user-visible behavioral change
(other than that one error message rewording).  There are some things
we might want to do in the line of changing the appearance or wording of
error and warning messages around this behavior, which would be much
simpler to do now that it's an explicitly modeled state.  But I haven't
done them here.

Although this fixes a long-standing bug, no backpatch.  The consequences
of the bug don't seem severe enough to justify the risk that this commit
itself creates some new issue.

Patch by me, but it owes something to previous investigation by
Takayuki Tsunakawa, who also reported the bug in the first place.
Also thanks to Michael Paquier for reviewing.

Discussion: https://postgr.es/m/0A3221C70F24FB45833433255569204D1F6BE40D@G01JPEXMBYT05
2017-09-07 09:49:55 -04:00
Simon Riggs
f06588a8e6 Exclude special values in recovery_target_time
recovery_target_time accepts timestamp input, though
does not allow use of special values, e.g. “today”.
Report a useful error message for these cases.

Reported-by: Piotr Stefaniak
Author: Simon Riggs
Discussion: https://postgr.es/m/CANP8+jJdKA+BkkYLWz9zAm16Y0s2ExBv0WfpAwXdTpPfWnA9Bg@mail.gmail.com
2017-09-07 04:56:34 -07:00
Simon Riggs
5b6d13eec7 Allow SET STATISTICS on expression indexes
Index columns are referenced by ordinal number rather than name, e.g.
CREATE INDEX coord_idx ON measured (x, y, (z + t));
ALTER INDEX coord_idx ALTER COLUMN 3 SET STATISTICS 1000;

Incompatibility note for release notes:
\d+ for indexes now also displays Stats Target

Authors: Alexander Korotkov, with contribution by Adrien NAYRAT
Review: Adrien NAYRAT, Simon Riggs
Wordsmith: Simon Riggs
2017-09-06 13:46:01 -07:00
Tom Lane
8689e38263 Clean up handling of dropped columns in NAMEDTUPLESTORE RTEs.
The NAMEDTUPLESTORE patch piggybacked on the infrastructure for
TABLEFUNC/VALUES/CTE RTEs, none of which can ever have dropped columns,
so the possibility was ignored most places.  Fix that, including adding a
specification to parsenodes.h about what it's supposed to look like.

In passing, clean up assorted comments that hadn't been maintained
properly by said patch.

Per bug #14799 from Philippe Beaudoin.  Back-patch to v10.

Discussion: https://postgr.es/m/20170906120005.25630.84360@wrigleys.postgresql.org
2017-09-06 10:41:05 -04:00
Tom Lane
6e427aa4e5 Use lfirst_node() and linitial_node() where appropriate in planner.c.
There's no particular reason to target this module for the first
wholesale application of these macros; but we gotta start somewhere.

Ashutosh Bapat and Jeevan Chalke

Discussion: https://postgr.es/m/CAFjFpRcNr3r=u0ni=7A4GD9NnHQVq+dkFafzqo2rS6zy=dt1eg@mail.gmail.com
2017-09-05 15:57:48 -04:00
Peter Eisentraut
17273d059c Remove unnecessary parentheses in return statements
The parenthesized style has only been used in a few modules.  Change
that to use the style that is predominant across the whole tree.

Reviewed-by: Michael Paquier <michael.paquier@gmail.com>
Reviewed-by: Ryan Murphy <ryanfmurphy@gmail.com>
2017-09-05 14:52:55 -04:00
Alvaro Herrera
ebd346caf4 Correct base backup throttling
Throttling for sending a base backup in walsender is broken for the case
where there is a lot of WAL traffic, because the latch used to put the
walsender to sleep is also signalled by regular WAL traffic (and each
signal causes an additional batch of data to be sent); the net effect is
that there is no or little actual throttling.  This is undesirable, so
rewrite the sleep into a loop to achieve the desired effeect.

Author: Jeff Janes, small tweaks by me
Reviewed-by: Antonin Houska
Discussion: https://postgr.es/m/CAMkU=1xH6mde-yL-Eo1TKBGNd0PB1-TMxvrNvqcAkN-qr2E9mw@mail.gmail.com
2017-09-05 17:27:30 +02:00
Tom Lane
4faa1dc2eb Suppress compiler warnings in dshash.c.
Some compilers complain, not unreasonably, about left-shifting an
int32 "1" and then assigning the result to an int64.  In practice
I sure hope that this data structure never gets large enough that
an overflow would actually occur; but let's cast the constant to
the right type to avoid the hazard.

In passing, fix a typo in dshash.h.

Amit Kapila, adjusted as per comment from Thomas Munro.

Discussion: https://postgr.es/m/CAA4eK1+5vfVMYtjK_NX8O3-42yM3o80qdqWnQzGquPrbq6mb+A@mail.gmail.com
2017-09-03 11:12:29 -04:00
Tom Lane
51daa7bdb3 Improve division of labor between execParallel.c and nodeGather[Merge].c.
Move the responsibility for creating/destroying TupleQueueReaders into
execParallel.c, to avoid duplicative coding in nodeGather.c and
nodeGatherMerge.c.  Also, instead of having DestroyTupleQueueReader do
shm_mq_detach, do it in the caller (which is now only ExecParallelFinish).
This means execParallel.c does both the attaching and detaching of the
tuple-queue-reader shm_mqs, which seems less weird than the previous
arrangement.

These changes also eliminate a vestigial memory leak (of the pei->tqueue
array).  It's now demonstrable that rescans of Gather or GatherMerge don't
leak memory.

Discussion: https://postgr.es/m/8670.1504192177@sss.pgh.pa.us
2017-09-01 17:39:01 -04:00
Peter Eisentraut
c039ba0716 Add memory info to getrusage output
Add the maxrss field to the getrusage output (log_*_stats).  This was
previously omitted because of portability concerns, but we feel this
might not be a concern anymore.

based on patch by Justin Pryzby <pryzby@telsasoft.com>
2017-09-01 15:36:33 -04:00
Robert Haas
0cb8b7531d Tighten up some code in RelationBuildPartitionDesc.
This probably doesn't save anything meaningful in terms of
performance, but making the code simpler is a good idea anyway.

Code by Beena Emerson, extracted from a larger patch by Jeevan
Ladhe, slightly adjusted by me.

Discussion: http://postgr.es/m/CAOgcT0ONgwajdtkoq+AuYkdTPY9cLWWLjxt_k4SXue3eieAr+g@mail.gmail.com
2017-09-01 15:16:44 -04:00
Robert Haas
baaf272ac9 Use group updates when setting transaction status in clog.
Commit 0e141c0fbb introduced a mechanism
to reduce contention on ProcArrayLock by having a single process clear
XIDs in the procArray on behalf of multiple processes, reducing the
need to hand the lock around.  A previous attempt to introduce a similar
mechanism for CLogControlLock in ccce90b398
crashed and burned, but the design problem which resulted in those
failures is believed to have been corrected in this version.

Amit Kapila, with some cosmetic changes by me.  See the previous commit
message for additional credits.

Discussion: http://postgr.es/m/CAA4eK1KudxzgWhuywY_X=yeSAhJMT4DwCjroV5Ay60xaeB2Eew@mail.gmail.com
2017-09-01 11:45:40 -04:00
Alvaro Herrera
a6979c3a68 Restore behavior for replication origin drop
Do for replication origins what the previous commit did for replication
slots: restore the original behavior of replication origin drop to raise
an error rather than blocking, because users might be depending on the
original behavior.  Maintain the blocking behavior when invoked
internally from logical replication subscription handling.

Discussion: https://postgr.es/m/20170830133922.tlpo3lgfejm4n2cs@alvherre.pgsql
2017-09-01 16:30:02 +02:00
Alvaro Herrera
be7161566d Add a WAIT option to DROP_REPLICATION_SLOT
Commit 9915de6c1c changed the default behavior of
DROP_REPLICATION_SLOT so that it would wait until any session holding
the slot active would release it, instead of raising an error.  But
users are already depending on the original behavior, so revert to it by
default and add a WAIT option to invoke the new behavior.

Per complaint from Simone Gotti, in
Discussion: https://postgr.es/m/CAEvsy6Wgdf90O6pUvg2wSVXL2omH5OPC-38OD4Zzgk-FXavj3Q@mail.gmail.com
2017-09-01 13:44:14 +02:00
Robert Haas
7b69b6ceb8 Fix assorted carelessness about Datum vs. int64 vs. uint64
Bugs introduced by commit 81c5e46c49
2017-09-01 00:14:54 -04:00
Robert Haas
0d9506d125 Try to repair poorly-considered code in previous commit. 2017-08-31 23:09:00 -04:00
Robert Haas
81c5e46c49 Introduce 64-bit hash functions with a 64-bit seed.
This will be useful for hash partitioning, which needs a way to seed
the hash functions to avoid problems such as a hash index on a hash
partitioned table clumping all values into a small portion of the
bucket space; it's also useful for anything that wants a 64-bit hash
value rather than a 32-bit hash value.

Just in case somebody wants a 64-bit hash value that is compatible
with the existing 32-bit hash values, make the low 32-bits of the
64-bit hash value match the 32-bit hash value when the seed is 0.

Robert Haas and Amul Sul

Discussion: http://postgr.es/m/CA+Tgmoafx2yoJuhCQQOL5CocEi-w_uG4S2xT0EtgiJnPGcHW3g@mail.gmail.com
2017-08-31 22:21:21 -04:00
Tom Lane
2d44c58c79 Avoid memory leaks when a GatherMerge node is rescanned.
Rescanning a GatherMerge led to leaking some memory in the executor's
query-lifespan context, because most of the node's working data structures
were simply abandoned and rebuilt from scratch.  In practice, this might
never amount to much, given the cost of relaunching worker processes ---
but it's still pretty messy, so let's fix it.

We can rearrange things so that the tuple arrays are simply cleared and
reused, and we don't need to rebuild the TupleTableSlots either, just
clear them.  One small complication is that because we might get a
different number of workers on each iteration, we can't keep the old
convention that the leader's gm_slots[] entry is the last one; the leader
might clobber a TupleTableSlot that we need for a worker in a future
iteration.  Hence, adjust the logic so that the leader has slot 0 always,
while the active workers have slots 1..n.

Back-patch to v10 to keep all the existing versions of nodeGatherMerge.c
in sync --- because of the renumbering of the slots, there would otherwise
be a very large risk that any future backpatches in this module would
introduce bugs.

Discussion: https://postgr.es/m/8670.1504192177@sss.pgh.pa.us
2017-08-31 16:21:05 -04:00
Robert Haas
30833ba154 Expand partitioned tables in PartDesc order.
Previously, we expanded the inheritance hierarchy in the order in
which find_all_inheritors had locked the tables, but that turns out
to block quite a bit of useful optimization.  For example, a
partition-wise join can't count on two tables with matching bounds
to get expanded in the same order.

Where possible, this change results in expanding partitioned tables in
*bound* order.  Bound order isn't well-defined for a list-partitioned
table with a null-accepting partition or for a list-partitioned table
where the bounds for a single partition are interleaved with other
partitions.  However, when expansion in bound order is possible, it
opens up further opportunities for optimization, such as
strength-reducing MergeAppend to Append when the expansion order
matches the desired sort order.

Patch by me, with cosmetic revisions by Ashutosh Bapat.

Discussion: http://postgr.es/m/CA+TgmoZrKj7kEzcMSum3aXV4eyvvbh9WD=c6m=002WMheDyE3A@mail.gmail.com
2017-08-31 15:50:18 -04:00
Tom Lane
6708e447ef Clean up shm_mq cleanup.
The logic around shm_mq_detach was a few bricks shy of a load, because
(contrary to the comments for shm_mq_attach) all it did was update the
shared shm_mq state.  That left us leaking a bit of process-local
memory, but much worse, the on_dsm_detach callback for shm_mq_detach
was still armed.  That means that whenever we ultimately detach from
the DSM segment, we'd run shm_mq_detach again for already-detached,
possibly long-dead queues.  This accidentally fails to fail today,
because we only ever re-use a shm_mq's memory for another shm_mq, and
multiple detach attempts on the last such shm_mq are fairly harmless.
But it's gonna bite us someday, so let's clean it up.

To do that, change shm_mq_detach's API so it takes a shm_mq_handle
not the underlying shm_mq.  This makes the callers simpler in most
cases anyway.  Also fix a few places in parallel.c that were just
pfree'ing the handle structs rather than doing proper cleanup.

Back-patch to v10 because of the risk that the revenant shm_mq_detach
callbacks would cause a live bug sometime.  Since this is an API
change, it's too late to do it in 9.6.  (We could make a variant
patch that preserves API, but I'm not excited enough to do that.)

Discussion: https://postgr.es/m/8670.1504192177@sss.pgh.pa.us
2017-08-31 15:10:24 -04:00
Tom Lane
04e9678614 Code review for nodeGatherMerge.c.
Comment the fields of GatherMergeState, and organize them a bit more
sensibly.  Comment GMReaderTupleBuffer more usefully too.  Improve
assorted other comments that were obsolete or just not very good English.

Get rid of the use of a GMReaderTupleBuffer for the leader process;
that was confusing, since only the "done" field was used, and that
in a way redundant with need_to_scan_locally.

In gather_merge_init, avoid calling load_tuple_array for
already-known-exhausted workers.  I'm not sure if there's a live bug there,
but the case is unlikely to be well tested due to timing considerations.

Remove some useless code, such as duplicating the tts_isempty test done by
TupIsNull.

Remove useless initialization of ps.qual, replacing that with an assertion
that we have no qual to check.  (If we did, the code would fail to check
it.)

Avoid applying heap_copytuple to a null tuple.  While that fails to crash,
it's confusing and it makes the code less legible not more so IMO.

Propagate a couple of these changes into nodeGather.c, as well.

Back-patch to v10, partly because of the possibility that the
gather_merge_init change is fixing a live bug, but mostly to keep
the branches in sync to ease future bug fixes.
2017-08-30 17:21:08 -04:00
Tom Lane
41b0dd987d Separate reinitialization of shared parallel-scan state from ExecReScan.
Previously, the parallel executor logic did reinitialization of shared
state within the ExecReScan code for parallel-aware scan nodes.  This is
problematic, because it means that the ExecReScan call has to occur
synchronously (ie, during the parent Gather node's ReScan call).  That is
swimming very much against the tide so far as the ExecReScan machinery is
concerned; the fact that it works at all today depends on a lot of fragile
assumptions, such as that no plan node between Gather and a parallel-aware
scan node is parameterized.  Another objection is that because ExecReScan
might be called in workers as well as the leader, hacky extra tests are
needed in some places to prevent unwanted shared-state resets.

Hence, let's separate this code into two functions, a ReInitializeDSM
call and the ReScan call proper.  ReInitializeDSM is called only in
the leader and is guaranteed to run before we start new workers.
ReScan is returned to its traditional function of resetting only local
state, which means that ExecReScan's usual habits of delaying or
eliminating child rescan calls are safe again.

As with the preceding commit 7df2c1f8d, it doesn't seem to be necessary
to make these changes in 9.6, which is a good thing because the FDW and
CustomScan APIs are impacted.

Discussion: https://postgr.es/m/CAA4eK1JkByysFJNh9M349u_nNjqETuEnY_y1VUc_kJiU0bxtaQ@mail.gmail.com
2017-08-30 13:18:16 -04:00
Tom Lane
7df2c1f8da Force rescanning of parallel-aware scan nodes below a Gather[Merge].
The ExecReScan machinery contains various optimizations for postponing
or skipping rescans of plan subtrees; for example a HashAgg node may
conclude that it can re-use the table it built before, instead of
re-reading its input subtree.  But that is wrong if the input contains
a parallel-aware table scan node, since the portion of the table scanned
by the leader process is likely to vary from one rescan to the next.
This explains the timing-dependent buildfarm failures we saw after
commit a2b70c89c.

The established mechanism for showing that a plan node's output is
potentially variable is to mark it as depending on some runtime Param.
Hence, to fix this, invent a dummy Param (one that has a PARAM_EXEC
parameter number, but carries no actual value) associated with each Gather
or GatherMerge node, mark parallel-aware nodes below that node as dependent
on that Param, and arrange for ExecReScanGather[Merge] to flag that Param
as changed whenever the Gather[Merge] node is rescanned.

This solution breaks an undocumented assumption made by the parallel
executor logic, namely that all rescans of nodes below a Gather[Merge]
will happen synchronously during the ReScan of the top node itself.
But that's fundamentally contrary to the design of the ExecReScan code,
and so was doomed to fail someday anyway (even if you want to argue
that the bug being fixed here wasn't a failure of that assumption).
A follow-on patch will address that issue.  In the meantime, the worst
that's expected to happen is that given very bad timing luck, the leader
might have to do all the work during a rescan, because workers think
they have nothing to do, if they are able to start up before the eventual
ReScan of the leader's parallel-aware table scan node has reset the
shared scan state.

Although this problem exists in 9.6, there does not seem to be any way
for it to manifest there.  Without GatherMerge, it seems that a plan tree
that has a rescan-short-circuiting node below Gather will always also
have one above it that will short-circuit in the same cases, preventing
the Gather from being rescanned.  Hence we won't take the risk of
back-patching this change into 9.6.  But v10 needs it.

Discussion: https://postgr.es/m/CAA4eK1JkByysFJNh9M349u_nNjqETuEnY_y1VUc_kJiU0bxtaQ@mail.gmail.com
2017-08-30 09:29:55 -04:00
Robert Haas
bf11e7ee2e Propagate sort instrumentation from workers back to leader.
Up until now, when parallel query was used, no details about the
sort method or space used by the workers were available; details
were shown only for any sorting done by the leader.  Fix that.

Commit 1177ab1dab forced the test case
added by commit 1f6d515a67 to run
without parallelism; now that we have this infrastructure, allow
that again, with a little tweaking to make it pass with and without
force_parallel_mode.

Robert Haas and Tom Lane

Discussion: http://postgr.es/m/CA+Tgmoa2VBZW6S8AAXfhpHczb=Rf6RqQ2br+zJvEgwJ0uoD_tQ@mail.gmail.com
2017-08-29 13:26:33 -04:00
Robert Haas
3452dc5240 Push tuple limits through Gather and Gather Merge.
If we only need, say, 10 tuples in total, then we certainly don't need
more than 10 tuples from any single process.  Pushing down the limit
lets workers exit early when possible.  For Gather Merge, there is
an additional benefit: a Sort immediately below the Gather Merge can
be done as a bounded sort if there is an applicable limit.

Robert Haas and Tom Lane

Discussion: http://postgr.es/m/CA+TgmoYa3QKKrLj5rX7UvGqhH73G1Li4B-EKxrmASaca2tFu9Q@mail.gmail.com
2017-08-29 13:16:55 -04:00
Tom Lane
3f4c7917b3 Code review for pushing LIMIT through subqueries.
Minor improvements for commit 1f6d515a6.  We do not need the (rather
expensive) test for SRFs in the targetlist, because since v10 any
such SRFs would appear in separate ProjectSet nodes.  Also, make the
code look more like the existing cases by turning it into a simple
recursion --- the argument that there might be some performance
benefit to contorting the code seems unfounded to me, especially since
any good compiler should turn the tail-recursion into iteration anyway.

Discussion: http://postgr.es/m/CADE5jYLuugnEEUsyW6Q_4mZFYTxHxaVCQmGAsF0yiY8ZDggi-w@mail.gmail.com
2017-08-25 09:05:17 -04:00
Andres Freund
d7694fc148 Consolidate the function pointer types used by dshash.c.
Commit 8c0d7bafad introduced dshash with hash
and compare functions like DynaHash's, and also variants that take a user
data pointer instead of size.  Simplify the interface by merging them into
a single pair of function pointer types that take both size and a user data
pointer.

Since it is anticipated that memcmp and tag_hash behavior will be a common
requirement, provide wrapper functions dshash_memcmp and dshash_memhash that
conform to the new function types.

Author: Thomas Munro
Reviewed-By: Andres Freund
Discussion: https://postgr.es/m/20170823054644.efuzftxjpfi6wwqs%40alap3.anarazel.de
2017-08-24 17:01:36 -07:00
Andres Freund
4569715bd6 Fix unlikely shared memory leak after failure in dshash_create().
Tidy-up for commit 8c0d7bafad, based on a
complaint from Andres Freund.

Author: Thomas Munro
Reviewed-By: Andres Freund
Discussion: https://postgr.es/m/20170823054644.efuzftxjpfi6wwqs%40alap3.anarazel.de
2017-08-24 16:58:30 -07:00
Andres Freund
20fbf25533 Fix harmless thinko in dsa.c.
Commit 16be2fd100 added DSA_ALLOC_HUGE,
DSA_ALLOC_ZERO and DSA_ALLOC_NO_OOM which have the same numerical
values and meanings as the similarly named MCXT_... macros.  In one
place we accidentally used MCXT_ALLOC_NO_OOM when DSA_ALLOC_NO_OOM is
wanted, so tidy that up.

Author: Thomas Munro
Discussion: http://postgr.es/m/CAEepm=2AimHxVkkxnMfQvbZMkXy0uKbVa0-D38c5-qwrCm4CMQ@mail.gmail.com
Backpatch: 10, where dsa was introduced.
2017-08-24 15:07:40 -07:00
Peter Eisentraut
1e1b01cd16 Fix outdated comment
Author: Thomas Munro <thomas.munro@enterprisedb.com>
2017-08-23 14:19:35 -04:00
Peter Eisentraut
237a0b87b1 Improve plural handling in error message
This does not use the normal plural handling, because no numbers appear
in the actual message.
2017-08-23 13:56:59 -04:00
Peter Eisentraut
85f4d6393d Tweak some SCRAM error messages and code comments
Clarify/correct some error messages, fix up some code comments that
confused SASL and SCRAM, and other minor fixes.  No changes in
functionality.
2017-08-23 12:29:38 -04:00
Peter Eisentraut
580ddcec39 Fix translation marker
This was erroneously removed in
55a70a023c.
2017-08-23 09:56:38 -04:00
Andres Freund
8c0d7bafad Hash tables backed by DSA shared memory.
Add general purpose chaining hash tables for DSA memory.  Unlike
DynaHash in shared memory mode, these hash tables can grow as
required, and cope with being mapped into different addresses in
different backends.

There is a wide range of potential users for such a hash table, though
it's very likely the interface will need to evolve as we come to
understand the needs of different kinds of users.  E.g support for
iterators and incremental resizing is planned for later commits and
the details of the callback signatures are likely to change.

Author: Thomas Munro
Reviewed-By: John Gorman, Andres Freund, Dilip Kumar, Robert Haas
Discussion:
	https://postgr.es/m/CAEepm=3d8o8XdVwYT6O=bHKsKAM2pu2D6sV1S_=4d+jStVCE7w@mail.gmail.com
	https://postgr.es/m/CAEepm=0ZtQ-SpsgCyzzYpsXS6e=kZWqk3g5Ygn3MDV7A8dabUA@mail.gmail.com
2017-08-22 22:43:07 -07:00
Andres Freund
35ea75632a Refactor typcache.c's record typmod hash table.
Previously, tuple descriptors were stored in chains keyed by a fixed size
array of OIDs.  That meant there were effectively two levels of collision
chain -- one inside and one outside the hash table.  Instead, let dynahash.c
look after conflicts for us by supplying a proper hash and equal function
pair.

This is a nice cleanup on its own, but also simplifies followup
changes allowing blessed TupleDescs to be shared between backends
participating in parallel query.

Author: Thomas Munro
Reviewed-By: Andres Freund
Discussion: https://postgr.es/m/CAEepm%3D34GVhOL%2BarUx56yx7OPk7%3DqpGsv3CpO54feqjAwQKm5g%40mail.gmail.com
2017-08-22 16:11:54 -07:00
Peter Eisentraut
2bfd1b1ee5 Don't install ICU collation keyword variants
Users can still create them themselves.  Instead, document Unicode TR 35
collation options for ICU, so users can create all this themselves.

Reviewed-by: Peter Geoghegan <pg@bowt.ie>
2017-08-21 19:21:07 -04:00
Peter Eisentraut
51e225da30 Expand set of predefined ICU locales
Install language+region combinations even if they are not distinct from
the language's base locale.  This gives better long-term stability of
the set of predefined locales and makes the predefined locales less
implementation-dependent and more practical for users.

Reviewed-by: Peter Geoghegan <pg@bowt.ie>
2017-08-21 19:21:07 -04:00
Robert Haas
1f6d515a67 Push limit through subqueries to underlying sort, where possible.
Douglas Doole, reviewed by Ashutosh Bapat and by me.  Minor formatting
change by me.

Discussion: http://postgr.es/m/CADE5jYLuugnEEUsyW6Q_4mZFYTxHxaVCQmGAsF0yiY8ZDggi-w@mail.gmail.com
2017-08-21 14:19:44 -04:00
Robert Haas
79ccd7cbd5 pg_prewarm: Add automatic prewarm feature.
Periodically while the server is running, and at shutdown, write out a
list of blocks in shared buffers.  When the server reaches consistency
-- unfortunatey, we can't do it before that point without breaking
things -- reload those blocks into any still-unused shared buffers.

Mithun Cy and Robert Haas, reviewed and tested by Beena Emerson,
Amit Kapila, Jim Nasby, and Rafia Sabih.

Discussion: http://postgr.es/m/CAD__OugubOs1Vy7kgF6xTjmEqTR4CrGAv8w+ZbaY_+MZeitukw@mail.gmail.com
2017-08-21 14:17:39 -04:00
Noah Misch
66ed3829df Inject $(ICU_LIBS) regardless of platform.
It appeared in a conditional that excludes AIX, Cygwin and MinGW.  Give
ICU support a chance to work on those platforms.  Back-patch to v10,
where ICU support was introduced.
2017-08-20 21:22:18 -07:00
Andres Freund
c6293249dc Partially flatten struct tupleDesc so that it can be used in DSM.
TupleDesc's attributes were already stored in contiguous memory after the
struct.  Go one step further and get rid of the array of pointers to
attributes so that they can be stored in shared memory mapped at different
addresses in each backend.  This won't work for TupleDescs with contraints
and defaults, since those point to other objects, but for many purposes
only attributes are needed.

Author: Thomas Munro
Reviewed-By: Andres Freund
Discussion: https://postgr.es/m/CAEepm=0ZtQ-SpsgCyzzYpsXS6e=kZWqk3g5Ygn3MDV7A8dabUA@mail.gmail.com
2017-08-20 11:19:12 -07:00
Andres Freund
2cd7084524 Change tupledesc->attrs[n] to TupleDescAttr(tupledesc, n).
This is a mechanical change in preparation for a later commit that
will change the layout of TupleDesc.  Introducing a macro to abstract
the details of where attributes are stored will allow us to change
that in separate step and revise it in future.

Author: Thomas Munro, editorialized by Andres Freund
Reviewed-By: Andres Freund
Discussion: https://postgr.es/m/CAEepm=0ZtQ-SpsgCyzzYpsXS6e=kZWqk3g5Ygn3MDV7A8dabUA@mail.gmail.com
2017-08-20 11:19:07 -07:00