Commit graph

9405 commits

Author SHA1 Message Date
Fujii Masao
b5310e4ff6 Remove non-fast promotion.
When fast promotion was supported in 9.3, non-fast promotion became
undocumented feature and it's basically not available for ordinary users.
However we decided not to remove non-fast promotion at that moment,
to leave it for a release or two for debugging purpose or as an emergency
method because fast promotion might have some issues, and then to
remove it later. Now, several versions were released since that decision
and there is no longer reason to keep supporting non-fast promotion.
Therefore this commit removes non-fast promotion.

Author: Fujii Masao
Reviewed-by: Hamid Akhtar, Kyotaro Horiguchi
Discussion: https://postgr.es/m/76066434-648f-f567-437b-54853b43398f@oss.nttdata.com
2020-07-29 21:24:26 +09:00
Jeff Davis
9878b643f3 HashAgg: use better cardinality estimate for recursive spilling.
Use HyperLogLog to estimate the group cardinality in a spilled
partition. This estimate is used to choose the number of partitions if
we recurse.

The previous behavior was to use the number of tuples in a spilled
partition as the estimate for the number of groups, which lead to
overpartitioning. That could cause the number of batches to be much
higher than expected (with each batch being very small), which made it
harder to interpret EXPLAIN ANALYZE results.

Reviewed-by: Peter Geoghegan
Discussion: https://postgr.es/m/a856635f9284bc36f7a77d02f47bbb6aaf7b59b3.camel@j-davis.com
Backpatch-through: 13
2020-07-28 23:16:28 -07:00
Thomas Munro
cb04ad4985 Move syncscan.c to src/backend/access/common.
Since the tableam.c code needs to make use of the syncscan.c routines
itself, and since other block-oriented AMs might also want to use it one
day, it didn't make sense for it to live under src/backend/access/heap.

Reviewed-by: Andres Freund <andres@anarazel.de>
Discussion: https://postgr.es/m/CA%2BhUKGLCnG%3DNEAByg6bk%2BCT9JZD97Y%3DAxKhh27Su9FeGWOKvDg%40mail.gmail.com
2020-07-29 16:59:33 +12:00
Amit Kapila
45fdc9738b Extend the logical decoding output plugin API with stream methods.
This adds seven methods to the output plugin API, adding support for
streaming changes of large in-progress transactions.

* stream_start
* stream_stop
* stream_abort
* stream_commit
* stream_change
* stream_message
* stream_truncate

Most of this is a simple extension of the existing methods, with
the semantic difference that the transaction (or subtransaction)
is incomplete and may be aborted later (which is something the
regular API does not really need to deal with).

This also extends the 'test_decoding' plugin, implementing these
new stream methods.

The stream_start/start_stop are used to demarcate a chunk of changes
streamed for a particular toplevel transaction.

This commit simply adds these new APIs and the upcoming patch to "allow
the streaming mode in ReorderBuffer" will use these APIs.

Author: Tomas Vondra, Dilip Kumar, Amit Kapila
Reviewed-by: Amit Kapila
Tested-by: Neha Sharma and Mahendra Singh Thalor
Discussion: https://postgr.es/m/688b0b7f-2f6c-d827-c27b-216a8e3ea700@2ndquadrant.com
2020-07-28 08:09:44 +05:30
Peter Geoghegan
bcbf9446a2 Remove hashagg_avoid_disk_plan GUC.
Note: This GUC was originally named enable_hashagg_disk when it appeared
in commit 1f39bce0, which added disk-based hash aggregation.  It was
subsequently renamed in commit 92c58fd9.

Author: Peter Geoghegan
Reviewed-By: Jeff Davis, Álvaro Herrera
Discussion: https://postgr.es/m/9d9d1e1252a52ea1bad84ea40dbebfd54e672a0f.camel%40j-davis.com
Backpatch: 13-, where disk-based hash aggregation was introduced.
2020-07-27 17:53:19 -07:00
Jeff Davis
200f6100a9 Fix LookupTupleHashEntryHash() pipeline-stall issue.
Refactor hash lookups in nodeAgg.c to improve performance.

Author: Andres Freund and Jeff Davis
Discussion: https://postgr.es/m/20200612213715.op4ye4q7gktqvpuo%40alap3.anarazel.de
Backpatch-through: 13
2020-07-26 15:09:46 -07:00
David Rowley
56788d2156 Allocate consecutive blocks during parallel seqscans
Previously we would allocate blocks to parallel workers during a parallel
sequential scan 1 block at a time.  Since other workers were likely to
request a block before a worker returns for another block number to work
on, this could lead to non-sequential I/O patterns in each worker which
could cause the operating system's readahead to perform poorly or not at
all.

Here we change things so that we allocate consecutive "chunks" of blocks
to workers and have them work on those until they're done, at which time
we allocate another chunk for the worker.  The size of these chunks is
based on the size of the relation.

Initial patch here was by Thomas Munro which showed some good improvements
just having a fixed chunk size of 64 blocks with a simple ramp-down near
the end of the scan. The revisions of the patch to make the chunk size
based on the relation size and the adjusted ramp-down in powers of two was
done by me, along with quite extensive benchmarking to determine the
optimal chunk sizes.

For the most part, benchmarks have shown significant performance
improvements for large parallel sequential scans on Linux, FreeBSD and
Windows using SSDs.  It's less clear how this affects the performance of
cloud providers.  Tests done so far are unable to obtain stable enough
performance to provide meaningful benchmark results.  It is possible that
this could cause some performance regressions on more obscure filesystems,
so we may need to later provide users with some ability to get something
closer to the old behavior.  For now, let's leave that until we see that
it's really required.

Author: Thomas Munro, David Rowley
Reviewed-by: Ranier Vilela, Soumyadeep Chakraborty, Robert Haas
Reviewed-by: Amit Kapila, Kirk Jamison
Discussion: https://postgr.es/m/CA+hUKGJ_EErDv41YycXcbMbCBkztA34+z1ts9VQH+ACRuvpxig@mail.gmail.com
2020-07-26 21:02:45 +12:00
Tom Lane
8a37951eeb Mark built-in coercion functions as leakproof where possible.
Making these leakproof seems helpful since (for example) if you have a
function f(int8) that is leakproof, you don't want it to effectively
become non-leakproof when you apply it to an int4 or int2 column.
But that's what happens today, since the implicit up-coercion will
not be leakproof.

Most of the coercion functions that visibly can't throw errors are
functions that convert numeric datatypes to other, wider ones.
Notable is that float4_numeric and float8_numeric can be marked
leakproof; before commit a57d312a7 they could not have been.
I also marked the functions that coerce strings to "name" as leakproof;
that's okay today because they truncate silently, but if we ever
reconsidered that behavior then they could no longer be leakproof.

I desisted from marking rtrim1() as leakproof; it appears so right now,
but the code seems a little too complex and perhaps subject to change,
since it's shared with other SQL functions.

Discussion: https://postgr.es/m/459322.1595607431@sss.pgh.pa.us
2020-07-25 12:54:58 -04:00
Tom Lane
79d6d1a277 Replace TS_execute's TS_EXEC_CALC_NOT flag with TS_EXEC_SKIP_NOT.
It's fairly silly that ignoring NOT subexpressions is TS_execute's
default behavior.  It's wrong on its face and it encourages errors
of omission.  Moreover, the only two remaining callers that aren't
specifying CALC_NOT are in ts_headline calculations, and it's very
arguable that those are bugs: if you've specified "!foo" in your
query, why would you want to get a headline that includes "foo"?

Hence, rip that out and change the default behavior to be to calculate
NOT accurately.  As a concession to the slim chance that there is still
somebody somewhere who needs the incorrect behavior, provide a new
SKIP_NOT flag to explicitly request that.

Back-patch into v13, mainly because it seems better to change this
at the same time as the previous commit's rejiggering of TS_execute
related APIs.  Any outside callers affected by this change are
probably also affected by that one.

Discussion: https://postgr.es/m/CALT9ZEE-aLotzBg-pOp2GFTesGWVYzXA3=mZKzRDa_OKnLF7Mg@mail.gmail.com
2020-07-24 15:43:56 -04:00
Tom Lane
2f2007fbb2 Fix assorted bugs by changing TS_execute's callback API to ternary logic.
Text search sometimes failed to find valid matches, for instance
'!crew:A'::tsquery might fail to locate 'crew:1B'::tsvector during
an index search.  The root of the issue is that TS_execute's callback
functions were not changed to use ternary (yes/no/maybe) reporting
when we made the search logic itself do so.  It's somewhat annoying
to break that API, but on the other hand we now see that any code
using plain boolean logic is almost certainly broken since the
addition of phrase search.  There seem to be very few outside callers
of this code anyway, so we'll just break them intentionally to get
them to adapt.

This allows removal of tsginidx.c's private re-implementation of
TS_execute, since that's now entirely duplicative.  It's also no
longer necessary to avoid use of CALC_NOT in tsgistidx.c, since
the underlying callbacks can now do something reasonable.

Back-patch into v13.  We can't change this in stable branches,
but it seems not quite too late to fix it in v13.

Tom Lane and Pavel Borisov

Discussion: https://postgr.es/m/CALT9ZEE-aLotzBg-pOp2GFTesGWVYzXA3=mZKzRDa_OKnLF7Mg@mail.gmail.com
2020-07-24 15:26:51 -04:00
Peter Eisentraut
25244b8972 Rename configure.in to configure.ac
The new name has been preferred by Autoconf for a long time.  Future
versions of Autoconf will warn about the old name.

Discussion: https://www.postgresql.org/message-id/flat/e796c185-5ece-8569-248f-dd3799701be1%402ndquadrant.com
2020-07-24 10:42:08 +02:00
Amit Kapila
c55040ccd0 WAL Log invalidations at command end with wal_level=logical.
When wal_level=logical, write invalidations at command end into WAL so
that decoding can use this information.

This patch is required to allow the streaming of in-progress transactions
in logical decoding.  The actual work to allow streaming will be committed
as a separate patch.

We still add the invalidations to the cache and write them to WAL at
commit time in RecordTransactionCommit(). This uses the existing
XLOG_INVALIDATIONS xlog record type, from the RM_STANDBY_ID resource
manager (see LogStandbyInvalidations for details).

So existing code relying on those invalidations (e.g. redo) does not need
to be changed.

The invalidations written at command end uses a new xlog record type
XLOG_XACT_INVALIDATIONS, from RM_XACT_ID resource manager. See
LogLogicalInvalidations for details.

These new xlog records are ignored by existing redo procedures, which
still rely on the invalidations written to commit records.

The invalidations are decoded and accumulated in top-transaction, and then
executed during replay.  This obviates the need to decode the
invalidations as part of a commit record.

Bump XLOG_PAGE_MAGIC, since this introduces XLOG_XACT_INVALIDATIONS.

Author: Dilip Kumar, Tomas Vondra, Amit Kapila
Reviewed-by: Amit Kapila
Tested-by: Neha Sharma and Mahendra Singh Thalor
Discussion: https://postgr.es/m/688b0b7f-2f6c-d827-c27b-216a8e3ea700@2ndquadrant.com
2020-07-23 08:34:48 +05:30
Tom Lane
a57d312a77 Support infinity and -infinity in the numeric data type.
Add infinities that behave the same as they do in the floating-point
data types.  Aside from any intrinsic usefulness these may have,
this closes an important gap in our ability to convert floating
values to numeric and/or replace float-based APIs with numeric.

The new values are represented by bit patterns that were formerly
not used (although old code probably would take them for NaNs).
So there shouldn't be any pg_upgrade hazard.

Patch by me, reviewed by Dean Rasheed and Andrew Gierth

Discussion: https://postgr.es/m/606717.1591924582@sss.pgh.pa.us
2020-07-22 19:19:44 -04:00
Michael Paquier
e47c2602aa Fix comment in sha2.h
An incorrect reference to SHA-1 was present.

Author: Daniel Gustafsson
Discussion: https://postgr.es/m/FE26C953-FA87-4BB9-9105-AA1F8705B0D0@yesql.se
2020-07-22 10:16:21 +09:00
Peter Geoghegan
4a70f829d8 Add nbtree Valgrind buffer lock checks.
Holding just a buffer pin (with no buffer lock) on an nbtree buffer/page
provides very weak guarantees, especially compared to heapam, where it's
often safe to read a page while only holding a buffer pin.  This commit
has Valgrind enforce the following rule: it is never okay to access an
nbtree buffer without holding both a pin and a lock on the buffer.

A draft version of this patch detected questionable code that was
cleaned up by commits fa7ff642 and 7154aa16.  The code in question used
to access an nbtree buffer page's special/opaque area with no buffer
lock (only a buffer pin).  This practice (which isn't obviously unsafe)
is hereby formally disallowed in nbtree.  There doesn't seem to be any
reason to allow it, and banning it keeps things simple for Valgrind.

The new checks are implemented by adding custom nbtree client requests
(located in LockBuffer() wrapper functions); these requests are
"superimposed" on top of the generic bufmgr.c Valgrind client requests
added by commit 1e0dfd16.  No custom resource management cleanup code is
needed to undo the effects of marking buffers as non-accessible under
this scheme.

Author: Peter Geoghegan
Reviewed-By: Anastasia Lubennikova, Georgios Kokolatos
Discussion: https://postgr.es/m/CAH2-WzkLgyN3zBvRZ1pkNJThC=xi_0gpWRUb_45eexLH1+k2_Q@mail.gmail.com
2020-07-21 15:50:58 -07:00
Tom Lane
fc032bed2f Be more careful about marking catalog columns NOT NULL by default.
The bug fixed in commit 72eab84a5 would not have occurred if initdb
had a less surprising rule about which columns should be marked
NOT NULL by default.  Let's make that rule be strictly that the
column must be fixed-width and its predecessors must be fixed-width
and NOT NULL, removing the hacky and unsafe exceptions for oidvector
and int2vector.

Since we do still want all existing oidvector and int2vector columns
to be marked NOT NULL, we have to put BKI_FORCE_NOT_NULL labels on
them.  But making this less magic and more documented seems like a
good idea, even if it's a shade more verbose.

I didn't bump catversion since the initial catalog contents are
not actually changed by this patch.  Note however that the
contents of postgres.bki do change, and feeding an old copy of
that to a new backend will produce wrong results.

Discussion: https://postgr.es/m/204760.1595181800@sss.pgh.pa.us
2020-07-21 13:03:48 -04:00
Tom Lane
4fb6aeb4f6 Make floating-point "NaN / 0" return NaN instead of raising an error.
This is more consistent with the IEEE 754 spec and our treatment of
NaNs elsewhere; in particular, the case has always acted that way in
"numeric" arithmetic.

Noted by Dean Rasheed.

Discussion: https://postgr.es/m/3421746.1594927785@sss.pgh.pa.us
2020-07-20 19:44:45 -04:00
Tom Lane
0fa0b487b5 Correctly mark pg_subscription_rel.srsublsn as nullable.
The code has always set this column to NULL when it's not valid,
but the catalog header's description failed to reflect that,
as did the SGML docs, as did some of the code.  To prevent future
coding errors of the same ilk, let's hide the field from C code
as though it were variable-length (which, in a sense, it is).

As with commit 72eab84a5, we can only fix this cleanly in HEAD
and v13; the problem extends further back but we'll need some
klugery in the released branches.

Discussion: https://postgr.es/m/367660.1595202498@sss.pgh.pa.us
2020-07-20 14:55:56 -04:00
Tom Lane
d5daae47db Fix construction of updated-columns bitmap in logical replication.
Commit b9c130a1f failed to apply the publisher-to-subscriber column
mapping while checking which columns were updated.  Perhaps less
significantly, it didn't exclude dropped columns either.  This could
result in an incorrect updated-columns bitmap and thus wrong decisions
about whether to fire column-specific triggers on the subscriber while
applying updates.  In HEAD (since commit 9de77b545), it could also
result in accesses off the end of the colstatus array, as detected by
buildfarm member skink.  Fix the logic, and adjust 003_constraints.pl
so that the problem is exposed in unpatched code.

In HEAD, also add some assertions to check that we don't access off
the ends of these newly variable-sized arrays.

Back-patch to v10, as b9c130a1f was.

Discussion: https://postgr.es/m/CAH2-Wz=79hKQ4++c5A060RYbjTHgiYTHz=fw6mptCtgghH2gJA@mail.gmail.com
2020-07-20 13:40:16 -04:00
Fujii Masao
c3fe108c02 Rename wal_keep_segments to wal_keep_size.
max_slot_wal_keep_size that was added in v13 and wal_keep_segments are
the GUC parameters to specify how much WAL files to retain for
the standby servers. While max_slot_wal_keep_size accepts the number of
bytes of WAL files, wal_keep_segments accepts the number of WAL files.
This difference of setting units between those similar parameters could
be confusing to users.

To alleviate this situation, this commit renames wal_keep_segments to
wal_keep_size, and make users specify the WAL size in it instead of
the number of WAL files.

There was also the idea to rename max_slot_wal_keep_size to
max_slot_wal_keep_segments, in the discussion. But we have been moving
away from measuring in segments, for example, checkpoint_segments was
replaced by max_wal_size. So we concluded to rename wal_keep_segments
to wal_keep_size.

Back-patch to v13 where max_slot_wal_keep_size was added.

Author: Fujii Masao
Reviewed-by: Álvaro Herrera, Kyotaro Horiguchi, David Steele
Discussion: https://postgr.es/m/574b4ea3-e0f9-b175-ead2-ebea7faea855@oss.nttdata.com
2020-07-20 13:30:18 +09:00
Amit Kapila
0bead9af48 Immediately WAL-log subtransaction and top-level XID association.
The logical decoding infrastructure needs to know which top-level
transaction the subxact belongs to, in order to decode all the
changes. Until now that might be delayed until commit, due to the
caching (GPROC_MAX_CACHED_SUBXIDS), preventing features requiring
incremental decoding.

So we also write the assignment info into WAL immediately, as part
of the next WAL record (to minimize overhead) only when wal_level=logical.
We can not remove the existing XLOG_XACT_ASSIGNMENT WAL as that is
required for avoiding overflow in the hot standby snapshot.

Bump XLOG_PAGE_MAGIC, since this introduces XLR_BLOCK_ID_TOPLEVEL_XID.

Author: Tomas Vondra, Dilip Kumar, Amit Kapila
Reviewed-by: Amit Kapila
Tested-by: Neha Sharma and Mahendra Singh Thalor
Discussion: https://postgr.es/m/688b0b7f-2f6c-d827-c27b-216a8e3ea700@2ndquadrant.com
2020-07-20 08:48:26 +05:30
Fujii Masao
d05b172a76 Add generic_plans and custom_plans fields into pg_prepared_statements.
There was no easy way to find how many times generic and custom plans
have been executed for a prepared statement. This commit exposes those
numbers of times in pg_prepared_statements view.

Author: Atsushi Torikoshi, Kyotaro Horiguchi
Reviewed-by: Tatsuro Yamada, Masahiro Ikeda, Fujii Masao
Discussion: https://postgr.es/m/CACZ0uYHZ4M=NZpofH6JuPHeX=__5xcDELF8hT8_2T+R55w4RQw@mail.gmail.com
2020-07-20 11:55:50 +09:00
Tom Lane
72eab84a56 Correctly mark pg_subscription.subslotname as nullable.
Due to the layout of this catalog, subslotname has to be explicitly
marked BKI_FORCE_NULL, else initdb will default to the assumption
that it's non-nullable.  Since, in fact, CREATE/ALTER SUBSCRIPTION
will store null values there, the existing marking is just wrong,
and has been since this catalog was invented.

We haven't noticed because not much in the system actually depends
on attnotnull being truthful.  However, JIT'ed tuple deconstruction
does depend on that in some cases, allowing crashes or wrong answers
in queries that inspect pg_subscription.  Commit 9de77b545 quite
accidentally exposed this on the buildfarm members that force JIT
activation.

Back-patch to v13.  The problem goes further back, but we cannot
force initdb in released branches, so some klugier solution will
be needed there.  Before working on that, push this simple fix
to try to get the buildfarm back to green.

Discussion: https://postgr.es/m/4118109.1595096139@sss.pgh.pa.us
2020-07-19 12:37:23 -04:00
Peter Eisentraut
4d3db13621 Define OPENSSL_API_COMPAT
This avoids deprecation warnings from newer OpenSSL versions (3.0.0 in
particular).

Discussion: https://www.postgresql.org/message-id/flat/FEF81714-D479-4512-839B-C769D2605F8A%40yesql.se
2020-07-19 12:14:42 +02:00
Tom Lane
9b14280b20 Fix replication/worker_internal.h to compile without other headers.
This header hasn't changed recently, so the fact that it now fails
headerscheck/cpluspluscheck testing must be due to changes in what
it includes.  Probably f21916791 is to blame, but I didn't try to
verify that.

Discussion: https://postgr.es/m/3699703.1595016554@sss.pgh.pa.us
2020-07-18 14:58:18 -04:00
Tom Lane
9de77b5453 Allow logical replication to transfer data in binary format.
This patch adds a "binary" option to CREATE/ALTER SUBSCRIPTION.
When that's set, the publisher will send data using the data type's
typsend function if any, rather than typoutput.  This is generally
faster, if slightly less robust.

As committed, we won't try to transfer user-defined array or composite
types in binary, for fear that type OIDs won't match at the subscriber.
This might be changed later, but it seems like fit material for a
follow-on patch.

Dave Cramer, reviewed by Daniel Gustafsson, Petr Jelinek, and others;
adjusted some by me

Discussion: https://postgr.es/m/CADK3HH+R3xMn=8t3Ct+uD+qJ1KD=Hbif5NFMJ+d5DkoCzp6Vgw@mail.gmail.com
2020-07-18 12:44:51 -04:00
Peter Geoghegan
1e0dfd166b Add Valgrind buffer access instrumentation.
Teach Valgrind memcheck to maintain the "defined-ness" of each shared
buffer based on whether the backend holds at least one pin at the point
it is accessed by access method code.  Bugs like the one fixed by commit
b0229f26 can be detected using this new instrumentation.

Note that backends running with Valgrind naturally have their own
independent ideas about whether any given byte in shared memory is safe
or unsafe to access.  There is no risk that concurrent access by
multiple backends to the same shared memory will confuse Valgrind's
instrumentation, because everything already works at the process level
(or at the memory mapping level, if you prefer).

Author: Álvaro Herrera, Peter Geoghegan
Reviewed-By: Anastasia Lubennikova
Discussion: https://postgr.es/m/20150723195349.GW5596@postgresql.org
Discussion: https://postgr.es/m/CAH2-WzkLgyN3zBvRZ1pkNJThC=xi_0gpWRUb_45eexLH1+k2_Q@mail.gmail.com
2020-07-17 17:49:45 -07:00
Thomas Munro
cdc7169509 Use MinimalTuple for tuple queues.
This representation saves 8 bytes per tuple compared to HeapTuple, and
avoids the need to allocate, copy and free on the receiving side.

Gather can emit the returned MinimalTuple directly, but GatherMerge now
needs to make an explicit copy because it buffers multiple tuples at a
time.  That should be no worse than before.

Reviewed-by: Soumyadeep Chakraborty <soumyadeep2007@gmail.com>
Discussion: https://postgr.es/m/CA%2BhUKG%2B8T_ggoUTAE-U%3DA%2BOcPc4%3DB0nPPHcSfffuQhvXXjML6w%40mail.gmail.com
2020-07-17 15:04:16 +12:00
Thomas Munro
d2bddc2500 Add huge_page_size setting for use on Linux.
This allows the huge page size to be set explicitly.  The default is 0,
meaning it will use the system default, as before.

Author: Odin Ugedal <odin@ugedal.com>
Discussion: https://postgr.es/m/20200608154639.20254-1-odin%40ugedal.com
2020-07-17 14:33:00 +12:00
Michael Paquier
2a10fdc430 Eliminate cache lookup errors in SQL functions for object addresses
When using the following functions, users could see various types of
errors of the type "cache lookup failed for OID XXX" with elog(), that
can only be used for internal errors:
* pg_describe_object()
* pg_identify_object()
* pg_identify_object_as_address()

The set of APIs managing object addresses for all object types are made
smarter by gaining a new argument "missing_ok" that allows any caller to
control if an error is raised or not on an undefined object.  The SQL
functions listed above are changed to handle the case where an object is
missing.

Regression tests are added for all object types for the cases where
these are undefined.  Before this commit, these cases failed with cache
lookup errors, and now they basically return NULL (minus the name of the
object type requested).

Author: Michael Paquier
Reviewed-by: Aleksander Alekseev, Dmitry Dolgov, Daniel Gustafsson,
Álvaro Herrera, Kyotaro Horiguchi
Discussion: https://postgr.es/m/CAB7nPqSZxrSmdHK-rny7z8mi=EAFXJ5J-0RbzDw6aus=wB5azQ@mail.gmail.com
2020-07-15 09:03:10 +09:00
Peter Eisentraut
de8feb1f3a Fix -Wcast-function-type warnings
Three groups of issues needed to be addressed:

load_external_function() and related functions returned PGFunction,
even though not necessarily all callers are looking for a function of
type PGFunction.  Since these functions are really just wrappers
around dlsym(), change to return void * just like dlsym().

In dynahash.c, we are using strlcpy() where a function with a
signature like memcpy() is expected.  This should be safe, as the new
comment there explains, but the cast needs to be augmented to avoid
the warning.

In PL/Python, methods all need to be cast to PyCFunction, per Python
API, but this now runs afoul of these warnings.  (This issue also
exists in core CPython.)

To fix the second and third case, we add a new type pg_funcptr_t that
is defined specifically so that gcc accepts it as a special function
pointer that can be cast to any other function pointer without the
warning.

Also add -Wcast-function-type to the standard warning flags, subject
to configure check.

Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://www.postgresql.org/message-id/flat/1e97628e-6447-b4fd-e230-d109cec2d584%402ndquadrant.com
2020-07-14 19:55:25 +02:00
Michael Paquier
b8401c32ba Fix some header identifications
The following header files missed the shot:
- jsonfuncs.h, as of ce0425b.
- jsonapi.h, as of beb4699.
- llvmjit_emit.h as of 7ec0d80.
- partdesc.h, as of 1bb5e78.

Author: Jesse Zhang
Discussion: https://postgr.es/m/CAGf+fX4-8xULEOz09DE2dZGjT+q8VJ--rqfTpvcFwc+A4fc-3Q@mail.gmail.com
2020-07-14 13:39:45 +09:00
Michael Paquier
9168793d72 Fix comments related to table AMs
Incorrect function names were referenced.  As this fixes some portions
of tableam.h, that is mentioned in the docs as something to look at when
implementing a table AM, backpatch down to 12 where this has been
introduced.

Author: Hironobu Suzuki
Discussion: https://postgr.es/m/8fe6d672-28dd-3f1d-7aed-ac2f6d599d3f@interdb.jp
Backpatch-through: 12
2020-07-14 13:17:11 +09:00
Jeff Davis
2302302236 HashAgg: before spilling tuples, set unneeded columns to NULL.
This is a replacement for 4cad2534. Instead of projecting all tuples
going into a HashAgg, only remove unnecessary attributes when actually
spilling. This avoids the regression for the in-memory case.

Discussion: https://postgr.es/m/a2fb7dfeb4f50aa0a123e42151ee3013933cb802.camel%40j-davis.com
Backpatch-through: 13
2020-07-12 22:59:32 -07:00
Amit Kapila
d973747281 Revert "Track statistics for spilling of changes from ReorderBuffer".
The stats with this commit was available only for WALSenders, however,
users might want to see for backends doing logical decoding via SQL API.
Then, users might want to reset and access these stats across server
restart which was not possible with the current patch.

List of commits reverted:

caa3c4242c   Don't call elog() while holding spinlock.
e641b2a995   Doc: Update the documentation for spilled transaction
statistics.
5883f5fe27   Fix unportable printf format introduced in commit 9290ad198.
9290ad198b   Track statistics for spilling of changes from ReorderBuffer.

Additionaly, remove the release notes entry for this feature.

Backpatch-through: 13, where it was introduced
Discussion: https://postgr.es/m/CA+fd4k5_pPAYRTDrO2PbtTOe0eHQpBvuqmCr8ic39uTNmR49Eg@mail.gmail.com
2020-07-13 08:53:23 +05:30
Michael Paquier
b1e48bbe64 Include replication origins in SQL functions for commit timestamp
This includes two changes:
- Addition of a new function pg_xact_commit_timestamp_origin() able, for
a given transaction ID, to return the commit timestamp and replication
origin of this transaction.  An equivalent function existed in
pglogical.
- Addition of the replication origin to pg_last_committed_xact().

The commit timestamp manager includes already APIs able to return the
replication origin of a transaction on top of its commit timestamp, but
the code paths for replication origins were never stressed as those
functions have never looked for a replication origin, and the SQL
functions available have never included this information since their
introduction in 73c986a.

While on it, refactor a test of modules/commit_ts/ to use tstzrange() to
check that a transaction timestamp is within the wanted range, making
the test a bit easier to read.

Bump catalog version.

Author: Movead Li
Reviewed-by: Madan Kumar, Michael Paquier
Discussion: https://postgr.es/m/2020051116430836450630@highgo.ca
2020-07-12 20:47:15 +09:00
Michael Paquier
cc35d8933a Rename field "relkind" to "objtype" for CTAS and ALTER TABLE nodes
"relkind" normally refers to the char field from pg_class.  However, in
the parse nodes AlterTableStmt and CreateTableAsStmt, "relkind" was used
for a field of type enum ObjectType, that could refer to other object
types than those possible for a relkind.  Such fields being usually
named "objtype", switch the name in both structures to make things more
consistent.  Note that this led to some confusion in functions that
also operate on a RangeTableEntry object, which also has a field named
"relkind".

This naming goes back to commit 09d4e96, where only OBJECT_TABLE and
OBJECT_INDEX were used.  This got extended later to use as well
OBJECT_TYPE with e440e12, not really a relation kind.

Author: Mark Dilger
Reviewed-by: Daniel Gustafsson, Álvaro Herrera, Michael Paquier
Discussion: https://postgr.es/m/609181AE-E399-47C7-9221-856E0F96BF93@enterprisedb.com
2020-07-11 13:32:28 +09:00
Andres Freund
a9a4a7ad56 code: replace most remaining uses of 'master'.
Author: Andres Freund
Reviewed-By: David Steele
Discussion: https://postgr.es/m/20200615182235.x7lch5n6kcjq4aue@alap3.anarazel.de
2020-07-08 13:24:35 -07:00
Andres Freund
e07633646a code: replace 'master' with 'leader' where appropriate.
Leader already is the more widely used terminology, but a few places
didn't get the message.

Author: Andres Freund
Reviewed-By: David Steele
Discussion: https://postgr.es/m/20200615182235.x7lch5n6kcjq4aue@alap3.anarazel.de
2020-07-08 12:58:32 -07:00
Andres Freund
5e7bbb5286 code: replace 'master' with 'primary' where appropriate.
Also changed "in the primary" to "on the primary", and added a few
"the" before "primary".

Author: Andres Freund
Reviewed-By: David Steele
Discussion: https://postgr.es/m/20200615182235.x7lch5n6kcjq4aue@alap3.anarazel.de
2020-07-08 12:57:23 -07:00
Tom Lane
f3faf35f37 Don't create pg_type entries for sequences or toast tables.
Commit f7f70d5e2 left one inconsistency behind: we're still creating
pg_type entries for the composite types of sequences and toast tables,
but not arrays over those composites.  But there seems precious little
reason to have named composite types for toast tables, and not much more
to have them for sequences (especially given the thought that sequences
may someday not be standalone relations at all).

So, let's close that inconsistency by removing these composite types,
rather than adding arrays for them.  This buys back a little bit of
the initial pg_type bloat added by the previous patch, and could be
a significant savings in a large database with many toast tables.

Aside from a small logic rearrangement in heap_create_with_catalog,
this patch mostly needs to clean up some places that were assuming that
pg_class.reltype always has a valid value.  Those are really pre-existing
bugs, given that it's documented otherwise; notably, the plpgsql changes
fix code that gives "cache lookup failed for type 0" on indexes today.
But none of these seem interesting enough to back-patch.

Also, remove the pg_dump/pg_upgrade infrastructure for propagating
a toast table's pg_type OID into the new database, since we no longer
need that.

Discussion: https://postgr.es/m/761F1389-C6A8-4C15-80CE-950C961F5341@gmail.com
2020-07-07 15:43:22 -04:00
Alvaro Herrera
a8aaa0c786
Morph pg_replication_slots.min_safe_lsn to safe_wal_size
The previous definition of the column was almost universally disliked,
so provide this updated definition which is more useful for monitoring
purposes: a large positive value is good, while zero or a negative value
means danger.  This should be operationally more convenient.

Backpatch to 13, where the new column to pg_replication_slots (and the
feature it represents) were added.

Author: Kyotaro Horiguchi <horikyota.ntt@gmail.com>
Author: Álvaro Herrera <alvherre@alvh.no-ip.org>
Reported-by: Fujii Masao <masao.fujii@oss.nttdata.com>
Discussion: https://postgr.es/m/9ddfbf8c-2f67-904d-44ed-cf8bc5916228@oss.nttdata.com
2020-07-07 13:08:00 -04:00
Tom Lane
f7f70d5e22 Create composite array types for initdb-created relations.
When we invented arrays of composite types (commit bc8036fc6),
we excluded system catalogs, basically just on the grounds of not
wanting to bloat pg_type.  However, it's definitely inconsistent that
catalogs' composite types can't be put into arrays when others can.
Another problem is that the exclusion is done by checking
IsUnderPostmaster in heap_create_with_catalog, which means that

(1) If a user tries to create a table in single-user mode, it doesn't
get an array type.  That's bad in itself, plus it breaks pg_upgrade.

(2) If someone drops and recreates a system view or information_schema
view (as we occasionally recommend doing), it will now have an array
type where it did not before, making for still more inconsistency.

So this is all pretty messy.  Let's just get rid of the inconsistency
and decree that system-created relations should have array types if
similar user-created ones would, i.e. it only depends on the relkind.
As of HEAD, that means that the initial contents of pg_type grow from
411 rows to 605, which is a lot of growth percentage-wise, but it's
still quite a small catalog compared to others.

Wenjing Zeng, reviewed by Shawn Wang, further hacking by me

Discussion: https://postgr.es/m/761F1389-C6A8-4C15-80CE-950C961F5341@gmail.com
2020-07-06 14:21:16 -04:00
Michael Paquier
aa38434824 Refactor routines for name lookups of procedures and operators
This introduces a new set of extended routines for procedure and
operator name lookups, with a flag bitmask argument that can modify the
result.  The following options are available:
- Force schema qualification, ignoring search_path.  This is similar to
the existing option for format_{operator|procedure}_qualified().
- Force NULL as result instead of a numeric OID for an undefined
object.  This option is new.

This is a refactoring similar to 1185c78, that will be used for a future
patch to improve the SQL functions providing information using object
addresses for undefined objects.

Author: Michael Paquier
Reviewed-by: Aleksander Alekseev, Dmitry Dolgov, Daniel Gustafsson,
Álvaro Herrera
Discussion: https://postgr.es/m/CAB7nPqSZxrSmdHK-rny7z8mi=EAFXJ5J-0RbzDw6aus=wB5azQ@mail.gmail.com
2020-07-06 13:06:08 +09:00
Michael Paquier
1185c78294 Add new flag to format_type_extended() to get NULL for undefined type
If a type scanned is undefined, type format routines have two behaviors
depending on if FORMAT_TYPE_ALLOW_INVALID is used by the caller or not:
- Issue a cache lookup error
- Return an undefined type name "???", "???[]" or "-"

The current interface is not really helpful for callers willing to
format properly a type name, but still make sure that the type is
defined as there could be types matching the strings generated when
looking for an undefined type, even if that should not be a problem in
practice.  In order to counter that, add a new flag called
FORMAT_TYPE_INVALID_AS_NULL that returns a NULL result instead of "???
or "-" which does not generate an error.  This flag will be used in a
follow-up patch improving the set of SQL functions showing information
for object addresses when it comes to undefined objects.

Author: Michael Paquier
Reviewed-by: Aleksander Alekseev, Dmitry Dolgov, Daniel Gustafsson,
Álvaro Herrera
Discussion: https://postgr.es/m/CAB7nPqSZxrSmdHK-rny7z8mi=EAFXJ5J-0RbzDw6aus=wB5azQ@mail.gmail.com
2020-07-06 12:12:11 +09:00
Peter Eisentraut
e61225ffab Rename enable_incrementalsort for clarity
Author: James Coleman <jtc331@gmail.com>
Discussion: https://www.postgresql.org/message-id/flat/df652910-e985-9547-152c-9d4357dc3979%402ndquadrant.com
2020-07-05 11:43:08 +02:00
Michael Paquier
641dd167a3 Move description of libpqwalreceiver hooks out of the replication's README
src/backend/replication/README includes since 32bc08b a basic
description of the WAL receiver hooks available in walreceiver.h for a
module like libpqwalreceiver, but the README has never been updated to
reflect changes done to the hooks, so the contents of the README have
rotten with the time.  This commit moves the description from the README
to walreceiver.h, where it will be hard to miss that a description
update or addition is needed depending on the modifications done to the
hooks.

Each hook now includes a description of what it does in walreceiver.h,
and the replication's README mentions walreceiver.h.

Thanks also to Amit Kapila for the discussion.

Author: Michael Paquier
Reviewed-by: Peter Eisentraut
Discussion: https://postgr.es/m/20200502024606.GA471944@paquier.xyz
2020-07-02 13:57:03 +09:00
Fujii Masao
9bae7e4cde Add +(pg_lsn,numeric) and -(pg_lsn,numeric) operators.
By using these operators, the number of bytes can be added into and
subtracted from LSN.

Bump catalog version.

Author: Fujii Masao
Reviewed-by: Kyotaro Horiguchi, Michael Paquier, Asif Rehman
Discussion: https://postgr.es/m/ed9f7f74-e996-67f8-554a-52ebd3779b3b@oss.nttdata.com
2020-06-30 23:55:07 +09:00
Tom Lane
b63dd3d88f Add hints about protocol-version-related SSL connection failures.
OpenSSL's native reports about problems related to protocol version
restrictions are pretty opaque and inconsistent.  When we get an
SSL error that is plausibly due to this, emit a hint message that
includes the range of SSL protocol versions we (think we) are
allowing.  This should at least get the user thinking in the right
direction to resolve the problem, even if the hint isn't totally
accurate, which it might not be for assorted reasons.

Back-patch to v13 where we increased the default minimum protocol
version, thereby increasing the risk of this class of failure.

Patch by me, reviewed by Daniel Gustafsson

Discussion: https://postgr.es/m/a9408304-4381-a5af-d259-e55d349ae4ce@2ndquadrant.com
2020-06-27 12:47:58 -04:00
Peter Geoghegan
10f1ab2cb8 Fix misuse of table_index_fetch_tuple_check().
Commit 0d861bbb, which added deduplication to nbtree, had
_bt_check_unique() pass a TID to table_index_fetch_tuple_check() that
isn't safe to mutate.  table_index_fetch_tuple_check()'s tid argument is
modified when the TID in question is not the latest visible tuple in a
hot chain, though this wasn't documented.

To fix, go back to using a local copy of the TID in _bt_check_unique(),
and update comments above table_index_fetch_tuple_check().

Backpatch: 13-, where B-Tree deduplication was introduced.
2020-06-25 10:55:28 -07:00