Commit graph

11286 commits

Author SHA1 Message Date
Michael Paquier
943f7ae1c8 Add lookup table for replication slot conflict reasons
This commit switches the handling of the conflict cause strings for
replication slots to use a table rather than being explicitly listed,
using a C99-designated initializer syntax for the array elements.  This
makes the whole more readable while easing future maintenance with less
areas to update when adding a new conflict reason.

This is similar to 74a7306310, but the scale of the change is smaller
as there are less conflict causes than LWLock builtin tranche names.

Author: Bharath Rupireddy
Reviewed-by: Jelte Fennema-Nio
Discussion: https://postgr.es/m/CALj2ACUxSLA91QGFrJsWNKs58KXb1C03mbuwKmzqqmoAKLwJaw@mail.gmail.com
2024-02-22 08:40:40 +09:00
Heikki Linnakangas
28f3915b73 Remove superfluous 'pgprocno' field from PGPROC
It was always just the index of the PGPROC entry from the beginning of
the proc array. Introduce a macro to compute it from the pointer
instead.

Reviewed-by: Andres Freund
Discussion: https://www.postgresql.org/message-id/8171f1aa-496f-46a6-afc3-c46fe7a9b407@iki.fi
2024-02-22 01:21:34 +02:00
Peter Eisentraut
74563f6b90 Revert "Improve compression and storage support with inheritance"
This reverts commit 0413a55699.

pg_dump cannot currently dump all the structures that are allowed by
this patch.  This needs more work in pg_dump and more test coverage.

Discussion: https://www.postgresql.org/message-id/flat/24656cec-d6ef-4d15-8b5b-e8dfc9c833a7@eisentraut.org
2024-02-20 11:10:59 +01:00
Nathan Bossart
3b42bdb471 Use new overflow-safe integer comparison functions.
Commit 6b80394781 introduced integer comparison functions designed
to be as efficient as possible while avoiding overflow.  This
commit makes use of these functions in many of the in-tree qsort()
comparators to help ensure transitivity.  Many of these comparator
functions should also see a small performance boost.

Author: Mats Kindahl
Reviewed-by: Andres Freund, Fabrízio de Royes Mello
Discussion: https://postgr.es/m/CA%2B14426g2Wa9QuUpmakwPxXFWG_1FaY0AsApkvcTBy-YfS6uaw%40mail.gmail.com
2024-02-16 14:05:36 -06:00
Nathan Bossart
6b80394781 Introduce overflow-safe integer comparison functions.
This commit adds integer comparison functions that are designed to
be as efficient as possible while avoiding overflow.  A follow-up
commit will make use of these functions in many of the in-tree
qsort() comparators.  The new functions are not better in all cases
(e.g., when the comparator function is inlined), so it is important
to consider the context before using them.

Author: Mats Kindahl
Reviewed-by: Tom Lane, Heikki Linnakangas, Andres Freund, Thomas Munro, Andrey Borodin, Fabrízio de Royes Mello
Discussion: https://postgr.es/m/CA%2B14426g2Wa9QuUpmakwPxXFWG_1FaY0AsApkvcTBy-YfS6uaw%40mail.gmail.com
2024-02-16 13:37:02 -06:00
Nathan Bossart
5497daf3aa Replace calls to pg_qsort() with the qsort() macro.
Calls to this function might give the impression that pg_qsort()
is somehow different than qsort(), when in fact there is a qsort()
macro in port.h that expands all in-tree uses to pg_qsort().

Reviewed-by: Mats Kindahl
Discussion: https://postgr.es/m/CA%2B14426g2Wa9QuUpmakwPxXFWG_1FaY0AsApkvcTBy-YfS6uaw%40mail.gmail.com
2024-02-16 11:37:50 -06:00
Peter Eisentraut
0413a55699 Improve compression and storage support with inheritance
A child table can specify a compression or storage method different
from its parents.  This was previously an error.  (But this was
inconsistently enforced because for example the settings could be
changed later using ALTER TABLE.)  This now also allows an explicit
override if multiple parents have different compression or storage
settings, which was previously an error that could not be overridden.

The compression and storage properties remains unchanged in a child
inheriting from parent(s) after its creation, i.e., when using ALTER
TABLE ...  INHERIT.  (This is not changed.)

Before this change, the error detail would mention the first pair of
conflicting parent compression or storage methods.  But with this
change it waits till the child specification is considered by which
time we may have encountered many such conflicting pairs.  Hence the
error detail after this change does not include the conflicting
compression/storage methods.  Those can be obtained from parent
definitions if necessary.  The code to maintain list of all
conflicting methods or even the first conflicting pair does not seem
worth the convenience it offers.  This change is inline with what we
do with conflicting default values.

Before this commit, the specified storage method could be stored in
ColumnDef::storage (CREATE TABLE ... LIKE) or ColumnDef::storage_name
(CREATE TABLE ...).  This caused the MergeChildAttribute() and
MergeInheritedAttribute() to ignore a storage method specified in the
child definition since it looked only at ColumnDef::storage.  This
commit removes ColumnDef::storage and instead uses
ColumnDef::storage_name to save any storage method specification. This
is similar to how compression method specification is handled.

Author: Ashutosh Bapat <ashutosh.bapat.oss@gmail.com>
Discussion: https://www.postgresql.org/message-id/flat/24656cec-d6ef-4d15-8b5b-e8dfc9c833a7@eisentraut.org
2024-02-16 13:27:46 +01:00
Alexander Korotkov
51efe38cb9 Introduce transaction_timeout
This commit adds timeout that is expected to be used as a prevention
of long-running queries. Any session within the transaction will be
terminated after spanning longer than this timeout.

However, this timeout is not applied to prepared transactions.
Only transactions with user connections are affected.

Discussion: https://postgr.es/m/CAAhFRxiQsRs2Eq5kCo9nXE3HTugsAAJdSQSmxncivebAxdmBjQ%40mail.gmail.com
Author: Andrey Borodin <amborodin@acm.org>
Author: Japin Li <japinli@hotmail.com>
Author: Junwang Zhao <zhjwpku@gmail.com>
Reviewed-by: Nikolay Samokhvalov <samokhvalov@gmail.com>
Reviewed-by: Andres Freund <andres@anarazel.de>
Reviewed-by: Fujii Masao <masao.fujii@oss.nttdata.com>
Reviewed-by: bt23nguyent <bt23nguyent@oss.nttdata.com>
Reviewed-by: Yuhang Qiu <iamqyh@gmail.com>
2024-02-15 23:56:12 +02:00
Nathan Bossart
8fd0498de2 Remove obsolete check in SIGTERM handler for the startup process.
Thanks to commit 3b00fdba9f, this check in the SIGTERM handler for
the startup process is now obsolete and can be removed.  Instead of
leaving around the dead function write_stderr_signal_safe(), I've
opted to just remove it for now.

This partially reverts commit 97550c0711.

Reviewed-by: Andres Freund, Noah Misch
Discussion: https://postgr.es/m/20231121212008.GA3742740%40nathanxps13
2024-02-14 17:09:31 -06:00
Nathan Bossart
8d8afd48d3 Allow pg_monitor to execute pg_current_logfile().
We allow roles with privileges of pg_monitor to execute functions
like pg_ls_logdir(), so it seems natural that such roles would also
be able to execute this function.

Bumps catversion.

Co-authored-by: Pavlo Golub
Reviewed-by: Daniel Gustafsson
Discussion: https://postgr.es/m/CAK7ymcLmEYWyQkiCZ64WC-HCzXAB0omM%3DYpj9B3rXe8vUAFMqw%40mail.gmail.com
2024-02-14 11:48:29 -06:00
Amit Kapila
ddd5f4f54a Add a slot synchronization function.
This commit introduces a new SQL function pg_sync_replication_slots()
which is used to synchronize the logical replication slots from the
primary server to the physical standby so that logical replication can be
resumed after a failover or planned switchover.

A new 'synced' flag is introduced in pg_replication_slots view, indicating
whether the slot has been synchronized from the primary server. On a
standby, synced slots cannot be dropped or consumed, and any attempt to
perform logical decoding on them will result in an error.

The logical replication slots on the primary can be synchronized to the
hot standby by using the 'failover' parameter of
pg-create-logical-replication-slot(), or by using the 'failover' option of
CREATE SUBSCRIPTION during slot creation, and then calling
pg_sync_replication_slots() on standby. For the synchronization to work,
it is mandatory to have a physical replication slot between the primary
and the standby aka 'primary_slot_name' should be configured on the
standby, and 'hot_standby_feedback' must be enabled on the standby. It is
also necessary to specify a valid 'dbname' in the 'primary_conninfo'.

If a logical slot is invalidated on the primary, then that slot on the
standby is also invalidated.

If a logical slot on the primary is valid but is invalidated on the
standby, then that slot is dropped but will be recreated on the standby in
the next pg_sync_replication_slots() call provided the slot still exists
on the primary server. It is okay to recreate such slots as long as these
are not consumable on standby (which is the case currently). This
situation may occur due to the following reasons:
- The 'max_slot_wal_keep_size' on the standby is insufficient to retain
WAL records from the restart_lsn of the slot.
- 'primary_slot_name' is temporarily reset to null and the physical slot
is removed.

The slot synchronization status on the standby can be monitored using the
'synced' column of pg_replication_slots view.

A functionality to automatically synchronize slots by a background worker
and allow logical walsenders to wait for the physical will be done in
subsequent commits.

Author: Hou Zhijie, Shveta Malik, Ajin Cherian based on an earlier version by Peter Eisentraut
Reviewed-by: Masahiko Sawada, Bertrand Drouvot, Peter Smith, Dilip Kumar, Nisha Moond, Kuroda Hayato, Amit Kapila
Discussion: https://postgr.es/m/514f6f2f-6833-4539-39f1-96cd1e011f23@enterprisedb.com
2024-02-14 09:45:36 +05:30
Michael Paquier
06bd311bce Revert "Refactor CopyReadAttributes{CSV,Text}() to use a callback in COPY FROM"
This reverts commit 95fb5b4902, for reasons similar to what led to
1aa8324b81.  In this case, the callback was called once per row, which
is less worse than the previous callback introduced for COPY TO called
once per argument for each row, still the patch set discussed to plug in
custom routines to the COPY paths would be able to know which subroutine
to use depending on its CopyFromState, so this led to a suboptimal
approach at the end.

For now, this part is reverted to consider better which approach to use.

Discussion: https://postgr.es/m/20240206014125.qofww7ew3dx3v3uk@awork3.anarazel.de
2024-02-14 10:07:22 +09:00
Jeff Davis
91f2cae7a4 Read WAL directly from WAL buffers.
If available, read directly from WAL buffers, avoiding the need to go
through the filesystem. Only for physical replication for now, but can
be expanded to other callers.

In preparation for replicating unflushed WAL data.

Author: Bharath Rupireddy
Discussion: https://postgr.es/m/CALj2ACXKKK%3DwbiG5_t6dGao5GoecMwRkhr7GjVBM_jg54%2BNa%3DQ%40mail.gmail.com
Reviewed-by: Andres Freund, Alvaro Herrera, Nathan Bossart, Dilip Kumar, Nitin Jadhav, Melih Mutlu, Kyotaro Horiguchi
2024-02-12 11:11:22 -08:00
Thomas Munro
65f438471b Fix gai_strerror() thread-safety on Windows.
Commit 5579388d removed code that supplied a fallback implementation of
getaddrinfo(), which was dead code on modern systems.  One tiny piece of
the removed code was still doing something useful on Windows, though:
that OS's own gai_strerror()/gai_strerrorA() function returns a pointer
to a static buffer that it overwrites each time, so it's not
thread-safe.  In rare circumstances, a multi-threaded client program
could get an incorrect or corrupted error message.

Restore the replacement gai_strerror() function, though now that it's
only for Windows we can put it into a win32-specific file and cut it
down to the errors that Windows documents.  The error messages here are
taken from FreeBSD, because Windows' own messages seemed too verbose.

Back-patch to 16.

Reviewed-by: Kyotaro Horiguchi <horikyota.ntt@gmail.com>
Discussion: https://postgr.es/m/CA%2BhUKGKz%2BF9d2PTiXwfYV7qJw%2BWg2jzACgSDgPizUw7UG%3Di58A%40mail.gmail.com
2024-02-12 11:14:21 +13:00
Daniel Gustafsson
5c7038d70b Refactor pipe_read_line to return the full line
Commit 5b2f4afffe refactored find_other_exec() and in the process
created pipe_read_line() into a static routine for reading a single
line of output, aimed at reading version numbers.  Commit a7e8ece41
later exposed it externally in order to read a postgresql.conf GUC
using "postgres -C ..".  Further, f06b1c598 also made use of it for
reading a version string much like find_other_exec().  The internal
variable remained "pgver", even when used for other purposes.

Since the function requires passing a buffer and its size, and at
most size - 1 bytes will be read via fgets(), there is a truncation
risk when using this for reading GUCs (like how pg_rewind does,
though the risk in this case is marginal).

To keep this as generic functionality for reading a line from a pipe,
this refactors pipe_read_line() into returning an allocated buffer
containing all of the line to remove the risk of silent truncation.

Reviewed-by: Heikki Linnakangas <hlinnaka@iki.fi>
Reviewed-by: Álvaro Herrera <alvherre@alvh.no-ip.org>
Reviewed-by: John Naylor <johncnaylorls@gmail.com>
Discussion: https://postgr.es/m/DEDF73CE-D528-49A3-9089-B3592FD671A9@yesql.se
2024-02-09 15:03:16 +01:00
John Naylor
2579985086 Fix warnings in cpluspluscheck
Various int variables were compared to macros that are of type size_t,
which caused -Wsign-compare warnings in cpluspluscheck.  Change those
to size_t, which also better describes their purpose.

Per report from Peter Eisentraut

Discussion: https://postgr.es/m/486847dc-6de5-464a-938e-bac98ec2438b%40eisentraut.org
2024-02-08 10:07:26 +07:00
Alvaro Herrera
d172b717c6
Use atomic access for SlruShared->latest_page_number
The new concurrency model proposed for slru.c to improve performance
does not include any single lock that would coordinate processes
doing concurrent reads/writes on SlruShared->latest_page_number.
We can instead use atomic reads and writes for that variable.

Author: Dilip Kumar <dilipbalaut@gmail.com>
Reviewed-by: Andrey M. Borodin <x4mmm@yandex-team.ru>
Discussion: https://postgr.es/m/CAFiTN-vzDvNz=ExGXz6gdyjtzGixKSqs0mKHMmaQ8sOSEFZ33A@mail.gmail.com
2024-02-06 10:54:10 +01:00
John Naylor
b83033c3cf Further cosmetic review of hashfn_unstable.h
In follow-up to e97b672c8,
* Flesh out comments explaining the incremental interface
* Clarify detection of zero bytes when hashing aligned C strings

The latter was suggested and reviewed by Jeff Davis

Discussion: https://postgr.es/m/48e8f8bbe0be9c789f98776c7438244ab7a7cc63.camel%40j-davis.com
2024-02-06 14:49:06 +07:00
John Naylor
9ed3ee5001 Simplify initialization of incremental hash state
The standalone functions fasthash{32,64} use length for two purposes:
how many bytes to hash, and how to perturb the internal seed.

Developers using the incremental interface may not know the length
ahead of time (e.g. for C strings). In this case, it's advised to
pass length to the finalizer, but initialization still needed some
length up front, in the form of a placeholder macro.

Separate the concerns by having the standalone functions perturb the
internal seed themselves from their own length parameter, allowing
to remove "len" from fasthash_init(), as well as the placeholder macro.

Discussion: https://postgr.es/m/CANWCAZbTUk2LOyhsFo33gjLyLAHZ7ucXCi5K9u%3D%2BPtnTShDKtw%40mail.gmail.com
2024-02-06 14:39:36 +07:00
Peter Eisentraut
1ae5ace755 Fix meson installation of new generated files
Fix for 9b1a6f50b9: We want to install catalog/syscache_ids.h but not
catalog/syscache_info.h.  The meson code has this backwards.  The
makefiles are ok.

Reported-by: Aleksander Alekseev <aleksander@timescale.com>
Discussion: https://www.postgresql.org/message-id/CAJ7c6TMDGmAiozDjJQ3%3DP3cd-1BidC_GpitcAuU0aqq-r1eSoQ%40mail.gmail.com
2024-02-05 15:45:29 +01:00
Amit Kapila
dafbfed9ef Enhance libpqrcv APIs to support slot synchronization.
This patch provides support for regular (non-replication) connections in
libpqrcv_connect(). This can be used to execute SQL statements on the
primary server without starting a walsender.

A new API libpqrcv_get_dbname_from_conninfo() is also added to extract the
database name from the given connection-info.

Note that this patch doesn't change any existing functionality but later
patches implementing the slot synchronization will use this functionality
to connect to the primary server to fetch required slot information.

Author: Shveta Malik, Hou Zhijie, Ajin Cherian
Reviewed-by: Peter Smith, Bertrand Drouvot, Dilip Kumar, Masahiko Sawada, Nisha Moond, Kuroda Hayato, Amit Kapila
Discussion: https://postgr.es/m/514f6f2f-6833-4539-39f1-96cd1e011f23@enterprisedb.com
2024-02-05 10:54:06 +05:30
Michael Paquier
95fb5b4902 Refactor CopyReadAttributes{CSV,Text}() to use a callback in COPY FROM
CopyReadAttributes{CSV,Text}() are used to parse lines for text and CSV
format.  This reduces the number of "if" branches that need to be
checked when parsing fields in CSV and text mode when dealing with a
COPY FROM, something that can become more noticeable with more
attributes and more lines to process.

Extracted from a larger patch by the same author.

Author: Sutou Kouhei
Discussion: https://postgr.es/m/20231204.153548.2126325458835528809.kou@clear-code.com
2024-02-05 09:46:02 +09:00
Heikki Linnakangas
21d9c3ee4e Give SMgrRelation pointers a well-defined lifetime.
After calling smgropen(), it was not clear how long you could continue
to use the result, because various code paths including cache
invalidation could call smgrclose(), which freed the memory.

Guarantee that the object won't be destroyed until the end of the
current transaction, or in recovery, the commit/abort record that
destroys the underlying storage.

smgrclose() is now just an alias for smgrrelease(). It closes files
and forgets all state except the rlocator, but keeps the SMgrRelation
object valid.

A new smgrdestroy() function is used by rare places that know there
should be no other references to the SMgrRelation.

The short version:

 * smgrclose() is now just an alias for smgrrelease(). It releases
   resources, but doesn't destroy until EOX
 * smgrdestroy() now frees memory, and should rarely be used.

Existing code should be unaffected, but it is now possible for code that
has an SMgrRelation object to use it repeatedly during a transaction as
long as the storage hasn't been physically dropped.  Such code would
normally hold a lock on the relation.

This also replaces the "ownership" mechanism of SMgrRelations with a
pin counter.  An SMgrRelation can now be "pinned", which prevents it
from being destroyed at end of transaction.  There can be multiple pins
on the same SMgrRelation.  In practice, the pin mechanism is only used
by the relcache, so there cannot be more than one pin on the same
SMgrRelation.  Except with swap_relation_files XXX

Author: Thomas Munro, Heikki Linnakangas
Reviewed-by: Robert Haas <robertmhaas@gmail.com>
Discussion: https://www.postgresql.org/message-id/CA%2BhUKGJ8NTvqLHz6dqbQnt2c8XCki4r2QvXjBQcXpVwxTY_pvA@mail.gmail.com
2024-01-31 12:31:02 +02:00
Amit Kapila
776621a5e4 Add a failover option to subscriptions.
This commit introduces a new subscription option named 'failover', which
provides users with the ability to set the failover property of the
replication slot on the publisher when creating or altering a
subscription.

This uses the replication commands introduced by commit 7329240437 to
enable the failover option for a logical replication slot.

If the failover option is set to true, the associated replication slots
(i.e. the main slot and the table sync slots) in the upstream database are
enabled to be synchronized to the standbys. Note that the capability to
sync the replication slots will be added in subsequent commits.

Thanks to Masahiko Sawada for the design inputs.

Author: Shveta Malik, Hou Zhijie, Ajin Cherian
Reviewed-by: Peter Smith, Bertrand Drouvot, Dilip Kumar, Masahiko Sawada, Nisha Moond, Kuroda Hayato, Amit Kapila
Discussion: https://postgr.es/m/514f6f2f-6833-4539-39f1-96cd1e011f23@enterprisedb.com
2024-01-30 16:49:28 +05:30
Nathan Bossart
97287bdfae Move is_valid_ascii() to ascii.h.
This function requires simd.h, which is a rather large dependency
for a widely-used header file like pg_wchar.h.  Furthermore, there
is a report of a third-party tool that is struggling to use
pg_wchar.h due to its dependence on simd.h (presumably because
simd.h uses several intrinsics).  Moving the function to the much
less popular ascii.h resolves these issues for now.

This commit is back-patched for the benefit of the aforementioned
third-party tool.  The simd.h dependency was only added in v16,
but we've opted to back-patch to v15 so that is_valid_ascii() lives
in the same file for all versions where it exists.  This could
break existing third-party code that uses the function, but we
couldn't find any examples of such code.  It should be possible to
fix any code that this commit breaks by including ascii.h in the
file that uses is_valid_ascii().

Author: Jubilee Young
Reviewed-by: Tom Lane, John Naylor, Andres Freund, Eric Ridge
Discussion: https://postgr.es/m/CAPNHn3oKJJxMsYq%2BqLYzVJOFrUcOr4OF1EC-KtFT-qh8nOOOtQ%40mail.gmail.com
Backpatch-through: 15
2024-01-29 12:08:57 -06:00
Alvaro Herrera
5de890e361
Add EXPLAIN (MEMORY) to report planner memory consumption
This adds a new "Memory:" line under the "Planning:" group (which
currently only has "Buffers:") when the MEMORY option is specified.

In order to make the reporting reasonably accurate, we create a separate
memory context for planner activities, to be used only when this option
is given.  The total amount of memory allocated by that context is
reported as "allocated"; we subtract memory in the context's freelists
from that and report that result as "used".  We use
MemoryContextStatsInternal() to obtain the quantities.

The code structure to show buffer usage during planning was not in
amazing shape, so I (Álvaro) modified the patch a bit to clean that up
in passing.

Author: Ashutosh Bapat
Reviewed-by: David Rowley, Andrey Lepikhov, Jian He, Andy Fan
Discussion: https://www.postgresql.org/message-id/CAExHW5sZA=5LJ_ZPpRO-w09ck8z9p7eaYAqq3Ks9GDfhrxeWBw@mail.gmail.com
2024-01-29 17:53:03 +01:00
Amit Kapila
7329240437 Allow setting failover property in the replication command.
This commit implements a new replication command called
ALTER_REPLICATION_SLOT and a corresponding walreceiver API function named
walrcv_alter_slot. Additionally, the CREATE_REPLICATION_SLOT command has
been extended to support the failover option.

These new additions allow the modification of the failover property of a
replication slot on the publisher. A subsequent commit will make use of
these commands in subscription commands and will add the tests as well to
cover the functionality added/changed by this commit.

Author: Hou Zhijie, Shveta Malik
Reviewed-by: Peter Smith, Bertrand Drouvot, Dilip Kumar, Masahiko Sawada, Nisha Moond, Kuroda, Hayato, Amit Kapila
Discussion: https://postgr.es/m/514f6f2f-6833-4539-39f1-96cd1e011f23@enterprisedb.com
2024-01-29 09:37:23 +05:30
Masahiko Sawada
08e6344fd6 Remove ReorderBufferTupleBuf structure.
Since commit a4ccc1cef, the 'node' and 'alloc_tuple_size' fields of
the ReorderBufferTupleBuf structure are no longer used. This leaves
only the 'tuple' field in the structure. Since keeping a single-field
structure makes little sense, the ReorderBufferTupleBuf is removed
entirely. The code is refactored accordingly.

No back-patching since these are ABI changes in an exposed structure
and functions, and there would be some risk of breaking extensions.

Author: Aleksander Alekseev
Reviewed-by: Amit Kapila, Masahiko Sawada, Reid Thompson
Discussion: https://postgr.es/m/CAD21AoCvnuxiXXfRecp7g9+CeC35POQfhuQeJFr7_9u_Q5jc_Q@mail.gmail.com
2024-01-29 10:37:16 +09:00
Tom Lane
8ba6fdf905 Support TZ and OF format codes in to_timestamp().
Formerly, these were only supported in to_char(), but there seems
little reason for that restriction.  We should at least have enough
support to permit round-tripping the output of to_char().

In that spirit, TZ accepts either zone abbreviations or numeric
(HH or HH:MM) offsets, which are the cases that to_char() can output.
In an ideal world we'd make it take full zone names too, but
that seems like it'd introduce an unreasonable amount of ambiguity,
since the rules for POSIX-spec zone names are so lax.

OF is a subset of this, accepting only HH or HH:MM.

One small benefit of this improvement is that we can simplify
jsonpath's executeDateTimeMethod function, which no longer needs
to consider the HH and HH:MM cases separately.  Moreover, letting
it accept zone abbreviations means it will accept "Z" to mean UTC,
which is emitted by JSON.stringify() for example.

Patch by me, reviewed by Aleksander Alekseev and Daniel Gustafsson

Discussion: https://postgr.es/m/1681086.1686673242@sss.pgh.pa.us
2024-01-25 17:47:08 -05:00
Andrew Dunstan
66ea94e8e6 Implement various jsonpath methods
This commit implements ithe jsonpath .bigint(), .boolean(),
.date(), .decimal([precision [, scale]]), .integer(), .number(),
.string(), .time(), .time_tz(), .timestamp(), and .timestamp_tz()
methods.

.bigint() converts the given JSON string or a numeric value to
the bigint type representation.

.boolean() converts the given JSON string, numeric, or boolean
value to the boolean type representation.  In the numeric case, only
integers are allowed. We use the parse_bool() backend function
to convert a string to a bool.

.decimal([precision [, scale]]) converts the given JSON string
or a numeric value to the numeric type representation.  If precision
and scale are provided for .decimal(), then it is converted to the
equivalent numeric typmod and applied to the numeric number.

.integer() and .number() convert the given JSON string or a
numeric value to the int4 and numeric type representation.

.string() uses the datatype's output function to convert numeric
and various date/time types to the string representation.

The JSON string representing a valid date/time is converted to the
specific date or time type representation using jsonpath .date(),
.time(), .time_tz(), .timestamp(), .timestamp_tz() methods.  The
changes use the infrastructure of the .datetime() method and perform
the datatype conversion as appropriate.  Unlike the .datetime()
method, none of these methods accept a format template and use ISO
DateTime format instead.  However, except for .date(), the
date/time related methods take an optional precision to adjust the
fractional seconds.

Jeevan Chalke, reviewed by Peter Eisentraut and Andrew Dunstan.
2024-01-25 10:15:43 -05:00
Peter Eisentraut
924d046dcf Add a const decoration
Useful for a subsequent patch.

Discussion: https://www.postgresql.org/message-id/flat/52a125e4-ff9a-95f5-9f61-b87cf447e4da@eisentraut.org
2024-01-25 13:34:49 +01:00
Alvaro Herrera
55627ba2d3
Remove dummy_spinlock
It's been unused since 1b468a131b (2015).
2024-01-25 11:43:47 +01:00
Amit Kapila
c393308b69 Allow to enable failover property for replication slots via SQL API.
This commit adds the failover property to the replication slot. The
failover property indicates whether the slot will be synced to the standby
servers, enabling the resumption of corresponding logical replication
after failover. But note that this commit does not yet include the
capability to sync the replication slot; the subsequent commits will add
that capability.

A new optional parameter 'failover' is added to the
pg_create_logical_replication_slot() function. We will also enable to set
'failover' option for slots via the subscription commands in the
subsequent commits.

The value of the 'failover' flag is displayed as part of
pg_replication_slots view.

Author: Hou Zhijie, Shveta Malik, Ajin Cherian
Reviewed-by: Peter Smith, Bertrand Drouvot, Dilip Kumar, Masahiko Sawada, Nisha Moond, Kuroda, Hayato, Amit Kapila
Discussion: https://postgr.es/m/514f6f2f-6833-4539-39f1-96cd1e011f23@enterprisedb.com
2024-01-25 12:15:46 +05:30
Thomas Munro
820b5af73d jit: Require at least LLVM 10.
Remove support for older LLVM versions.  The default on common software
distributions will be at least LLVM 10 when PostgreSQL 17 ships.

Reviewed-by: Peter Eisentraut <peter@eisentraut.org>
Discussion: https://postgr.es/m/CA%2BhUKGLhNs5geZaVNj2EJ79Dx9W8fyWUU3HxcpZy55sMGcY%3DiA%40mail.gmail.com
2024-01-25 15:42:34 +13:00
Masahiko Sawada
729439607a Add progress reporting of skipped tuples during COPY FROM.
9e2d870119 enabled the COPY command to skip malformed data, however
there was no visibility into how many tuples were actually skipped
during the COPY FROM.

This commit adds a new "tuples_skipped" column to
pg_stat_progress_copy view to report the number of tuples that were
skipped because they contain malformed data.

Bump catalog version.

Author: Atsushi Torikoshi
Reviewed-by: Masahiko Sawada
Discussion: https://postgr.es/m/d12fd8c99adcae2744212cb23feff6ed%40oss.nttdata.com
2024-01-25 10:57:41 +09:00
Peter Eisentraut
46a0cd4cef Add temporal PRIMARY KEY and UNIQUE constraints
Add WITHOUT OVERLAPS clause to PRIMARY KEY and UNIQUE constraints.
These are backed by GiST indexes instead of B-tree indexes, since they
are essentially exclusion constraints with = for the scalar parts of
the key and && for the temporal part.

Author: Paul A. Jungwirth <pj@illuminatedcomputing.com>
Reviewed-by: Peter Eisentraut <peter@eisentraut.org>
Reviewed-by: jian he <jian.universality@gmail.com>
Discussion: https://www.postgresql.org/message-id/flat/CA+renyUApHgSZF9-nd-a0+OPGharLQLO=mDHcY4_qQ0+noCUVg@mail.gmail.com
2024-01-24 16:34:37 +01:00
Amit Langote
1edb3b491b Adjust populate_record_field() to handle errors softly
This adds a Node *escontext parameter to it and a bunch of functions
downstream to it, replacing any ereport()s in that path by either
errsave() or ereturn() as appropriate.  This also adds code to those
functions where necessary to return early upon encountering a soft
error.

The changes here are mainly intended to suppress errors in the
functions of jsonfuncs.c.  Functions in any external modules, such as
arrayfuncs.c, that those functions may in turn call are not changed
here based on the assumption that the various checks in jsonfuncs.c
functions should ensure that only values that are structurally valid
get passed to the functions in those external modules.  An exception
is made for domain_check() to allow handling domain constraint
violation errors softly.

For testing, this adds a function jsonb_populate_record_valid(),
which returns true if jsonb_populate_record() would finish without
causing an error for the provided JSON object, false otherwise.  Note
that jsonb_populate_record() internally calls populate_record(),
which in turn uses populate_record_field().

Extracted from a much larger patch to add SQL/JSON query functions.

Author: Nikita Glukhov <n.gluhov@postgrespro.ru>
Author: Teodor Sigaev <teodor@sigaev.ru>
Author: Oleg Bartunov <obartunov@gmail.com>
Author: Alexander Korotkov <aekorotkov@gmail.com>
Author: Andrew Dunstan <andrew@dunslane.net>
Author: Amit Langote <amitlangote09@gmail.com>

Reviewers have included (in no particular order) Andres Freund,
Alexander Korotkov, Pavel Stehule, Andrew Alsup, Erik Rijkers,
Zihong Yu, Himanshu Upadhyaya, Daniel Gustafsson, Justin Pryzby,
Álvaro Herrera, Jian He, Peter Eisentraut

Discussion: https://postgr.es/m/cd0bb935-0158-78a7-08b5-904886deac4b@postgrespro.ru
Discussion: https://postgr.es/m/20220616233130.rparivafipt6doj3@alap3.anarazel.de
Discussion: https://postgr.es/m/abd9b83b-aa66-f230-3d6d-734817f0995d%40postgresql.org
Discussion: https://postgr.es/m/CA+HiwqHROpf9e644D8BRqYvaAPmgBZVup-xKMDPk-nd4EpgzHw@mail.gmail.com
Discussion: https://postgr.es/m/CA+HiwqE4XTdfb1nW=Ojoy_tQSRhYt-q_kb6i5d4xcKyrLC1Nbg@mail.gmail.com
2024-01-24 15:04:33 +09:00
Amit Langote
aaaf9449ec Add soft error handling to some expression nodes
This adjusts the code for CoerceViaIO and CoerceToDomain expression
nodes to handle errors softly.

For CoerceViaIo, this adds a new ExprEvalStep opcode
EEOP_IOCOERCE_SAFE, which is implemented in the new accompanying
function ExecEvalCoerceViaIOSafe().  The only difference from
EEOP_IOCOERCE's inline implementation is that the input function
receives an ErrorSaveContext via the function's
FunctionCallInfo.context, which it can use to handle errors softly.

For CoerceToDomain, this simply entails replacing the ereport() in
ExecEvalConstraintNotNull() and ExecEvalConstraintCheck() by
errsave() passing it the ErrorSaveContext passed in the expression's
ExprEvalStep.

In both cases, the ErrorSaveContext to be used is passed by setting
ExprState.escontext to point to it before calling ExecInitExprRec()
on the expression tree whose errors are to be handled softly.

Note that there's no functional change as of this commit as no call
site of ExecInitExprRec() has been changed.  This is intended for
implementing new SQL/JSON expression nodes in future commits.

Extracted from a much larger patch to add SQL/JSON query functions.

Author: Nikita Glukhov <n.gluhov@postgrespro.ru>
Author: Teodor Sigaev <teodor@sigaev.ru>
Author: Oleg Bartunov <obartunov@gmail.com>
Author: Alexander Korotkov <aekorotkov@gmail.com>
Author: Andrew Dunstan <andrew@dunslane.net>
Author: Amit Langote <amitlangote09@gmail.com>

Reviewers have included (in no particular order) Andres Freund,
Alexander Korotkov, Pavel Stehule, Andrew Alsup, Erik Rijkers,
Zihong Yu, Himanshu Upadhyaya, Daniel Gustafsson, Justin Pryzby,
Álvaro Herrera, Jian He, Peter Eisentraut

Discussion: https://postgr.es/m/cd0bb935-0158-78a7-08b5-904886deac4b@postgrespro.ru
Discussion: https://postgr.es/m/20220616233130.rparivafipt6doj3@alap3.anarazel.de
Discussion: https://postgr.es/m/abd9b83b-aa66-f230-3d6d-734817f0995d%40postgresql.org
Discussion: https://postgr.es/m/CA+HiwqHROpf9e644D8BRqYvaAPmgBZVup-xKMDPk-nd4EpgzHw@mail.gmail.com
Discussion: https://postgr.es/m/CA+HiwqE4XTdfb1nW=Ojoy_tQSRhYt-q_kb6i5d4xcKyrLC1Nbg@mail.gmail.com
2024-01-24 15:04:33 +09:00
Peter Eisentraut
6eb6086faa Fix makefiles for newly added files
The new files added in 9b1a6f50b9 need to be mentioned in a few more
places in the makefiles.

Discussion: https://www.postgresql.org/message-id/flat/E1rSAY2-002hIk-4y%40gemulon.postgresql.org
2024-01-23 16:33:26 +01:00
Peter Eisentraut
9b1a6f50b9 Generate syscache info from catalog files
Add a new genbki macros MAKE_SYSCACHE that specifies the syscache ID
macro, the underlying index, and the number of buckets.  From that, we
can generate the existing tables in syscache.h and syscache.c via
genbki.pl.

Reviewed-by: John Naylor <johncnaylorls@gmail.com>
Discussion: https://www.postgresql.org/message-id/flat/75ae5875-3abc-dafc-8aec-73247ed41cde@eisentraut.org
2024-01-23 07:31:06 +01:00
David Rowley
b262ad440e Add better handling of redundant IS [NOT] NULL quals
Until now PostgreSQL has not been very smart about optimizing away IS
NOT NULL base quals on columns defined as NOT NULL.  The evaluation of
these needless quals adds overhead.  Ordinarily, anyone who came
complaining about that would likely just have been told to not include
the qual in their query if it's not required.  However, a recent bug
report indicates this might not always be possible.

Bug 17540 highlighted that when we optimize Min/Max aggregates the IS NOT
NULL qual that the planner adds to make the rewritten plan ignore NULLs
can cause issues with poor index choice.  That particular case
demonstrated that other quals, especially ones where no statistics are
available to allow the planner a chance at estimating an approximate
selectivity for can result in poor index choice due to cheap startup paths
being prefered with LIMIT 1.

Here we take generic approach to fixing this by having the planner check
for NOT NULL columns and just have the planner remove these quals (when
they're not needed) for all queries, not just when optimizing Min/Max
aggregates.

Additionally, here we also detect IS NULL quals on a NOT NULL column and
transform that into a gating qual so that we don't have to perform the
scan at all.  This also works for join relations when the Var is not
nullable by any outer join.

This also helps with the self-join removal work as it must replace
strict join quals with IS NOT NULL quals to ensure equivalence with the
original query.

Author: David Rowley, Richard Guo, Andy Fan
Reviewed-by: Richard Guo, David Rowley
Discussion: https://postgr.es/m/CAApHDvqg6XZDhYRPz0zgOcevSMo0d3vxA9DvHrZtKfqO30WTnw@mail.gmail.com
Discussion: https://postgr.es/m/17540-7aa1855ad5ec18b4%40postgresql.org
2024-01-23 18:09:18 +13:00
Michael Paquier
d86d20f0ba Add backend support for injection points
Injection points are a new facility that makes possible for developers
to run custom code in pre-defined code paths.  Its goal is to provide
ways to design and run advanced tests, for cases like:
- Race conditions, where processes need to do actions in a controlled
ordered manner.
- Forcing a state, like an ERROR, FATAL or even PANIC for OOM, to force
recovery, etc.
- Arbitrary sleeps.

This implements some basics, and there are plans to extend it more in
the future depending on what's required.  Hence, this commit adds a set
of routines in the backend that allows developers to attach, detach and
run injection points:
- A code path calling an injection point can be declared with the macro
INJECTION_POINT(name).
- InjectionPointAttach() and InjectionPointDetach() to respectively
attach and detach a callback to/from an injection point.  An injection
point name is registered in a shmem hash table with a library name and a
function name, which will be used to load the callback attached to an
injection point when its code path is run.

Injection point names are just strings, so as an injection point can be
declared and run by out-of-core extensions and modules, with callbacks
defined in external libraries.

This facility is hidden behind a dedicated switch for ./configure and
meson, disabled by default.

Note that backends use a local cache to store callbacks already loaded,
cleaning up their cache if a callback has found to be removed on a
best-effort basis.  This could be refined further but any tests but what
we have here was fine with the tests I've written while implementing
these backend APIs.

Author: Michael Paquier, with doc suggestions from Ashutosh Bapat.
Reviewed-by: Ashutosh Bapat, Nathan Bossart, Álvaro Herrera, Dilip
Kumar, Amul Sul, Nazir Bilal Yavuz
Discussion: https://postgr.es/m/ZTiV8tn_MIb_H2rE@paquier.xyz
2024-01-22 10:15:50 +09:00
Alexander Korotkov
0452b461bc Explore alternative orderings of group-by pathkeys during optimization.
When evaluating a query with a multi-column GROUP BY clause, we can minimize
sort operations or avoid them if we synchronize the order of GROUP BY clauses
with the ORDER BY sort clause or sort order, which comes from the underlying
query tree. Grouping does not imply any ordering, so we can compare
the keys in arbitrary order, and a Hash Agg leverages this. But for Group Agg,
we simply compared keys in the order specified in the query. This commit
explores alternative ordering of the keys, trying to find a cheaper one.

The ordering of group keys may interact with other parts of the query, some of
which may not be known while planning the grouping. For example, there may be
an explicit ORDER BY clause or some other ordering-dependent operation higher up
in the query, and using the same ordering may allow using either incremental
sort or even eliminating the sort entirely.

The patch always keeps the ordering specified in the query, assuming the user
might have additional insights.

This introduces a new GUC enable_group_by_reordering so that the optimization
may be disabled if needed.

Discussion: https://postgr.es/m/7c79e6a5-8597-74e8-0671-1c39d124c9d6%40sigaev.ru
Author: Andrei Lepikhov, Teodor Sigaev
Reviewed-by: Tomas Vondra, Claudio Freire, Gavin Flower, Dmitry Dolgov
Reviewed-by: Robert Haas, Pavel Borisov, David Rowley, Zhihong Yu
Reviewed-by: Tom Lane, Alexander Korotkov, Richard Guo, Alena Rybakina
2024-01-21 22:21:36 +02:00
Tom Lane
075df6b208 Add planner support functions for range operators <@ and @>.
These support functions will transform expressions with constant
range values into direct comparisons on the range bound values,
which are frequently better-optimizable.  The transformation is
skipped however if it would require double evaluation of a
volatile or expensive element expression.

Along the way, add the range opfamily OID to range typcache entries,
since load_rangetype_info has to compute that anyway and it seems
silly to duplicate the work later.

Kim Johan Andersson and Jian He, reviewed by Laurenz Albe

Discussion: https://postgr.es/m/94f64d1f-b8c0-b0c5-98bc-0793a34e0851@kimmet.dk
2024-01-20 13:57:54 -05:00
Nathan Bossart
8b2bcf3f28 Introduce the dynamic shared memory registry.
Presently, the most straightforward way for a shared library to use
shared memory is to request it at server startup via a
shmem_request_hook, which requires specifying the library in
shared_preload_libraries.  Alternatively, the library can create a
dynamic shared memory (DSM) segment, but absent a shared location
to store the segment's handle, other backends cannot use it.  This
commit introduces a registry for DSM segments so that these other
backends can look up existing segments with a library-specified
string.  This allows libraries to easily use shared memory without
needing to request it at server startup.

The registry is accessed via the new GetNamedDSMSegment() function.
This function handles allocating the segment and initializing it
via a provided callback.  If another backend already created and
initialized the segment, it simply attaches the segment.
GetNamedDSMSegment() locks the registry appropriately to ensure
that only one backend initializes the segment and that all other
backends just attach it.

The registry itself is comprised of a dshash table that stores the
DSM segment handles keyed by a library-specified string.

Reviewed-by: Michael Paquier, Andrei Lepikhov, Nikita Malakhov, Robert Haas, Bharath Rupireddy, Zhang Mingli, Amul Sul
Discussion: https://postgr.es/m/20231205034647.GA2705267%40nathanxps13
2024-01-19 14:24:36 -06:00
Peter Eisentraut
6db4598fcb Add stratnum GiST support function
This is support function 12 for the GiST AM and translates
"well-known" RT*StrategyNumber values into whatever strategy number is
used by the opclass (since no particular numbers are actually
required).  We will use this to support temporal PRIMARY
KEY/UNIQUE/FOREIGN KEY/FOR PORTION OF functionality.

This commit adds two implementations, one for internal GiST opclasses
(just an identity function) and another for btree_gist opclasses.  It
updates btree_gist from 1.7 to 1.8, adding the support function for
all its opclasses.

Author: Paul A. Jungwirth <pj@illuminatedcomputing.com>
Reviewed-by: Peter Eisentraut <peter@eisentraut.org>
Reviewed-by: jian he <jian.universality@gmail.com>
Discussion: https://www.postgresql.org/message-id/flat/CA+renyUApHgSZF9-nd-a0+OPGharLQLO=mDHcY4_qQ0+noCUVg@mail.gmail.com
2024-01-19 15:42:13 +01:00
Alexander Korotkov
b725b7eec4 Rename COPY option from SAVE_ERROR_TO to ON_ERROR
The option names now are "stop" (default) and "ignore".  The future options
could be "file 'filename.log'" and "table 'tablename'".

Discussion: https://postgr.es/m/20240117.164859.2242646601795501168.horikyota.ntt%40gmail.com
Author: Jian He
Reviewed-by: Atsushi Torikoshi
2024-01-19 15:15:51 +02:00
John Naylor
dd0a0cfc81 Fixed misspelled byteswap function for big endian machines
Per members lora and mamba
2024-01-19 13:26:18 +07:00
John Naylor
0aba255440 Add optimized C string hashing
Given an already-initialized hash state and a NUL-terminated string,
accumulate the hash of the string into the hash state and return the
length for the caller to (optionally) save for the finalizer. This
avoids a strlen call.

If the string pointer is aligned, we can use a word-at-a-time
algorithm for NUL lookahead. The aligned case is only used on 64-bit
platforms, since it's not worth the extra complexity for 32-bit.

Handling the tail of the string after finishing the word-wise loop
was inspired by NetBSD's strlen(), but no code was taken since that
is written in assembly language.

As demonstration, use this in the search path cache. This brings the
general case performance closer to the special case optimization done
in commit a86c61c9ee. There are other places that could benefit, but
that is left for future work.

Jeff Davis and John Naylor
Reviewed by Heikki Linnakangas, Jian He, Junwang Zhao

Discussion: https://postgr.es/m/3820f030fd008ff14134b3e9ce5cc6dd623ed479.camel%40j-davis.com
Discussion: https://postgr.es/m/b40292c99e623defe5eadedab1d438cf51a4107c.camel%40j-davis.com
2024-01-19 12:56:15 +07:00
John Naylor
e97b672c88 Add inline incremental hash functions for in-memory use
It can be useful for a hash function to expose separate initialization,
accumulation, and finalization steps.  In particular, this is useful
for building inline hash functions for simplehash.  Instead of trying
to whack around hash_bytes while maintaining its current behavior on
all platforms, we base this work on fasthash (MIT licensed) which
is simple, faster than hash_bytes for inputs over 12 bytes long,
and also passes the hash function testing suite SMHasher.

The fasthash functions have been reimplemented using our added-on
incremental interface to validate that this method will still give
the same answer, provided we have the input length ahead of time.

This functionality lives in a new header hashfn_unstable.h. The name
implies we have the freedom to change things across versions that
would be unacceptable for our other hash functions that are used for
e.g. hash indexes and hash partitioning. As such, these should only
be used for in-memory data structures like hash tables. There is also
no guarantee of being independent of endianness or pointer size.

As demonstration, use fasthash for pgstat_hash_hash_key.  Previously
this called the 32-bit murmur finalizer on the three elements,
then joined them with hash_combine(). The new function is simpler,
faster and takes up less binary space. While the collision and bias
behavior were almost certainly fine with the previous coding, now we
have objective confidence of that.

There are other places that could benefit from this, but that is left
for future work.

Reviewed by Jeff Davis, Heikki Linnakangas, Jian He, Junwang Zhao
Credit to Andres Freund for the idea

Discussion: https://postgr.es/m/20231122223432.lywt4yz2bn7tlp27%40awork3.anarazel.de
2024-01-19 12:44:09 +07:00