Commit graph

7417 commits

Author SHA1 Message Date
Daniel Gustafsson
e889422d98 pg_amcheck: PQclear query results
While the potential memory leak is small, ensure to PQclear the query
results before disconnecting.

Author: Jiao Shuntian <312199339@qq.com>
Reviewed-by: Daniel Gustafsson <daniel@yesql.se>
Discussion: https://postgr.es/m/tencent_F34922C91C41E76C734773E767C9FBDB9906@qq.com
2025-02-24 16:03:19 +01:00
Jeff Davis
cb45dc3afb Documentation fixups for dumping statistics.
Reported-by: Hayato Kuroda (Fujitsu) <kuroda.hayato@fujitsu.com>
Reported-by: Andrew Dunstan <andrew@dunslane.net>
Discussion: https://postgr.es/m/OSCPR01MB149665630030E7F54FDA8B27BF5C72@OSCPR01MB14966.jpnprd01.prod.outlook.com
Discussion: https://postgr.es/m/25d26774-25fa-46f2-9888-c6a707d1fef7@dunslane.net
2025-02-22 10:03:11 -08:00
Álvaro Herrera
bba2fbc623
Change \conninfo to use tabular format
(Initially the proposal was to keep \conninfo alone and add this feature
as \conninfo+, but we decided against keeping the original.)

Also display more fields than before, though not as many as were
suggested during the discussion.  In particular, we don't show 'role'
nor 'session authorization', for both which a case can probably be made.
These can be added as followup commits, if we agree to it.

Some (most?) reviewers actually reviewed rather different versions of
the patch and do not necessarily endorse the current one.

Co-authored-by: Maiquel Grassi <grassi@hotmail.com.br>
Co-authored-by: Hunaid Sohail <hunaidpgml@gmail.com>
Reviewed-by: Nathan Bossart <nathandbossart@gmail.com>
Reviewed-by: Peter Eisentraut <peter@eisentraut.org>
Reviewed-by: Sami Imseih <simseih@amazon.com>
Reviewed-by: David G. Johnston <david.g.johnston@gmail.com>
Reviewed-by: Jim Jones <jim.jones@uni-muenster.de>
Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Reviewed-by: Pavel Luzanov <p.luzanov@postgrespro.ru>
Reviewed-by: Dean Rasheed <dean.a.rasheed@gmail.com>
Reviewed-by: Erik Wienhold <ewie@ewie.name>
Discussion: https://postgr.es/m/CP8P284MB24965CB63DAC00FC0EA4A475EC462@CP8P284MB2496.BRAP284.PROD.OUTLOOK.COM
2025-02-22 10:05:26 +01:00
Masahiko Sawada
78d3f48895 Add test 005_char_signedness.pl to meson.build.
Oversight in a8238f87f9 where the test has been added.

Discussion: https://postgr.es/m/CB11ADBC-0C3F-4FE0-A678-666EE80CBB07%40amazon.com
2025-02-21 12:31:16 -08:00
Tom Lane
29d75b25b5 Fix pg_dumpall to cope with dangling OIDs in pg_auth_members.
There is a race condition between "GRANT role" and "DROP ROLE",
which allows GRANT to install pg_auth_members entries that refer to
dropped roles.  (Commit 6566133c5 prevented that for the grantor
field, but not for the granted or grantee roles.)  We'll soon fix
that, at least in HEAD, but pg_dumpall needs to cope with the
situation in case of pre-existing inconsistency.  As pg_dumpall
stands, it will emit invalid commands like 'GRANT foo TO ""',
which causes pg_upgrade to fail.  Fix it to emit warnings and skip
those GRANTs, instead.

There was some discussion of removing the problem by changing
dumpRoleMembership's query to use JOIN not LEFT JOIN, but that
would result in silently ignoring such entries.  It seems better
to produce a warning.

Pre-v16 branches already coped with dangling grantor OIDs by simply
omitting the GRANTED BY clause.  I left that behavior as-is, although
it's somewhat inconsistent with the behavior of later branches.

Reported-by: Virender Singla <virender.cse@gmail.com>
Discussion: https://postgr.es/m/CAM6Zo8woa62ZFHtMKox6a4jb8qQ=w87R2L0K8347iE-juQL2EA@mail.gmail.com
Backpatch-through: 13
2025-02-21 13:37:15 -05:00
Masahiko Sawada
1aab680591 pg_upgrade: Add --set-char-signedness to set the default char signedness of new cluster.
This change adds a new option --set-char-signedness to pg_upgrade. It
enables user to set arbitrary signedness during pg_upgrade. This helps
cases where user who knew they copied the v17 source cluster from
x86 (signedness=true) to ARM (signedness=false) can pg_upgrade
properly without the prerequisite of acquiring an x86 VM.

Reviewed-by: Noah Misch <noah@leadboat.com>
Discussion: https://postgr.es/m/CB11ADBC-0C3F-4FE0-A678-666EE80CBB07%40amazon.com
2025-02-21 10:23:39 -08:00
Masahiko Sawada
a8238f87f9 pg_upgrade: Preserve default char signedness value from old cluster.
Commit 44fe30fdab introduced the 'default_char_signedness' field in
controlfile. Newly created database clusters always set this field to
'signed'.

This change ensures that pg_upgrade updates the
'default_char_signedness' to 'unsigned' if the source database cluster
has signedness=false. For source clusters from v17 or earlier, which
lack the 'default_char_signedness' information, pg_upgrade assumes the
source cluster was initialized on the same platform where pg_upgrade
is running. It then sets the 'default_char_signedness' value according
to the current platform's default character signedness.

Reviewed-by: Noah Misch <noah@leadboat.com>
Discussion: https://postgr.es/m/CB11ADBC-0C3F-4FE0-A678-666EE80CBB07%40amazon.com
2025-02-21 10:19:40 -08:00
Masahiko Sawada
30666d1857 pg_resetwal: Add --char-signedness option to change the default char signedness.
With the newly added option --char-signedness, pg_resetwal updates the
default char signedness flag in the controlfile. This option is
primarily intended for an upcoming patch that pg_upgrade supports
preserving the default char signedness during upgrades, and is not
meant for manual operation.

Reviewed-by: Noah Misch <noah@leadboat.com>
Discussion: https://postgr.es/m/CB11ADBC-0C3F-4FE0-A678-666EE80CBB07%40amazon.com
2025-02-21 10:14:36 -08:00
Masahiko Sawada
44fe30fdab Add default_char_signedness field to ControlFileData.
The signedness of the 'char' type in C is
implementation-dependent. For instance, 'signed char' is used by
default on x86 CPUs, while 'unsigned char' is used on aarch
CPUs. Previously, we accidentally let C implementation signedness
affect persistent data. This led to inconsistent results when
comparing char data across different platforms.

This commit introduces a new 'default_char_signedness' field in
ControlFileData to store the signedness of the 'char' type. While this
change does not encourage the use of 'char' without explicitly
specifying its signedness, this field can be used as a hint to ensure
consistent behavior for pre-v18 data files that store data sorted by
the 'char' type on disk (e.g., GIN and GiST indexes), especially in
cross-platform replication scenarios.

Newly created database clusters unconditionally set the default char
signedness to true. pg_upgrade (with an upcoming commit) changes this
flag for clusters if the source database cluster has
signedness=false. As a result, signedness=false setting will become
rare over time. If we had known about the problem during the last
development cycle that forced initdb (v8.3), we would have made all
clusters signed or all clusters unsigned. Making pg_upgrade the only
source of signedness=false will cause the population of database
clusters to converge toward that retrospective ideal.

Bump catalog version (for the catalog changes) and PG_CONTROL_VERSION
(for the additions in ControlFileData).

Reviewed-by: Noah Misch <noah@leadboat.com>
Discussion: https://postgr.es/m/CB11ADBC-0C3F-4FE0-A678-666EE80CBB07%40amazon.com
2025-02-21 10:12:08 -08:00
Michael Paquier
41625ab8ea psql: Add support for pipelines
With \bind, \parse, \bind_named and \close, it is possible to issue
queries from psql using the extended protocol.  However, it was not
possible to send these queries using libpq's pipeline mode.  This
feature has two advantages:
- Testing.  Pipeline tests were only possible with pgbench, using TAP
tests.  It now becomes possible to have more SQL tests that are able to
stress the backend with pipelines and extended queries.  More tests will
be added in a follow-up commit that were discussed on some other
threads.  Some external projects in the community had to implement their
own facility to work around this limitation.
- Emulation of custom workloads, with more control over the actions
taken by a client with libpq APIs.  It is possible to emulate more
workload patterns to bottleneck the backend with the extended query
protocol.

This patch adds six new meta-commands to be able to control pipelines:
* \startpipeline starts a new pipeline.  All extended queries are queued
until the end of the pipeline are reached or a sync request is sent and
processed.
* \endpipeline ends an existing pipeline.  All queued commands are sent
to the server and all responses are processed by psql.
* \syncpipeline queues a synchronisation request, without flushing the
commands to the server, equivalent of PQsendPipelineSync().
* \flush, equivalent of PQflush().
* \flushrequest, equivalent of PQsendFlushRequest()
* \getresults reads the server's results for the queries in a pipeline.
Unsent data is automatically pushed when \getresults is called.  It is
possible to control the number of results read in a single meta-command
execution with an optional parameter, 0 means that all the results
should be read.

Author: Anthonin Bonnefoy <anthonin.bonnefoy@datadoghq.com>
Reviewed-by: Jelte Fennema-Nio <postgres@jeltef.nl>
Reviewed-by: Kirill Reshke <reshkekirill@gmail.com>
Discussion: https://postgr.es/m/CAO6_XqroE7JuMEm1sWz55rp9fAYX2JwmcP_3m_v51vnOFdsLiQ@mail.gmail.com
2025-02-21 11:19:59 +09:00
Michael Paquier
40af897eb7 Add braces for if block with large comment in psql's common.c
A patch touching this area of the code is under review, and this format
makes the readability of the code slightly harder to parse.

Extracted from a larger patch by the same author.

Author: Anthonin Bonnefoy <anthonin.bonnefoy@datadoghq.com>
Discussion: https://postgr.es/m/CAO6_XqroE7JuMEm1sWz55rp9fAYX2JwmcP_3m_v51vnOFdsLiQ@mail.gmail.com
2025-02-21 09:18:49 +09:00
Peter Eisentraut
3e4d868615 Remove various unnecessary (char *) casts
Remove a number of (char *) casts that are unnecessary.  Or in some
cases, rewrite the code to make the purpose of the cast clearer.

Reviewed-by: Dagfinn Ilmari Mannsåker <ilmari@ilmari.org>
Discussion: https://www.postgresql.org/message-id/flat/fd1fcedb-3492-4fc8-9e3e-74b97f2db6c7%40eisentraut.org
2025-02-20 19:49:27 +01:00
Jeff Davis
1fd1bd8710 Transfer statistics during pg_upgrade.
Add support to pg_dump for dumping stats, and use that during
pg_upgrade so that statistics are transferred during upgrade. In most
cases this removes the need for a costly re-analyze after upgrade.

Some statistics are not transferred, such as extended statistics or
statistics with a custom stakind.

Now pg_dump accepts the options --schema-only, --no-schema,
--data-only, --no-data, --statistics-only, and --no-statistics; which
allow all combinations of schema, data, and/or stats. The options are
named this way to preserve compatibility with the previous
--schema-only and --data-only options.

Statistics are in SECTION_DATA, unless the object itself is in
SECTION_POST_DATA.

The stats are represented as calls to pg_restore_relation_stats() and
pg_restore_attribute_stats().

Author: Corey Huinker, Jeff Davis
Reviewed-by: Jian He
Discussion: https://postgr.es/m/CADkLM=fzX7QX6r78fShWDjNN3Vcr4PVAnvXxQ4DiGy6V=0bCUA@mail.gmail.com
Discussion: https://postgr.es/m/CADkLM%3DcB0rF3p_FuWRTMSV0983ihTRpsH%2BOCpNyiqE7Wk0vUWA%40mail.gmail.com
2025-02-20 01:29:06 -08:00
Amit Kapila
4aa6fa3cd0 Include schema/table publications even with exclude options in dump.
The current implementation inconsistently includes public schema but not
information_schema when those are specified in FOR TABLES IN SCHMEA ...
Apart from that, the current behavior for publications w.r.t exclude table
and schema (--exclude-table, --exclude-schema) option differs from what we
do at other places. We try to avoid including publications for
corresponding tables or schemas when an exclude-table or exclude-schema
option is given, unlike what we do for views using functions defined in a
particular schema or a subscription pointing to publications with their
corresponding exclude options.

I decided not to backpatch this as it leads to a behavior change and we don't
see any field report for current behavior.

Reported-by: Tom Lane <tgl@sss.pgh.pa.us>
Author: Vignesh C <vignesh21@gmail.com>
Reviewed-by: Amit Kapila <amit.kapila16@gmail.com>
Discussion: https://postgr.es/m/1270733.1734134272@sss.pgh.pa.us
2025-02-20 11:25:29 +05:30
Andres Freund
d38bab5edd pgbench: Increase RLIMIT_NOFILE if necessary
pgbench already had code to check if the soft rlimit is too low for the
specified number of connections. If too low, it errored out, telling the user
to increase the limit.

However, we can do better: If the hard limit allows, increase the soft limit
to be sufficiently for the number of connections.

It is common for the soft limit to be considerably lower than the hard limit,
due to the danger of soft limits > 1024 breaking programs that use the
select(2), as explained in [1].

[1]: https://0pointer.net/blog/file-descriptor-limits.html

Author: Jelte Fennema-Nio <postgres@jeltef.nl>
Reviewed-by: Andres Freund <andres@anarazel.de>
Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/CAGECzQQh6VSy3KG4pN1d%3Dh9J%3DD1rStFCMR%2Bt7yh_Kwj-g87aLQ%40mail.gmail.com
2025-02-19 19:35:09 -05:00
Amit Kapila
ac0e33136a Invalidate inactive replication slots.
This commit introduces idle_replication_slot_timeout GUC that allows
inactive slots to be invalidated at the time of checkpoint. Because
checkpoints happen checkpoint_timeout intervals, there can be some lag
between when the idle_replication_slot_timeout was exceeded and when the
slot invalidation is triggered at the next checkpoint. To avoid such lags,
users can force a checkpoint to promptly invalidate inactive slots.

Note that the idle timeout invalidation mechanism is not applicable for
slots that do not reserve WAL or for slots on the standby server that are
synced from the primary server (i.e., standby slots having 'synced' field
'true'). Synced slots are always considered to be inactive because they
don't perform logical decoding to produce changes.

The slots can become inactive for a long period if a subscriber is down
due to a system error or inaccessible because of network issues. If such a
situation persists, it might be more practical to recreate the subscriber
rather than attempt to recover the node and wait for it to catch up which
could be time-consuming.

Then, external tools could create replication slots (e.g., for migrations
or upgrades) that may fail to remove them if an error occurs, leaving
behind unused slots that take up space and resources. Manually cleaning
them up can be tedious and error-prone, and without intervention, these
lingering slots can cause unnecessary WAL retention and system bloat.

As the duration of idle_replication_slot_timeout is in minutes, any test
using that would be time-consuming. We are planning to commit a follow up
patch for tests by using the injection point framework.

Author: Nisha Moond <nisha.moond412@gmail.com>
Author: Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com>
Reviewed-by: Peter Smith <smithpb2250@gmail.com>
Reviewed-by: Hayato Kuroda <kuroda.hayato@fujitsu.com>
Reviewed-by: Vignesh C <vignesh21@gmail.com>
Reviewed-by: Amit Kapila <amit.kapila16@gmail.com>
Reviewed-by: Hou Zhijie <houzj.fnst@fujitsu.com>
Reviewed-by: Bertrand Drouvot <bertranddrouvot.pg@gmail.com>
Discussion: https://postgr.es/m/CALj2ACW4aUe-_uFQOjdWCEN-xXoLGhmvRFnL8SNw_TZ5nJe+aw@mail.gmail.com
Discussion: https://postgr.es/m/OS0PR01MB5716C131A7D80DAE8CB9E88794FC2@OS0PR01MB5716.jpnprd01.prod.outlook.com
2025-02-19 09:29:50 +05:30
Tom Lane
b464e51ab3 Update to latest Snowball sources.
It's been some time since we did this, partly because the upstream
snowball project hasn't formally tagged a new release since 2021.
The main motivation for doing it now is to absorb a bug fix
(their commit e322673a841d9abd69994ae8cd20e191090b6ef4), which
prevents a null pointer dereference crash if SN_create_env() gets
a malloc failure at just the wrong point.  We'll patch the back
branches with only that change, but we might as well do the full
sync dance on HEAD.

Aside from a bunch of mostly-minor tweaks to existing stemmers, this
update adds a new stemmer for Estonian.  It also removes the existing
stemmer for Romanian using ISO-8859-2 encoding.  Upstream apparently
concluded that ISO-8859-2 doesn't provide an adequate representation
of some Romanian characters, and the UTF-8 implementation should be
used instead.

While at it, update the README's instructions for doing a sync,
which have not been adjusted during the addition of meson tooling.

Thanks to Maksim Korotkov for discovering the null-pointer
bug and submitting the fix to upstream snowball.

Reported-by: Maksim Korotkov <m.korotkov@postgrespro.ru>
Discussion: https://postgr.es/m/1d1a46-67ab1000-21-80c451@83151435
2025-02-18 21:13:54 -05:00
Amit Kapila
217919dd09 Raise a WARNING for max_slot_wal_keep_size in pg_createsubscriber.
During the pg_createsubscriber execution, it is possible that the required
WAL is removed from the primary/publisher node due to
'max_slot_wal_keep_size'.

This patch raises a WARNING during the '--dry-run' mode if the
'max_slot_wal_keep_size' is set to a non-default value on the
primary/publisher node.

Author: Shubham Khanna <khannashubham1197@gmail.com>
Reviewed-by: Peter Smith <smithpb2250@gmail.com>
Reviewed-by: Hayato Kuroda <kuroda.hayato@fujitsu.com>
Reviewed-by: Vignesh C <vignesh21@gmail.com>
Discussion: https://postgr.es/m/CAHv8Rj+deqsQXOMa7Tck8CBQUbsua=+4AuMVQ2=MPM0f-ZHbjA@mail.gmail.com
2025-02-18 12:15:43 +05:30
Tomas Vondra
c407d5426b Add tab completion for ALTER USER/ROLE RESET
Currently tab completion for ALTER USER RESET shows a list of all
configuration parameters that may be set on a role, irrespectively of
which parameters are actually set. This patch improves tab completion to
offer only parameters that are set.

Author: Robins Tharakan
Reviewed-By: Tomas Vondra
Discussion: https://postgr.es/m/CAEP4nAzqiT6VbVC5r3nq5byLTnPzjniVGzEMpYcnAHQyNzEuaw%40mail.gmail.com
2025-02-17 18:12:15 +01:00
Tomas Vondra
9df8727c50 Add tab completion for ALTER DATABASE RESET
Currently tab completion for ALTER DATABASE RESET shows a list of all
configuration parameters that may be set on a database, irrespectively
of which parameters are actually set. This patch improves tab completion
to offer only parameters that are set.

Author: Robins Tharakan
Reviewed-By: Tomas Vondra
Discussion: https://postgr.es/m/CAEP4nAzqiT6VbVC5r3nq5byLTnPzjniVGzEMpYcnAHQyNzEuaw%40mail.gmail.com
2025-02-17 18:12:15 +01:00
Tom Lane
fd602f29c1 Clean up impenetrable logic in pg_basebackup/receivelog.c.
Coverity complained about possible double free of HandleCopyStream's
"copybuf".  AFAICS it's mistaken, but it is easy to see why it's
confused, because management of that buffer is impossibly confusing.
It's unreasonable that HandleEndOfCopyStream frees the buffer in some
cases but not others, updates the caller's state for that in no case,
and has not a single comment about how complicated that makes things.

Let's put all the responsibility for freeing copybuf in the actual
owner of that variable, HandleCopyStream.  This results in one more
PQfreemem call than before, but the logic is far easier to follow,
both for humans and machines.

Since this isn't (quite) actually broken, no back-patch.
2025-02-12 16:07:23 -05:00
Tom Lane
fcd77a6873 Fix minor memory leaks in pg_dump.
Coverity reported the two oversights in getPublicationTables.
Valgrind found the one in determineNotNullFlags.

The mistakes in getPublicationTables seem too minor to be worth
back-patching.  determineNotNullFlags could be run enough times
to matter, but that code is new in v18.  So, no back-patch.
2025-02-12 15:46:31 -05:00
Michael Paquier
5b94e27534 Fix some inconsistencies with memory freeing in pg_createsubscriber
The correct function documented to free the memory allocated for the
result returned by PQescapeIdentifier() and PQescapeLiteral() is
PQfreemem().  pg_createsubscriber.c relied on pg_free() instead, which
is not incorrect as both do a free() internally, but inconsistent with
the documentation.

While on it, this commit fixes a small memory leak introduced by
4867f8a555, as the code of pg_createsubscriber makes this effort.

Author: Ranier Vilela
Reviewed-by: Euler Taveira
Discussion: https://postgr.es/m/CAEudQAp=AW5dJXrGLbC_aZg_9nOo=42W7uLDRONFQE-gcgnkgQ@mail.gmail.com
Backpatch-through: 17
2025-02-12 17:11:43 +09:00
Melanie Plageman
d0d649e916 Limit pgbench COPY FREEZE to ordinary relations
pgbench client-side data generation uses COPY FREEZE to load data for most
tables. COPY FREEZE isn't supported for partitioned tables and since pgbench
only supports partitioning pgbench_accounts, pgbench used a hard-coded check to
skip COPY FREEZE and use plain COPY for a partitioned pgbench_accounts.

If the user has manually partitioned one of the other pgbench tables, this
causes client-side data generation to error out with:

ERROR:  cannot perform COPY FREEZE on a partitioned table

Fix this by limiting COPY FREEZE to ordinary tables (RELKIND_RELATION).

Author: Sergey Tatarintsev <s.tatarintsev@postgrespro.ru>
Reviewed-by: Álvaro Herrera <alvherre@alvh.no-ip.org>
Discussion: https://postgr.es/m/flat/97f55fca-8a7b-4da8-b413-7d1c57010676%40postgrespro.ru
2025-02-11 16:52:08 -05:00
Melanie Plageman
052026c9b9 Eagerly scan all-visible pages to amortize aggressive vacuum
Aggressive vacuums must scan every unfrozen tuple in order to advance
the relfrozenxid/relminmxid. Because data is often vacuumed before it is
old enough to require freezing, relations may build up a large backlog
of pages that are set all-visible but not all-frozen in the visibility
map. When an aggressive vacuum is triggered, all of these pages must be
scanned. These pages have often been evicted from shared buffers and
even from the kernel buffer cache. Thus, aggressive vacuums often incur
large amounts of extra I/O at the expense of foreground workloads.

To amortize the cost of aggressive vacuums, eagerly scan some
all-visible but not all-frozen pages during normal vacuums.

All-visible pages that are eagerly scanned and set all-frozen in the
visibility map are counted as successful eager freezes and those not
frozen are counted as failed eager freezes.

If too many eager scans fail in a row, eager scanning is temporarily
suspended until a later portion of the relation. The number of failures
tolerated is configurable globally and per table.

To effectively amortize aggressive vacuums, we cap the number of
successes as well. Capping eager freeze successes also limits the amount
of potentially wasted work if these pages are modified again before the
next aggressive vacuum. Once we reach the maximum number of blocks
successfully eager frozen, eager scanning is disabled for the remainder
of the vacuum of the relation.

Original design idea from Robert Haas, with enhancements from
Andres Freund, Tomas Vondra, and me

Reviewed-by: Robert Haas <robertmhaas@gmail.com>
Reviewed-by: Masahiko Sawada <sawada.mshk@gmail.com>
Reviewed-by: Andres Freund <andres@anarazel.de>
Reviewed-by: Robert Treat <rob@xzilla.net>
Reviewed-by: Bilal Yavuz <byavuz81@gmail.com>
Discussion: https://postgr.es/m/flat/CAAKRu_ZF_KCzZuOrPrOqjGVe8iRVWEAJSpzMgRQs%3D5-v84cXUg%40mail.gmail.com
2025-02-11 13:53:48 -05:00
Andres Freund
3e98c8ce50 Specify the encoding of input to fmtId()
This commit adds fmtIdEnc() and fmtQualifiedIdEnc(), which allow to specify
the encoding as an explicit argument.  Additionally setFmtEncoding() is
provided, which defines the encoding when no explicit encoding is provided, to
avoid breaking all code using fmtId().

All users of fmtId()/fmtQualifiedId() are either converted to the explicit
version or a call to setFmtEncoding() has been added.

This commit does not yet utilize the now well-defined encoding, that will
happen in a subsequent commit.

Reviewed-by: Noah Misch <noah@leadboat.com>
Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Backpatch-through: 13
Security: CVE-2025-1094
2025-02-10 10:03:37 -05:00
Michael Paquier
169208092f Refactor TAP test code for file comparisons into new routine in Utils.pm
This unifies the output used should any differences be found in the
files provided, information that 027_stream_regress did not show on
failures.  TAP tests of pg_combinebackup and pg_upgrade now rely on the
refactored routine, reducing the dependency to the diff command.  The
callers of this routine can optionally specify a custom line-comparison
function.

There are a couple of tests that still use directly a diff command:
001_pg_bsd_indent, 017_shm and test_json_parser's 003.  These rely on
different properties and are left out for now.

Extracted from a larger patch by the same author.

Author: Ashutosh Bapat
Discussion: https://postgr.es/m/Z6RQS-tMzGYjlA-H@paquier.xyz
2025-02-09 16:52:33 +09:00
Tom Lane
fb056564ec Fix pgbench performance issue induced by commit af35fe501.
Commit af35fe501 caused "pgbench -i" to emit a '\r' character
for each data row loaded (when stderr is a terminal).
That's effectively invisible on-screen, but it causes the
connected terminal program to consume a lot of cycles.
It's even worse if you're connected over ssh, as the data
then has to pass through the ssh tunnel.

Simplest fix is to move the added logic inside the if-tests
that check whether to print a progress line.  We could do
it another way that avoids duplicating these few lines,
but on the whole this seems the most transparent way to
write it.

Like the previous commit, back-patch to all supported versions.

Reported-by: Andres Freund <andres@anarazel.de>
Author: Tom Lane <tgl@sss.pgh.pa.us>
Reviewed-by: Nathan Bossart <nathandbossart@gmail.com>
Discussion: https://postgr.es/m/4k4drkh7bcmdezq6zbkhp25mnrzpswqi2o75d5uv2eeg3aq6q7@b7kqdmzzwzgb
Backpatch-through: 13
2025-02-07 13:41:42 -05:00
Peter Eisentraut
83ea6c5402 Virtual generated columns
This adds a new variant of generated columns that are computed on read
(like a view, unlike the existing stored generated columns, which are
computed on write, like a materialized view).

The syntax for the column definition is

    ... GENERATED ALWAYS AS (...) VIRTUAL

and VIRTUAL is also optional.  VIRTUAL is the default rather than
STORED to match various other SQL products.  (The SQL standard makes
no specification about this, but it also doesn't know about VIRTUAL or
STORED.)  (Also, virtual views are the default, rather than
materialized views.)

Virtual generated columns are stored in tuples as null values.  (A
very early version of this patch had the ambition to not store them at
all.  But so much stuff breaks or gets confused if you have tuples
where a column in the middle is completely missing.  This is a
compromise, and it still saves space over being forced to use stored
generated columns.  If we ever find a way to improve this, a bit of
pg_upgrade cleverness could allow for upgrades to a newer scheme.)

The capabilities and restrictions of virtual generated columns are
mostly the same as for stored generated columns.  In some cases, this
patch keeps virtual generated columns more restricted than they might
technically need to be, to keep the two kinds consistent.  Some of
that could maybe be relaxed later after separate careful
considerations.

Some functionality that is currently not supported, but could possibly
be added as incremental features, some easier than others:

- index on or using a virtual column
- hence also no unique constraints on virtual columns
- extended statistics on virtual columns
- foreign-key constraints on virtual columns
- not-null constraints on virtual columns (check constraints are supported)
- ALTER TABLE / DROP EXPRESSION
- virtual column cannot have domain type
- virtual columns are not supported in logical replication

The tests in generated_virtual.sql have been copied over from
generated_stored.sql with the keyword replaced.  This way we can make
sure the behavior is mostly aligned, and the differences can be
visible.  Some tests for currently not supported features are
currently commented out.

Reviewed-by: Jian He <jian.universality@gmail.com>
Reviewed-by: Dean Rasheed <dean.a.rasheed@gmail.com>
Tested-by: Shlok Kyal <shlok.kyal.oss@gmail.com>
Discussion: https://www.postgresql.org/message-id/flat/a368248e-69e4-40be-9c07-6c3b5880b0a6@eisentraut.org
2025-02-07 09:46:59 +01:00
Nathan Bossart
306dc520b9 Introduce autovacuum_vacuum_max_threshold.
One way autovacuum chooses tables to vacuum is by comparing the
number of updated or deleted tuples with a value calculated using
autovacuum_vacuum_threshold and autovacuum_vacuum_scale_factor.
The threshold specifies the base value for comparison, and the
scale factor specifies the fraction of the table size to add to it.
This strategy ensures that smaller tables are vacuumed after fewer
updates/deletes than larger tables, which is reasonable in many
cases but can result in infrequent vacuums on very large tables.
This is undesirable for a couple of reasons, such as very large
tables incurring a huge amount of bloat between vacuums.

This new parameter provides a way to set a limit on the value
calculated with autovacuum_vacuum_threshold and
autovacuum_vacuum_scale_factor so that very large tables are
vacuumed more frequently.  By default, it is set to 100,000,000
tuples, but it can be disabled by setting it to -1.  It can also be
adjusted for individual tables by changing storage parameters.

Author: Nathan Bossart <nathandbossart@gmail.com>
Co-authored-by: Frédéric Yhuel <frederic.yhuel@dalibo.com>
Reviewed-by: Melanie Plageman <melanieplageman@gmail.com>
Reviewed-by: Robert Haas <robertmhaas@gmail.com>
Reviewed-by: Laurenz Albe <laurenz.albe@cybertec.at>
Reviewed-by: Michael Banck <mbanck@gmx.net>
Reviewed-by: Joe Conway <mail@joeconway.com>
Reviewed-by: Sami Imseih <samimseih@gmail.com>
Reviewed-by: David Rowley <dgrowleyml@gmail.com>
Reviewed-by: wenhui qiu <qiuwenhuifx@gmail.com>
Reviewed-by: Vinícius Abrahão <vinnix.bsd@gmail.com>
Reviewed-by: Robert Treat <rob@xzilla.net>
Reviewed-by: Alena Rybakina <a.rybakina@postgrespro.ru>
Discussion: https://postgr.es/m/956435f8-3b2f-47a6-8756-8c54ded61802%40dalibo.com
2025-02-05 15:48:18 -06:00
Tom Lane
a14707da56 Show more-intuitive titles for psql commands \dt, \di, etc.
If exactly one relation type is requested in a command of the \dtisv
family, say "tables", "indexes", etc instead of "relations".  This
should cover the majority of actual uses, without creating a huge
number of new translatable strings.  The error messages for no
matching relations are adjusted as well.

In passing, invent "pg_log_error_internal()" to be used for frontend
error messages that don't seem to need translation, analogously to
errmsg_internal() in the backend.  The implementation is a bit cheesy,
being just a macro to prevent xgettext from recognizing a trigger
keyword.  This won't avoid a useless gettext lookup cycle at runtime
--- but surely we don't care about an extra microsecond or two in
what's supposed to be a can't-happen case.  I (tgl) also made
"pg_fatal_internal()", though it's not used in this patch.

Author: Greg Sabino Mullane <htamfids@gmail.com>
Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/CAKAnmm+7o93fQV-RFkGaN1QnP-0D4d3JTykD+cLueqjDMKdfag@mail.gmail.com
2025-02-05 12:45:58 -05:00
Alexander Korotkov
ff1975ddd0 pg_controldata: Fix possible errors on corrupted pg_control
Protect against malformed timestamps.  Also protect against negative WalSegSz
as it triggers division by zero:

((0x100000000UL) / (WalSegSz)) can turn into zero in

XLogFileName(xlogfilename, ControlFile->checkPointCopy.ThisTimeLineID,
             segno, WalSegSz);

because if WalSegSz is -1 then by arithmetic rules in C we get
0x100000000UL / 0xFFFFFFFFFFFFFFFFUL == 0.

Author: Ilyasov Ian <ianilyasov@outlook.com>
Author: Anton Voloshin <a.voloshin@postgrespro.ru>
Backpatch-through: 13
2025-02-05 00:45:49 +02:00
Nathan Bossart
f3e4aeb744 vacuumdb: Add missing PQfinish() calls to vacuum_one_database().
A few of the version checks in vacuum_one_database() do not call
PQfinish() before exiting.  This precedent was unintentionally
established in commit 00d1e88d36, and while it's probably not too
problematic, it seems better to properly close the connection.

Reviewed-by: Daniel Gustafsson <daniel@yesql.se>
Discussion: https://postgr.es/m/Z6JAwqN1I8ljTuXp%40nathan
Backpatch-through: 13
2025-02-04 13:26:57 -06:00
Tom Lane
6cddecdfb0 Avoid breaking SJIS encoding while de-backslashing Windows paths.
When running on Windows, canonicalize_path() converts '\' to '/'
to prevent confusing the Windows command processor.  It was
doing that in a non-encoding-aware fashion; but in SJIS there
are valid two-byte characters whose second byte matches '\'.
So encoding corruption ensues if such a character is used in
the path.

We can fairly easily fix this if we know which encoding is
in use, but a lot of our utilities don't have much of a clue
about that.  After some discussion we decided we'd settle for
fixing this only in psql, and assuming that its value of
client_encoding matches what the user is typing.

It seems hopeless to get the server to deal with the problematic
characters in database path names, so we'll just declare that
case to be unsupported.  That means nothing need be done in
the server, nor in utility programs whose only contact with
file path names is for database paths.  But psql frequently
deals with client-side file paths, so it'd be good if it
didn't mess those up.

Bug: #18735
Reported-by: Koichi Suzuki <koichi.suzuki@enterprisedb.com>
Author: Tom Lane <tgl@sss.pgh.pa.us>
Reviewed-by: Koichi Suzuki <koichi.suzuki@enterprisedb.com>
Discussion: https://postgr.es/m/18735-4acdb3998bb9f2b1@postgresql.org
Backpatch-through: 13
2025-01-29 14:24:36 -05:00
John Naylor
128897b101 Fix grammatical typos around possessive "its"
Some places spelled it "it's", which is short for "it is".
In passing, fix a couple other nearby grammatical errors.

Author: Jacob Brazeal <jacob.brazeal@gmail.com>
Discussion: https://postgr.es/m/CA+COZaAO8g1KJCV0T48=CkJMjAnnfTGLWOATz+2aCh40c2Nm+g@mail.gmail.com
2025-01-29 14:39:14 +07:00
Amit Kapila
75eb9766ec Rename pubgencols_type to pubgencols in pg_publication.
The column added in commit e65dbc9927, pubgencols_type, was inconsistent
with the naming conventions of other columns in the pg_publication
catalog.

Author: Vignesh C <vignesh21@gmail.com>
Reviewed-by: Amit Kapila <amit.kapila16@gmail.com>
Reviewed-by: Peter Smith <smithpb2250@gmail.com>
Discussion: https://postgr.es/m/CALDaNm1u-ufVOW-RUsXSooqzkpohxfZYy=z78fbcr_9Pq5hbCg@mail.gmail.com
2025-01-28 10:42:46 +05:30
Michael Paquier
14793f4719 pg_amcheck: Fix test failure on Windows with non-existing role
For SSPI auth extra users need to be explicitly allowed, or we get
"SSPI authentication failed" instead of the expected "role does not
exist" error.

This report also means that the test has never worked on Windows since
its introduction in 9706092839, because it has always bumped on an
authentication failure rather than an error about the role not existing.

Oversight in eef4a33f62, that has added a pattern check on the error
generated by the command.

Per report from Tom Lane, via buildfarm member drongo.

Author: Dagfinn Ilmari Mannsåker
Reviewed-by: Andrew Dunstan
Discussion: https://postgr.es/m/379085.1737734611@sss.pgh.pa.us
2025-01-27 08:00:19 +09:00
Tom Lane
04ace176e0 Tighten pg_restore's recognition of its -F (format) option values.
Instead of checking just the first letter, match the whole string
using pg_strcasecmp.  Per the documentation, we allow either just
the first letter (e.g. "c") or the whole name ("custom"); but we
will no longer accept random variations such as "chump".  This
matches pg_dump's longstanding parsing code for the same option.

Also for consistency with pg_dump, recognize "p"/"plain".  We don't
support it, but we can give a more helpful error message than
"unrecognized archive format".

Author: Srinath Reddy <srinath2133@gmail.com>
Discussion: https://postgr.es/m/CAFC+b6pfK-BGcWW1kQmtxVrCh-JGjB2X02rLPQs_ZFaDGjZDsQ@mail.gmail.com
2025-01-25 11:24:16 -05:00
Michael Paquier
fd4c4ede70 initdb: Convert tests to use long options with fat comma style
This is similar to ce1b0f9da0, but this time this rule is applied to
some of the TAP tests of initdb.

Author: Dagfinn Ilmari Mannsåker
Discussion: https://postgr.es/m/878qr146ra.fsf@wibble.ilmari.org
2025-01-24 15:19:38 +09:00
Peter Eisentraut
473a575e05 Return yyparse() result not via global variable
Instead of passing the parse result from yyparse() via a global
variable, pass it via a function output argument.

This complements earlier work to make the parsers reentrant.

Discussion: Discussion: https://www.postgresql.org/message-id/flat/eb6faeac-2a8a-4b69-9189-c33c520e5b7b@eisentraut.org
2025-01-24 06:55:39 +01:00
Amit Kapila
b35434b134 Fix buildfarm failure introduced by commit e65dbc9927.
The patch had incorrectly specified the default value for
publish_generated_columns during the query formation in pg_dump.

Author: Vignesh C <vignesh21@gmail.com>
Discussion: https://postgr.es/m/CAA4eK1KfZYTD8Hpi9TD1KaB8rNUBR9baUvTxa5wYyZDGbEaa6g@mail.gmail.com
2025-01-23 17:47:15 +05:30
Amit Kapila
e65dbc9927 Change publication's publish_generated_columns option type to enum.
The current boolean publish_generated_columns option only supports a
binary choice, which is insufficient for future enhancements where
generated columns can be of different types (e.g., stored or virtual). The
supported values for the publish_generated_columns option are 'none' and
'stored'.

Author: Vignesh C <vignesh21@gmail.com>
Reviewed-by: Peter Smith <smithpb2250@gmail.com>
Reviewed-by: Peter Eisentraut <peter@eisentraut.org>
Reviewed-by: Amit Kapila <amit.kapila16@gmail.com>
Discussion: https://postgr.es/m/d718d219-dd47-4a33-bb97-56e8fc4da994@eisentraut.org
Discussion: https://postgr.es/m/B80D17B2-2C8E-4C7D-87F2-E5B4BE3C069E@gmail.com
2025-01-23 15:28:37 +05:30
Michael Paquier
eef4a33f62 Add error pattern checks for some TAP tests for non-existing objects
Some tests are updated to use command_fails_like(), gaining a check for
the error output generated.  The test changed in pg_amcheck has come up
after noticing that an incorrect option name still made the test to
pass, while the command failed.  The three other tests changed in
src/bin/scripts/ have been noticed by me, in passing.

Author: Dagfinn Ilmari Mannsåker, Michael Paquier
Discussion: https://postgr.es/m/87bjvy50cs.fsf@wibble.ilmari.org
2025-01-23 16:03:48 +09:00
Michael Paquier
858b4db378 Improve TAP tests of pg_basebackup
This addresses some minor issues with the TAP tests of pg_basebackup:
- Remove three duplicated tests used for incorrect option combinations.
- Add more pattern checks for commands doomed to fail, to make sure that
the error generated is the expected one.  These are for tests related to
the tablespace mapping and incorrect option combinations.
- Fix the description of one test for the case of backup target versus
format.

Issues noticed while reviewing this area of the tests.

Discussion: https://postgr.es/m/87bjvy50cs.fsf@wibble.ilmari.org
2025-01-23 15:15:36 +09:00
Amit Kapila
991974bb48 Fix \dRp+ output when describing publications with a lower server version.
The psql was not careful that the new column "Generated columns" won't be
present in the lower version. This was introduced in recent commit
7054186c4e.

Author: Vignesh C
Reviewed-by: Peter Smith
Discussion: https://postgr.es/m/CALDaNm3OcXdY0EzDEKAfaK9gq2B67Mfsgxu93+_249ohyts=0g@mail.gmail.com
2025-01-22 15:27:37 +05:30
Michael Paquier
ce1b0f9da0 Improve grammar of options for command arrays in TAP tests
This commit rewrites a good chunk of the command arrays in TAP tests
with a grammar based on the following rules:
- Fat commas are used between option names and their values, making it
clear to both humans and perltidy that values and names are bound
together.  This is particularly useful for the readability of multi-line
command arrays, and there are plenty of them in the TAP tests.  Most of
the test code is updated to use this style.  Some commands used
parenthesis to show the link, or attached values and options in a single
string.  These are updated to use fat commas instead.
- Option names are switched to use their long names, making them more
self-documented.  Based on a suggestion by Andrew Dunstan.
- Add some trailing commas after the last item in multi-line arrays,
which is a common perl style.

Not all the places are taken care of, but this covers a very good chunk
of them.

Author: Dagfinn Ilmari Mannsåker
Reviewed-by: Michael Paquier, Peter Smith, Euler Taveira
Discussion: https://postgr.es/m/87jzc46d8u.fsf@wibble.ilmari.org
2025-01-22 14:47:13 +09:00
Michael Paquier
be31ac2519 Run perltidy
A follow-up patch will adjust the TAP tests to follow a more-structured
format for option lists in commands, that perltidy is able to cope
better with.  Putting the tree first in a clean state makes the next
change a bit easier.  v20230309 has been used.

Author: Dagfinn Ilmari Mannsåker
Discussion: https://postgr.es/m/87jzc46d8u.fsf@wibble.ilmari.org
2025-01-22 10:15:32 +09:00
Jeff Davis
d3d0983169 Support PG_UNICODE_FAST locale in the builtin collation provider.
The PG_UNICODE_FAST locale uses code point sort order (fast,
memcmp-based) combined with Unicode character semantics. The character
semantics are based on Unicode full case mapping.

Full case mapping can map a single codepoint to multiple codepoints,
such as "ß" uppercasing to "SS". Additionally, it handles
context-sensitive mappings like the "final sigma", and it uses
titlecase mappings such as "Dž" when titlecasing (rather than plain
uppercase mappings).

Importantly, the uppercasing of "ß" as "SS" is specifically mentioned
by the SQL standard. In Postgres, UCS_BASIC uses plain ASCII semantics
for case mapping and pattern matching, so if we changed it to use the
PG_UNICODE_FAST locale, it would offer better compliance with the
standard. For now, though, do not change the behavior of UCS_BASIC.

Discussion: https://postgr.es/m/ddfd67928818f138f51635712529bc5e1d25e4e7.camel@j-davis.com
Discussion: https://postgr.es/m/27bb0e52-801d-4f73-a0a4-02cfdd4a9ada@eisentraut.org
Reviewed-by: Peter Eisentraut, Daniel Verite
2025-01-17 15:56:30 -08:00
Nathan Bossart
6a9b2a631a vacuumdb: Fix comment for vacuum_one_database().
Since commit e0c2933a76, vacuum_one_database() always uses a
catalog query to discover the tables to process, but this comment
still notes the special case for which we used a catalog query
before that commit.  Let's just remove that note.

Also, commit 7781f4e3e7 renamed the "tables" parameter to "objects"
but missed updating this comment.  This commit fixes that as well.
2025-01-17 15:23:14 -06:00
Nathan Bossart
5cda4fdb0b Avoid calling pqsignal() with invalid signals on Windows frontends.
As noted by the comment at the top of port/pqsignal.c, Windows
frontend programs can only use pqsignal() with the 6 signals
required by C.  Most places avoid using invalid signals via #ifndef
WIN32, but initdb and pg_test_fsync check whether the signal itself
is defined, which doesn't work because win32_port.h defines many
extra signals for the signal emulation code.  pg_regress seems to
have missed the memo completely.  These issues aren't causing any
real problems today because nobody checks the return value of
pqsignal(), but a follow-up commit will add some error checking.

To fix, surround all frontend calls to pqsignal() that use signals
that are invalid on Windows with #ifndef WIN32.  We cannot simply
skip defining the extra signals in win32_port.h for frontends
because they are needed in places such as pgkill().

Reviewed-by: Thomas Munro
Discussion: https://postgr.es/m/Z4chOKfnthRH71mw%40nathan
2025-01-16 15:56:39 -06:00