If the client sent a query cancel request with backend PID 0, it
tripped an assertion. With assertions disabled, you got this in the
log instead:
LOG: invalid cancel request with PID 0
LOG: wrong key in cancel request for process 0
Query cancellations don't even require authentication, so we better
tolerate bogus requests. Fix by turning the assertion into a regular
runtime check.
Spotted while testing libpq behavior with a modified server that
didn't send BackendKeyData to the client.
Backpatch-through: 18
For many optional libraries, we extract the -L and -l switches needed
to link the library from a helper program such as llvm-config. In
some cases we put the resulting -L switches into LDFLAGS ahead of
-L switches specified via --with-libraries. That risks breaking
the user's intention for --with-libraries.
It's not such a problem if the library's -L switch points to a
directory containing only that library, but on some platforms a
library helper may "helpfully" offer a switch such as -L/usr/lib
that points to a directory holding all standard libraries. If the
user specified --with-libraries in hopes of overriding the standard
build of some library, the -L/usr/lib switch prevents that from
happening since it will come before the user-specified directory.
To fix, avoid inserting these switches directly into LDFLAGS during
configure, instead adding them to LIBDIRS or SHLIB_LINK. They will
still eventually get added to LDFLAGS, but only after the switches
coming from --with-libraries.
The same problem exists for -I switches: those coming from
--with-includes should appear before any coming from helper programs
such as llvm-config. We have not heard field complaints about this
case, but it seems certain that a user attempting to override a
standard library could have issues.
The changes for this go well beyond configure itself, however,
because many Makefiles have occasion to manipulate CPPFLAGS to
insert locally-desirable -I switches, and some of them got it wrong.
The correct ordering is any -I switches pointing at within-the-
source-tree-or-build-tree directories, then those from the tree-wide
CPPFLAGS, then those from helper programs. There were several places
that risked pulling in a system-supplied copy of libpq headers, for
example, instead of the in-tree files. (Commit cb36f8ec2 fixed one
instance of that a few months ago, but this exercise found more.)
The Meson build scripts may or may not have any comparable problems,
but I'll leave it to someone else to investigate that.
Reported-by: Charles Samborski <demurgos@demurgos.net>
Author: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/70f2155f-27ca-4534-b33d-7750e20633d7@demurgos.net
Backpatch-through: 13
When I prepared 71c0921b6 et al yesterday, I was thinking that the
logic involving explicitly freeing the node_list output was still
needed to dodge leakage bugs in libxml2. But I was misremembering:
we introduced that only because with early 2.13.x releases we could
not trust xmlParseBalancedChunkMemory's result code, so we had to
look to see if a node list was returned or not. There's no reason
to believe that xmlParseBalancedChunkMemory will fail to clean up
the node list when required, so simplify. (This essentially
completes reverting all the non-cosmetic changes in 6082b3d5d.)
Reported-by: Jim Jones <jim.jones@uni-muenster.de>
Author: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/997668.1753802857@sss.pgh.pa.us
Backpatch-through: 13
This mostly reverts commit 6082b3d5d, "Use xmlParseInNodeContext
not xmlParseBalancedChunkMemory". It turns out that
xmlParseInNodeContext will reject text chunks exceeding 10MB, while
(in most libxml2 versions) xmlParseBalancedChunkMemory will not.
The bleeding-edge libxml2 bug that we needed to work around a year
ago is presumably no longer a factor, and the argument that
xmlParseBalancedChunkMemory is semi-deprecated is not enough to
justify a functionality regression. Hence, go back to doing it
the old way.
Reported-by: Michael Paquier <michael@paquier.xyz>
Author: Michael Paquier <michael@paquier.xyz>
Co-authored-by: Erik Wienhold <ewie@ewie.name>
Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/aIGknLuc8b8ega2X@paquier.xyz
Backpatch-through: 13
Commit 473a575e05 purported to make this
function stash the error message in *syncrep_parse_result_p, but
it didn't actually.
As a result, an attempt to set synchronous_standby_names to any value
that does not parse resulted in a generic "parser failed." message
rather than anything more specific. This fixes that.
Discussion: http://postgr.es/m/CA+TgmoYF9wPNZ-Q_EMfib_espgHycY-eX__6Tzo2GpYpVXqCdQ@mail.gmail.com
Backpatch-through: 18
The callback added in fc415edf8c used to check if there is any pending
data to flush for fixed-numbered statistics, done by looping across all
the builtin and custom stats kinds with a call to have_fixed_pending_cb,
is proving to able to show in workloads that do not report any stats
(read-only, no function calls, no WAL, no IO, etc). The code used in
v17 was cheaper than that what HEAD has introduced, relying on three
boolean checks for WAL, SLRU and IO stats.
This commit switches the code to use a more efficient approach than
fc415edf8c, with a single boolean flag that can be switched to "true"
by any fixed-numbered stats kinds to force pgstat_report_stat() to go
through one round of reports. The flag is reset by pgstat_report_stat()
once a full round of reports is done. The flag being false means that
fixed-numbered stats kinds saw no activity, and that there is no pending
data to flush.
ac000fca74 took one step in improving the performance by reducing the
number of stats kinds that the backend can hold. This commit takes a
more drastic step by bringing back the code efficiency to what it was
before v18 with a cheap check at the beginning of pgstat_report_stat()
for its fast-exit path.
The callback have_static_pending_cb is removed as an effect of all that.
Reported-by: Andres Freund <andres@anarazel.de>
Reviewed-by: Bertrand Drouvot <bertranddrouvot.pg@gmail.com>
Discussion: https://postgr.es/m/eb224uegsga2hgq7dfq3ps5cduhpqej7ir2hjxzzozjthrekx5@dysei6buqthe
Backpatch-through: 18
If the number of sync requests is big enough, the palloc() call in
AbsorbSyncRequests() will attempt to allocate more than 1 GB of memory,
resulting in failure. This can lead to an infinite loop in the checkpointer
process, as it repeatedly fails to absorb the pending requests.
This commit limits the checkpointer requests queue size to 10M items. In
addition to preventing the palloc() failure, this change helps to avoid long
queue processing time.
Also, this commit is for backpathing only. The master branch receives
a more invasive yet comprehensive fix for this problem.
Discussion: https://postgr.es/m/db4534f83a22a29ab5ee2566ad86ca92%40postgrespro.ru
Backpatch-through: 13
Previously, if a background worker crashed (e.g., due to a SIGKILL) and
the server restarted due to restart_after_crash being enabled,
the worker was not restarted as expected. Background workers without
the never-restart flag should automatically restart in this case.
This issue was introduced in commit 28a520c0b7, which failed to reset
the rw_pid field in the RegisteredBgWorker struct for the crashed worker.
This commit fixes the problem by resetting rw_pid for all eligible
background workers during the crash-and-restart cycle.
Back-patched to v18, where the bug was introduced.
Bug fix patches were proposed by Andrey Rudometov and ChangAo Chen,
but this commit uses a different approach.
Reported-by: Andrey Rudometov <unlimitedhikari@gmail.com>
Reported-by: ChangAo Chen <cca5507@qq.com>
Author: Andrey Rudometov <unlimitedhikari@gmail.com>
Author: ChangAo Chen <cca5507@qq.com>
Co-authored-by: Fujii Masao <masao.fujii@gmail.com>
Reviewed-by: ChangAo Chen <cca5507@qq.com>
Reviewed-by: Shveta Malik <shveta.malik@gmail.com>
Discussion: https://postgr.es/m/CAF6JsWiO=i24qYitWe6ns1sXqcL86rYxdyU+pNYk-WueKPSySg@mail.gmail.com
Discussion: https://postgr.es/m/tencent_E00A056B3953EE6440F0F40F80EC30427D09@qq.com
Backpatch-through: 18
LatchWaitSetPostmasterDeathPos, the latch event position for the
postmaster death event, is initialized under IsUnderPostmaster.
WaitLatch() considered it as a valid wait target in single-user mode
(!IsUnderPostmaster), which was incorrect.
One code path found to fail with an assertion failure is a database drop
in single-user mode while waiting in WaitForProcSignalBarrier() after
the drop.
Oversight in commit 84e5b2f07a.
Author: Patrick Stählin <me@packi.ch>
Co-authored-by: Ronan Dunklau <ronan.dunklau@aiven.io>
Discussion: https://postgr.es/m/18996-3a2744c8140488de@postgresql.org
Backpatch-through: 18
This commit changes stats kinds to have the following bounds, making
their handling in core cheaper by default:
- PGSTAT_KIND_CUSTOM_MIN 128 -> 24
- PGSTAT_KIND_MAX 256 -> 32
The original numbers were rather high, and showed an impact on
performance in pgstat_report_stat() for the case of simple queries with
its early-exit path if there are no pending statistics to flush. This
logic will be improved more in a follow-up commit to bring the
performance of pgstat_report_stat() on par with v17 and older versions.
Lowering the bounds is a change worth doing on its own, independently of
the other improvement.
These new numbers should be enough to leave some room for the following
years for built-in and custom stats kinds, with stable ID numbers. At
least that should be enough to start with this facility for extension
developers. It can be always increased in the tree depending on the
requirements wanted.
Per discussion with Andres Freund and Bertrand Drouvot.
Discussion: https://postgr.es/m/eb224uegsga2hgq7dfq3ps5cduhpqej7ir2hjxzzozjthrekx5@dysei6buqthe
Backpatch-through: 18
Previously, the tool could replay the same transaction twice, once during
recovery, then again during replication after the subscriber was set up.
This occurred because the same recovery_target_lsn was used both to
finalize recovery and to start replication. If
recovery_target_inclusive = true, the transaction at that LSN would be
applied during recovery and then sent again by the publisher leading to
duplication.
To prevent this, we now set recovery_target_inclusive = false. This
ensures the transaction at recovery_target_lsn is not reapplied during
recovery, avoiding duplication when replication begins.
Bug #18897
Reported-by: Zane Duffield <duffieldzane@gmail.com>
Author: Shlok Kyal <shlok.kyal.oss@gmail.com>
Reviewed-by: vignesh C <vignesh21@gmail.com>
Reviewed-by: Amit Kapila <amit.kapila16@gmail.com>
Backpatch-through: 17, where it was introduced
Discussion: https://postgr.es/m/18897-d3db67535860dddb@postgresql.org
Solaris has never bothered to add "const" to the second argument
of PAM conversation procs, as all other Unixen did decades ago.
This resulted in an "incompatible pointer" compiler warning when
building --with-pam, but had no more serious effect than that,
so we never did anything about it. However, as of GCC 14 the
case is an error not warning by default.
To complicate matters, recent OpenIndiana (and maybe illumos
in general?) *does* supply the "const" by default, so we can't
just assume that platforms using our solaris template need help.
What we can do, short of building a configure-time probe,
is to make solaris.h #define _PAM_LEGACY_NONCONST, which
causes OpenIndiana's pam_appl.h to revert to the traditional
definition, and hopefully will have no effect anywhere else.
Then we can use that same symbol to control whether we include
"const" in the declaration of pam_passwd_conv_proc().
Bug: #18995
Reported-by: Andrew Watkins <awatkins1966@gmail.com>
Author: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/18995-82058da9ab4337a7@postgresql.org
Backpatch-through: 13
This index AM callback has been introduced in c1ec02be1d and it is
optional, currently only being used by BRIN. Optional callbacks are
documented with NULL as possible value in amapi.h and indexam.sgml, but
this callback has missed this part of the description.
Reported-by: Peter Smith <smithpb2250@gmail.com>
Reviewed-by: Japin Li <japinli@hotmail.com>
Discussion: https://postgr.es/m/CAHut+PvgYcPmPDi1YdHMJY5upnyGRpc0N8pk1xNB11xDSBwNog@mail.gmail.com
Backpatch-through: 17
ECPGconnect() caches established connections to the server, supporting
the case of a NULL connection name when a database name is not specified
by its caller.
A follow-up call to ECPGget_PGconn() to get an established connection
from the cached set with a non-NULL name could cause a NULL pointer
dereference if a NULL connection was listed in the cache and checked for
a match. At least two connections are necessary to reproduce the issue:
one with a NULL name and one with a non-NULL name.
Author: Aleksander Alekseev <aleksander@tigerdata.com>
Discussion: https://postgr.es/m/CAJ7c6TNvFTPUTZQuNAoqgzaSGz-iM4XR61D7vEj5PsQXwg2RyA@mail.gmail.com
Backpatch-through: 13
Commit e5da0fe3c2 introduced catalog entries for not-null constraints
on domains; but because commit b0e96f3119 (the original work for
catalogued not-null constraints on tables) forgot to teach pg_dump to
process the comments for them, this one also forgot. Add that now.
We also need to teach repairDependencyLoop() about the new type of
constraints being possible for domains.
Backpatch-through: 17
Co-authored-by: jian he <jian.universality@gmail.com>
Co-authored-by: Álvaro Herrera <alvherre@kurilemu.de>
Reported-by: jian he <jian.universality@gmail.com>
Discussion: https://postgr.es/m/CACJufxF-0bqVR=j4jonS6N2Ka6hHUpFyu3_3TWKNhOW_4yFSSg@mail.gmail.com
This commit is only for HEAD and v18, where the test has been removed.
It also incorporates improvements below to stability and coverage of the
original test, which were already backpatched to v17.
- Add one pg_logical_emit_message() call to force the creation of a record
that spawns across two pages.
- Make the logic wait for the checkpoint completion.
Author: Alexander Korotkov <akorotkov@postgresql.org>
Co-authored-by: Hayato Kuroda <kuroda.hayato@fujitsu.com>
Reviewed-by: Michael Paquier <michael@paquier.xyz>
Backpatch-through: 18
Currently, the comments in 047_checkpoint_physical_slot. It shows an
incomplete intention to wait for checkpoint completion before performing
an immediate database stop. However, an immediate node stop can occur both
before and after checkpoint completion. Both cases should work correctly.
But we would like the test to be more stable and deterministic. This is why
this commit makes this test explicitly wait for the checkpoint completion
log message.
Discussion: https://postgr.es/m/CAPpHfdurV-j_e0pb%3DUFENAy3tyzxfF%2ByHveNDNQk2gM82WBU5A%40mail.gmail.com
Discussion: https://postgr.es/m/aHXLep3OaX_vRTNQ%40paquier.xyz
Author: Alexander Korotkov <akorotkov@postgresql.org>
Reviewed-by: Michael Paquier <michael@paquier.xyz>
Backpatch-through: 17
If a crash occurs while writing a WAL record that spans multiple pages, the
recovery process marks the page with the XLP_FIRST_IS_OVERWRITE_CONTRECORD
flag. However, logical decoding currently attempts to read the full WAL
record based on its expected size before checking this flag, which can lead
to an infinite wait if the remaining data is never written (e.g., no activity
after crash).
This patch updates the logic first to read the page header and check for
the XLP_FIRST_IS_OVERWRITE_CONTRECORD flag before attempting to reconstruct
the full WAL record. If the flag is set, decoding correctly identifies
the record as incomplete and avoids waiting for WAL data that will never
arrive.
Discussion: https://postgr.es/m/CAAKRu_ZCOzQpEumLFgG_%2Biw3FTa%2BhJ4SRpxzaQBYxxM_ZAzWcA%40mail.gmail.com
Discussion: https://postgr.es/m/CALDaNm34m36PDHzsU_GdcNXU0gLTfFY5rzh9GSQv%3Dw6B%2BQVNRQ%40mail.gmail.com
Author: Vignesh C <vignesh21@gmail.com>
Reviewed-by: Hayato Kuroda <kuroda.hayato@fujitsu.com>
Reviewed-by: Dilip Kumar <dilipbalaut@gmail.com>
Reviewed-by: Michael Paquier <michael@paquier.xyz>
Reviewed-by: Alexander Korotkov <aekorotkov@gmail.com>
Backpatch-through: 13
If a MERGE inside a CTE attempts an UPDATE or DELETE on a table with
BEFORE ROW triggers, and a concurrent UPDATE or DELETE happens, the
merge code would fail (crashing in the case of an UPDATE action, and
potentially executing the wrong action for a DELETE action).
This is the same issue that 9321c79c86 attempted to fix, except now
for a MERGE inside a CTE. As noted in 9321c79c86, what needs to happen
is for the trigger code to exit early, returning the TM_Result and
TM_FailureData information to the merge code, if a concurrent
modification is detected, rather than attempting to do an EPQ
recheck. The merge code will then do its own rechecking, and rescan
the action list, potentially executing a different action in light of
the concurrent update. In particular, the trigger code must never call
ExecGetUpdateNewTuple() for MERGE, since that is bound to fail because
MERGE has its own per-action projection information.
Commit 9321c79c86 did this using estate->es_plannedstmt->commandType
in the trigger code to detect that a MERGE was being executed, which
is fine for a plain MERGE command, but does not work for a MERGE
inside a CTE. Fix by passing that information to the trigger code as
an additional parameter passed to ExecBRUpdateTriggers() and
ExecBRDeleteTriggers().
Back-patch as far as v17 only, since MERGE cannot appear inside a CTE
prior to that. Additionally, take care to preserve the trigger ABI in
v17 (though not in v18, which is still in beta).
Bug: #18986
Reported-by: Yaroslav Syrytsia <me@ys.lc>
Author: Dean Rasheed <dean.a.rasheed@gmail.com>
Reviewed-by: Michael Paquier <michael@paquier.xyz>
Discussion: https://postgr.es/m/18986-e7a8aac3d339fa47@postgresql.org
Backpatch-through: 17
This is the documented behavior, and it worked that way before
v10. However, addition of the connhost[] array created cases
where conn->connhost[conn->whichhost].port is NULL. The rest
of libpq is careful to substitute DEF_PGPORT[_STR] for a null
or empty port string, but we failed to do so here, leading to
possibly returning NULL. As of v18 that causes psql's \conninfo
command to segfault. Older psql versions avoid that, but it's
pretty likely that other clients have trouble with this,
so we'd better back-patch the fix.
In stable branches, just revert to our historical behavior of
returning an empty string when there was no user-given port
specification. However, it seems substantially more useful and
indeed more correct to hand back DEF_PGPORT_STR in such cases,
so let's make v18 and master do that.
Author: Daniele Varrazzo <daniele.varrazzo@gmail.com>
Reviewed-by: Laurenz Albe <laurenz.albe@cybertec.at>
Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/CA+mi_8YTS8WPZPO0PAb2aaGLwHuQ0DEQRF0ZMnvWss4y9FwDYQ@mail.gmail.com
Backpatch-through: 13
We have an assertion to ensure that a command tag has been assigned by
the time we're done executing, but if we happen to execute a command
with no queries, the assertion would fail. Per discussion, rather than
contort things to get a tag assigned, just remove the assertion.
Oversight in 2f9661311b. That commit also retained a comment that
explained logic that had been adjacent to it but diffused into various
places, leaving none apt to keep part of the comment. Remove that part,
and rewrite what remains for extra clarity.
Bug: #18984
Backpatch-through: 13
Reported-by: Aleksander Alekseev <aleksander@tigerdata.com>
Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Reviewed-by: Michaël Paquier <michael@paquier.xyz>
Discussion: https://postgr.es/m/18984-0f4778a6599ac3ae@postgresql.org
The terms used in wait_event_names.txt and lwlock.c were inconsistent
for MultiXactOffsetSLRU and MultiXactMemberSLRU, which could cause joins
between pg_wait_events and pg_stat_activity to fail. lwlock.c is
adjusted in this commit to what the historical name of the event has
always been, and what is documented.
Oversight in 53c2a97a92. 08b9b9e043 has fixed a similar
inconsistency some time ago.
Author: Bertrand Drouvot <bertranddrouvot.pg@gmail.com>
Reviewed-by: Álvaro Herrera <alvherre@kurilemu.de>
Discussion: https://postgr.es/m/aHdxN0D0hKXzHFQG@ip-10-97-1-34.eu-west-3.compute.internal
Backpatch-through: 17
We skip dumping constraints together with domains if they are invalid
('separate') so that they appear after data -- but their comments were
dumped together with the domain definition, which in effect leads to the
comment being dumped when the constraint does not yet exist. Delay
them in the same way.
Oversight in 7eca575d1c28; backpatch all the way back.
Author: jian he <jian.universality@gmail.com>
Discussion: https://postgr.es/m/CACJufxF_C2pe6J_+nPr6C5jf5rQnbYP8XOKr4HM8yHZtp2aQqQ@mail.gmail.com
Previously, pg_dumpall would still dump global objects such as roles
and tablespaces even when --statistics-only or --no-schema was specified.
Since these global objects are treated as schema-level data, they should
be skipped in these cases.
This commit fixes the issue by ensuring that global objects are not
dumped when either --statistics-only or --no-schema is used.
Author: Fujii Masao <masao.fujii@oss.nttdata.com>
Reviewed-by: Corey Huinker <corey.huinker@gmail.com>
Discussion: https://postgr.es/m/08129593-6f3c-4fb9-94b7-5aa2eefb99b0@oss.nttdata.com
Backpatch-through: 18
This adjusts the wording to match the changes in commits
5987553fde, a233a603ba, and pgweb commit 2d764dbc08.
Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Reviewed-by: Daniel Gustafsson <daniel@yesql.se>
Discussion: https://postgr.es/m/aHVo791guQR6uqwT%40nathan
Backpatch-through: 13
The grammar was a little shaky and confusing here, so word-smith it
a bit. Also, adjust the comments in pg_ident.conf.sample to use the
same terminology as the SGML docs, in particular "DATABASE-USERNAME"
not "PG-USERNAME".
Back-patch appropriate subsets. I did not risk changing
pg_ident.conf.sample in released branches, but it still seems OK
to change it in v18.
Reported-by: Alexey Shishkin <alexey.shishkin@enterprisedb.com>
Author: Tom Lane <tgl@sss.pgh.pa.us>
Reviewed-by: David G. Johnston <david.g.johnston@gmail.com>
Discussion: https://postgr.es/m/175206279327.3157504.12519088928605422253@wrigleys.postgresql.org
Backpatch-through: 13
Because not every path through JsonbIteratorNext() sets val->type,
some compilers complain that compareJsonbContainers() is comparing
possibly-uninitialized values. The paths that don't set it return
WJB_DONE, WJB_END_ARRAY, or WJB_END_OBJECT, so it's clear by
manual inspection that the "(ra == rb)" code path is safe, and
indeed we aren't seeing warnings about that. But the (ra != rb)
case is much less obviously safe. In Assert-enabled builds it
seems that the asserts rejecting WJB_END_ARRAY and WJB_END_OBJECT
persuade gcc 15.x not to warn, which makes little sense because
it's impossible to believe that the compiler can prove of its
own accord that ra/rb aren't WJB_DONE here. (In fact they never
will be, so the code isn't wrong, but why is there no warning?)
Without Asserts, the appearance of warnings is quite unsurprising.
We discussed fixing this by converting those two Asserts into
pg_assume, but that seems not very satisfactory when it's so unclear
why the compiler is or isn't warning: the warning could easily
reappear with some other compiler version. Let's fix it in a less
magical, more future-proof way by changing JsonbIteratorNext()
so that it always does set val->type. The cost of that should be
pretty negligible, and it makes the function's API spec less squishy.
Reported-by: Erik Rijkers <er@xs4all.nl>
Author: Tom Lane <tgl@sss.pgh.pa.us>
Reviewed-by: Andres Freund <andres@anarazel.de>
Discussion: https://postgr.es/m/988bf1bc-3f1f-99f3-bf98-222f1cd9dc5e@xs4all.nl
Discussion: https://postgr.es/m/0c623e8a204187b87b4736792398eaf1@postgrespro.ru
Backpatch-through: 13
method_worker.c installed SignalHandlerForConfigReload, but it failed to
actually process reload requests. That hasn't yet produced any concrete
problem reports in terms of GUC changes it should have cared about in
v18, but it was inconsistent.
It did cause problems for a couple of patches in development that need
IO workers to react to ALTER SYSTEM + pg_reload_conf(). Fix extracted
from one of those patches.
Back-patch to 18.
Reported-by: Dmitry Dolgov <9erthalion6@gmail.com>
Discussion: https://postgr.es/m/sh5uqe4a4aqo5zkkpfy5fobe2rg2zzouctdjz7kou4t74c66ql%40yzpkxb7pgoxf
In an ancient ancestor of this code, the postmaster assigned IDs to IO
workers. Now it tracks them in an unordered array and doesn't know
their IDs, so it might be confusing to readers that it still referred to
their indexes as IDs.
No change in behavior, just variable name and error message cleanup.
Back-patch to 18.
Discussion: https://postgr.es/m/CA%2BhUKG%2BwbaZZ9Nwc_bTopm4f-7vDmCwLk80uKDHj9mq%2BUp0E%2Bg%40mail.gmail.com
Adopt PgAioXXX convention for pgaio module type names. Rename a
function that didn't use a pgaio_worker_ submodule prefix. Rename the
internal submit function's arguments to match the indirectly relevant
function pointer declaration and nearby examples. Rename the array of
handle IDs in PgAioSubmissionQueue to sqes, a term of art seen in the
systems it emulates, also clarifying that they're not IO handle
pointers as the old name might imply.
No change in behavior, just type, variable and function name cleanup.
Back-patch to 18.
Discussion: https://postgr.es/m/CA%2BhUKG%2BwbaZZ9Nwc_bTopm4f-7vDmCwLk80uKDHj9mq%2BUp0E%2Bg%40mail.gmail.com
getid() and putid(), which parse and deparse role names within ACL
input/output, applied isalnum() to see if a character within a role
name requires quoting. They did this even for non-ASCII characters,
which is problematic because the results would depend on encoding,
locale, and perhaps even platform. So it's possible that putid()
could elect not to quote some string that, later in some other
environment, getid() will decide is not a valid identifier, causing
dump/reload or similar failures.
To fix this in a way that won't risk interoperability problems
with unpatched versions, make getid() treat any non-ASCII as a
legitimate identifier character (hence not requiring quotes),
while making putid() treat any non-ASCII as requiring quoting.
We could remove the resulting excess quoting once we feel that
no unpatched servers remain in the wild, but that'll be years.
A lesser problem is that getid() did the wrong thing with an input
consisting of just two double quotes (""). That has to represent an
empty string, but getid() read it as a single double quote instead.
The case cannot arise in the normal course of events, since we don't
allow empty-string role names. But let's fix it while we're here.
Although we've not heard field reports of problems with non-ASCII
role names, there's clearly a hazard there, so back-patch to all
supported versions.
Reported-by: Peter Eisentraut <peter@eisentraut.org>
Author: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/3792884.1751492172@sss.pgh.pa.us
Backpatch-through: 13
Previously, the check_hook functions for max_slot_wal_keep_size and
idle_replication_slot_timeout would incorrectly raise an ERROR for values
set in postgresql.conf during upgrade, even though those values were not
actively used in the upgrade process.
To prevent logical slot invalidation during upgrade, we used to set
special values for these GUCs. Now, instead of relying on those values, we
directly prevent WAL removal and logical slot invalidation caused by
max_slot_wal_keep_size and idle_replication_slot_timeout.
Note: PostgreSQL 17 does not include the idle_replication_slot_timeout
GUC, so related changes were not backported.
BUG #18979
Reported-by: jorsol <jorsol@gmail.com>
Author: Dilip Kumar <dilipbalaut@gmail.com>
Reviewed by: vignesh C <vignesh21@gmail.com>
Reviewed by: Alvaro Herrera <alvherre@alvh.no-ip.org>
Backpatch-through: 17, where it was introduced
Discussion: https://postgr.es/m/219561.1751826409@sss.pgh.pa.us
Discussion: https://postgr.es/m/18979-a1b7fdbb7cd181c6@postgresql.org
This commit updates the documentation to clarify that "idle" in
idle_replication_slot_timeout means the replication slot is inactive,
that is, not currently used by any replication connection.
Without this clarification, "idle" could be misinterpreted to mean
that the slot is not advancing or that no data is being streamed,
even if a connection exists.
Back-patch to v18 where idle_replication_slot_timeout was added.
Author: Laurenz Albe <laurenz.albe@cybertec.at>
Reviewed-by: David G. Johnston <david.g.johnston@gmail.com>
Reviewed-by: Gunnar Morling <gunnar.morling@googlemail.com>
Reviewed-by: Amit Kapila <amit.kapila16@gmail.com>
Reviewed-by: Fujii Masao <masao.fujii@gmail.com>
Discussion: https://postgr.es/m/CADGJaX_0+FTguWpNSpgVWYQP_7MhoO0D8=cp4XozSQgaZ40Odw@mail.gmail.com
Backpatch-through: 18
Previously, the idle_replication_slot_timeout parameter used minutes
as its unit, based on the assumption that values would typically exceed
one minute in production environments. However, this caused unexpected
behavior: specifying a value below 30 seconds would round down to 0,
effectively disabling the timeout. This could be surprising to users.
To allow finer-grained control and avoid such confusion, this commit changes
the unit of idle_replication_slot_timeout to seconds. Larger values can
still be specified easily using standard time suffixes, for example,
'24h' for 24 hours.
Back-patch to v18 where idle_replication_slot_timeout was added.
Reported-by: Gunnar Morling <gunnar.morling@googlemail.com>
Author: Fujii Masao <masao.fujii@gmail.com>
Reviewed-by: Laurenz Albe <laurenz.albe@cybertec.at>
Reviewed-by: David G. Johnston <david.g.johnston@gmail.com>
Reviewed-by: Amit Kapila <amit.kapila16@gmail.com>
Reviewed-by: Hayato Kuroda <kuroda.hayato@fujitsu.com>
Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/CADGJaX_0+FTguWpNSpgVWYQP_7MhoO0D8=cp4XozSQgaZ40Odw@mail.gmail.com
Backpatch-through: 18
When sslkeylogfile has been set but the file fails to open in an
otherwise successful connection, the log entry added to the conn
object is never printed. Instead print the error on stderr for
increased visibility. This is a debugging tool so using stderr
for logging is appropriate. Also while there, remove the umask
call in the callback as it's not useful.
Issues noted by Peter Eisentraut in post-commit review, backpatch
down to 18 when support for sslkeylogfile was added
Author: Daniel Gustafsson <daniel@yesql.se>
Reported-by: Peter Eisentraut <peter@eisentraut.org>
Reviewed-by: Peter Eisentraut <peter@eisentraut.org>
Discussion: https://postgr.es/m/70450bee-cfaa-48ce-8980-fc7efcfebb03@eisentraut.org
Backpatch-through: 18
Commit a45c78e328 moved large object metadata from SECTION_PRE_DATA
to SECTION_DATA but neglected to move PRIO_LARGE_OBJECT in
dbObjectTypePriorities accordingly. While this hasn't produced any
known live bugs, it causes problems for a proposed patch that
optimizes upgrades with many large objects. Fixing the priority
might also make the topological sort step marginally faster by
reducing the number of ordering violations that have to be fixed.
Reviewed-by: Nitin Motiani <nitinmotiani@google.com>
Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/aBkQLSkx1zUJ-LwJ%40nathan
Discussion: https://postgr.es/m/aG_5DBCjdDX6KAoD%40nathan
Backpatch-through: 17
Since b0635bfda, libpq uses dlopen() and related functions. On some
platforms these are not supplied by libc, but by a separate library
libdl, in which case we need to make sure that that dependency is
known to the linker. Meson seems to take care of that automatically,
but the Makefile didn't cater for it.
Author: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/1328170.1752082586@sss.pgh.pa.us
Backpatch-through: 18
Commit c273d9d8ce reworked tab-completion of COPY and \copy in psql
and added support for completing options within WITH clauses. However,
the same COPY options were suggested for both COPY TO and COPY FROM
commands, even though some options are only valid for one or the
other.
This commit separates the COPY options for COPY FROM and COPY TO
commands to provide more accurate auto-completion suggestions.
Back-patch to v14 where tab-completion for COPY and \copy options
within WITH clauses was first supported.
Author: Atsushi Torikoshi <torikoshia@oss.nttdata.com>
Reviewed-by: Yugo Nagata <nagata@sraoss.co.jp>
Discussion: https://postgr.es/m/079e7a2c801f252ae8d522b772790ed7@oss.nttdata.com
Backpatch-through: 14
This routine has been introduced as a shortcut to be able to retrieve a
service name from an active connection, for psql. Per discussion, and
as it is only used by psql, let's remove it to not clutter the libpq API
more than necessary.
The logic in psql is replaced by lookups of PQconninfoOption for the
active connection, instead, updated each time the variables are synced
by psql, the prompt shortcut relying on the variable synced.
Reported-by: Noah Misch <noah@leadboat.com>
Discussion: https://postgr.es/m/20250706161319.c1.nmisch@google.com
Backpatch-through: 18
xmltotext_with_options() did not consider the possibility that
pg_xml_init() could fail --- most likely due to OOM. If that
happened, the already-parsed xmlDoc structure would be leaked.
Oversight in commit 483bdb2af.
Bug: #18981
Author: Dmitry Kovalenko <d.kovalenko@postgrespro.ru>
Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/18981-9bc3c80f107ae925@postgresql.org
Backpatch-through: 16
Per buildfarm member culicidae, the query checking for stats reported by
the WAL summarizer related to WAL reads is proving to be unstable.
Instead of a one-time query, this commit replaces the logic with a
polling query checking for the WAL read stats, making the test more
reliable on machines that could be slow with the stats reports.
This test has been introduced in f4694e0f35, so backpatch down to v18.
Reported-by: Alexander Lakhin <exclusion@gmail.com>
Reviewed-by: Alexander Lakhin <exclusion@gmail.com>
Discussion: https://postgr.es/m/f35ba3db-fca7-4693-bc35-6db64488e4b1@gmail.com
Backpatch-through: 18
By default io_uring creates a shared memory mapping for each io_uring
instance, leading to a large number of memory mappings. Unfortunately a large
number of memory mappings slows things down, backend exit is particularly
affected. To address that, newer kernels (6.5) support using user-provided
memory for the memory. By putting the relevant memory into shared memory we
don't need any additional mappings.
On a system with a new enough kernel and liburing, there is no discernible
overhead when doing a pgbench -S -C anymore.
Reported-by: MARK CALLAGHAN <mdcallag@gmail.com>
Reviewed-by: "Burd, Greg" <greg@burd.me>
Reviewed-by: Jim Nasby <jnasby@upgrade.com>
Discussion: https://postgr.es/m/CAFbpF8OA44_UG+RYJcWH9WjF7E3GA6gka3gvH6nsrSnEe9H0NA@mail.gmail.com
Backpatch-through: 18
pl/pgsql's notion of an "expression" is very broad, encompassing
any SQL SELECT query that returns a single column and no more than
one row. So there are cases, for example evaluation of an aggregate
function, where the query involves significant work and it'd be useful
to run it with parallel workers. This used to be possible, but
commits 3eea7a0c9 et al unintentionally disabled it.
The simplest fix is to make exec_eval_expr() pass maxtuples = 0
rather than 2 to exec_run_select(). This avoids the new rule that
we will never use parallelism when a nonzero "count" limit is passed
to ExecutorRun(). (Note that the pre-3eea7a0c9 behavior was indeed
unsafe, so reverting that rule is not in the cards.) The reason
for passing 2 before was that exec_eval_expr() will throw an error
if it gets more than one returned row, so we figured that as soon
as we have two rows we know that will happen and we might as well
stop running the query. That choice was cost-free when it was made;
but disabling parallelism is far from cost-free, so now passing 2
amounts to optimizing a failure case at the expense of useful cases.
An expression query that can return more than one row is certainly
broken. People might now need to wait a bit longer to discover such
breakage; but hopefully few will use enormously expensive cases as
their first test of new pl/pgsql logic.
Author: Dipesh Dhameliya <dipeshdhameliya125@gmail.com>
Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/CABgZEgdfbnq9t6xXJnmXbChNTcWFjeM_6nuig41tm327gYi2ig@mail.gmail.com
Backpatch-through: 13
With tables defined like this,
CREATE TABLE ip (id int PRIMARY KEY);
CREATE TABLE ic (id int) INHERITS (ip);
ALTER TABLE ic ALTER id DROP NOT NULL;
pg_upgrade fails during the schema restore phase due to this error:
ERROR: column "id" in child table must be marked NOT NULL
This can only be fixed by marking the child column as NOT NULL before
the upgrade, which could take an arbitrary amount of time (because ic's
data must be scanned). Have pg_upgrade's check mode warn if that
condition is found, so that users know what to adjust before running the
upgrade for real.
Author: Ali Akbar <the.apaan@gmail.com>
Reviewed-by: Justin Pryzby <pryzby@telsasoft.com>
Backpatch-through: 13
Discussion: https://postgr.es/m/CACQjQLoMsE+1pyLe98pi0KvPG2jQQ94LWJ+PTiLgVRK4B=i_jg@mail.gmail.com