This adjusts cast functions from jsonb to other types to support soft
errors. This just involves some refactoring of the underlying helper
functions to use ereturn.
This is in preparation for a future feature where conversion errors in
casts can be caught.
Author: jian he <jian.universality@gmail.com>
Reviewed-by: Amul Sul <sulamul@gmail.com>
Reviewed-by: Corey Huinker <corey.huinker@gmail.com>
Discussion: https://www.postgresql.org/message-id/flat/CADkLM%3Dfv1JfY4Ufa-jcwwNbjQixNViskQ8jZu3Tz_p656i_4hQ%40mail.gmail.com
When a backend attempts to start a read IO and finds the first buffer already
has I/O in progress, previously it waited for that I/O to complete before
initiating reads for any of the subsequent buffers.
Although it must wait for the I/O to finish when acquiring the buffer, there's
no reason for it to wait when setting up the read operation. Waiting at this
point prevents starting I/O on subsequent buffers and can significantly reduce
concurrency.
This matters in two workloads:
1) When multiple backends scan the same relation concurrently.
2) When a single backend requests the same block multiple times within the
readahead distance.
Waiting each time an in-progress read is encountered effectively degenerates
the access pattern into synchronous I/O.
To fix this, when encountering an already in-progress IO for the head buffer,
the wait reference is now recorded and waiting is deferred until
WaitReadBuffers(), when the buffer actually needs to be acquired.
In rare cases, a backend may still need to wait synchronously at IO
start time: If another backend has set BM_IO_IN_PROGRESS on the buffer
but has not yet set the wait reference. Such windows should be brief and
uncommon.
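
As a rough illustration of the deferred wait described above, the following Python sketch models a multi-block read in which another backend is already reading the first block. The Buffer class, start_reads() and wait_reads() are hypothetical names invented for this example and are not the bufmgr API; the sketch only counts where waiting would occur.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class Buffer:
    """Hypothetical stand-in for one buffer in a multi-block read."""
    block: int
    io_in_progress: bool                      # another backend is reading it
    wait_ref: Optional[Tuple[str, int]] = None
    read_started: bool = False


def start_reads(buffers, defer_wait):
    """Set up reads; return how many times we blocked while setting up."""
    waits_at_start = 0
    for buf in buffers:
        if buf.io_in_progress:
            if defer_wait:
                # New behavior: remember whom to wait for and keep going.
                buf.wait_ref = ("io", buf.block)
            else:
                # Old behavior: block here before starting later reads.
                waits_at_start += 1
        else:
            buf.read_started = True
    return waits_at_start


def wait_reads(buffers):
    """Acquire all buffers; return how many deferred waits happen here."""
    return sum(1 for buf in buffers if buf.wait_ref is not None)
```

With defer_wait=True, start_reads() never blocks and the single wait is paid in wait_reads(), after reads for the remaining blocks have been started.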
Author: Melanie Plageman <melanieplageman@gmail.com>
Author: Andres Freund <andres@anarazel.de>
Reviewed-by: Andres Freund <andres@anarazel.de>
Reviewed-by: Melanie Plageman <melanieplageman@gmail.com>
Reviewed-by: Nazir Bilal Yavuz <byavuz81@gmail.com>
Discussion: https://postgr.es/m/flat/zljergweqti7x67lg5ije2rzjusie37nslsnkjkkby4laqqbfw%403p3zu522yykv
Until now StartBufferIO() had a few weaknesses:
- As it did not submit staged IOs, it was not safe to call StartBufferIO()
where there was a potential for unsubmitted IO, which required
AsyncReadBuffers() to use a wrapper (ReadBuffersCanStartIO()) around
StartBufferIO().
- With nowait = true, the boolean return value could not distinguish
between no IO being necessary and having to wait, which would lead
ReadBuffersCanStartIO() to unnecessarily submit staged IO.
- Several callers needed to handle both local and shared buffers, requiring
the caller to differentiate between StartBufferIO() and StartLocalBufferIO()
- In a future commit some callers of StartBufferIO() want the BufferDesc's
io_wref to be returned, to asynchronously wait for in-progress IO
- Indicating whether to wait with the nowait parameter was somewhat confusing
compared to a wait parameter
Address these issues as follows:
- StartBufferIO() is renamed to StartSharedBufferIO()
- A new StartBufferIO() is introduced that supports both shared and local
buffers
- The boolean return value has been replaced with an enum, indicating whether
the IO is already done, already in progress or that the buffer has been
readied for IO
- A new PgAioWaitRef * argument allows the caller to get the wait reference if
desired. All current callers pass NULL; a user of this will be introduced
subsequently
- Instead of the nowait argument there now is wait
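
A minimal Python sketch of the three-way result, to show why an enum is more informative than a boolean here. The enum member names, the dict-based buffer and the start_buffer_io() signature are assumptions made for illustration; they do not reproduce the actual C declarations.

```python
from enum import Enum, auto


class StartIOResult(Enum):
    """Hypothetical names for the three outcomes listed above."""
    ALREADY_DONE = auto()   # buffer is valid, no IO necessary
    IN_PROGRESS = auto()    # someone else's IO is still running
    READY_FOR_IO = auto()   # caller now owns the IO


def start_buffer_io(buf, wait, wait_ref_out=None):
    """Toy model; buf is a dict with 'valid', 'io_in_progress', 'io_wref'."""
    if buf["valid"]:
        return StartIOResult.ALREADY_DONE
    if buf["io_in_progress"]:
        if not wait:
            # Hand back the wait reference only if the caller asked for it.
            if wait_ref_out is not None:
                wait_ref_out.append(buf["io_wref"])
            return StartIOResult.IN_PROGRESS
        # Assumption: simulate the other backend's IO finishing while we wait.
        buf["io_in_progress"] = False
        buf["valid"] = True
        return StartIOResult.ALREADY_DONE
    buf["io_in_progress"] = True
    return StartIOResult.READY_FOR_IO
```

A caller passing wait=False can now tell "nothing to do" apart from "would have to wait", which a single boolean could not express.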
This probably would not have been worthwhile on its own, but since all these
lines needed to be touched anyway...
Author: Andres Freund <andres@anarazel.de>
Author: Melanie Plageman <melanieplageman@gmail.com>
Discussion: https://postgr.es/m/zljergweqti7x67lg5ije2rzjusie37nslsnkjkkby4laqqbfw@3p3zu522yykv
PostmasterContext is not available in single-user mode, use
TopMemoryContext instead. Also make sure that we use the correct
memory context in the lappend().
Author: Nathan Bossart <nathandbossart@gmail.com>
Discussion: https://www.postgresql.org/message-id/acb_Eo1XtmCO_9z7@nathan
While we have a lot of indirect coverage of read streams, there are corner
cases that are hard to test when only indirectly controlling and observing the
read stream. This commit adds an SQL callable SRF interface for a read stream
and uses that in a few tests.
To make some of the tests possible, the injection point infrastructure in
test_aio had to be expanded to allow blocking IO completion.
While at it, fix a wrong debug message in inj_io_short_read_hook().
Author: Andres Freund <andres@anarazel.de>
Reviewed-by: Nazir Bilal Yavuz <byavuz81@gmail.com>
Reviewed-by: Melanie Plageman <melanieplageman@gmail.com>
Discussion: https://postgr.es/m/zljergweqti7x67lg5ije2rzjusie37nslsnkjkkby4laqqbfw@3p3zu522yykv
Create a <sect4> section for each function that the previous text
described in one long series of paragraphs. Also split the functions'
previously in-line syntax summaries into <synopsis> clauses, which is
more readable and allows us to sneak in an explicit mention of the
result data type.
This change gives us an opportunity to make cross-reference links
more specific, too, so do that.
Author: jian he <jian.universality@gmail.com>
Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/CACJufxFuk9P=P4=BZ=qCkgvo6im8aL8NnCkjxx2S2MQDWNdouw@mail.gmail.com
Now that the buffer content lock is implemented as part of BufferDesc.state,
releasing the lock and unpinning the buffer can be implemented as a single
atomic operation.
This improves workloads that have heavy contention on a small number of
buffers substantially; e.g., I see a ~20% improvement for pipelined readonly
pgbench on an older two socket machine.
Reviewed-by: Melanie Plageman <melanieplageman@gmail.com>
Discussion: https://postgr.es/m/5ubipyssiju5twkb7zgqwdr7q2vhpkpmuelxfpanetlk6ofnop@hvxb4g2amb2d
An upcoming commit will make UnlockReleaseBuffer() considerably faster and
more scalable than doing LockBuffer(BUFFER_LOCK_UNLOCK); ReleaseBuffer();. But
it's a small performance benefit even as-is.
Most of the callsites changed in this patch are not performance sensitive,
however some, like the nbtree ones, are in critical paths.
This patch changes all the easily convertible places over to
UnlockReleaseBuffer() mainly because I needed to check all of them anyway, and
reducing cases where the operations are done separately makes the checking
easier.
Reviewed-by: Melanie Plageman <melanieplageman@gmail.com>
Discussion: https://postgr.es/m/5ubipyssiju5twkb7zgqwdr7q2vhpkpmuelxfpanetlk6ofnop@hvxb4g2amb2d
After the series of preceding commits introducing and using
BufferBeginSetHintBits()/BufferSetHintBits16(), hint bits are not set anymore
while IO is going on. Therefore we do not need to copy pages while they are
being written out anymore.
For the same reason XLogSaveBufferForHint() now does not need to operate on a
copy of the page anymore, but can instead use the normal XLogRegisterBuffer()
mechanism. For that the assertions and comments to XLogRegisterBuffer() had to
be updated to allow share-exclusive locked buffers to be registered.
Reviewed-by: Melanie Plageman <melanieplageman@gmail.com>
Discussion: https://postgr.es/m/5ubipyssiju5twkb7zgqwdr7q2vhpkpmuelxfpanetlk6ofnop@hvxb4g2amb2d
Not only is this good style, but it dodges some obscure bugs within
pg_bsd_indent. We could try to fix said bugs, but the amount of
effort required seems far out of proportion to the benefit.
Reported-by: Akshay Joshi <akshay.joshi@enterprisedb.com>
Author: Tom Lane <tgl@sss.pgh.pa.us>
Reviewed-by: Andrew Dunstan <andrew@dunslane.net>
Discussion: https://postgr.es/m/CANxoLDfca8O5SkeDxB_j6SVNXd+pNKaDmVmEW+2yyicdU8fy0w@mail.gmail.com
While fixing the base32hex UUID sortability test in commit
89210037a0, it turned out that the expected lexicographical order is
only maintained under the C collation (or an equivalent byte-wise
collation). Natural language collations may employ different rules,
breaking the sortability.
This commit updates the documentation to explicitly state that
base32hex is "byte-wise sortable", ensuring users do not fall into the
trap of using natural language collations when querying their encoded
data.
Co-Authored-by: Andrey Borodin <x4mmm@yandex-team.ru>
Discussion: https://postgr.es/m/CAD21AoAwX1D6baSGuQXm0mzPXPWB07kgaoaaahjNHHenbdY24A@mail.gmail.com
Autovacuum workers scan pg_class twice to collect the set of tables
to process. The first pass is for plain relations and materialized
views, and the second is for TOAST tables. When the worker finds a
table to process, it adds it to the end of a list. Later on, it
processes the tables in the same order as the list. This simple
strategy has worked surprisingly well for a long time, but there
have been many discussions over the years about trying to improve
it.
This commit introduces a scoring system that is used to sort the
aforementioned list of tables to process. The idea is to have
autovacuum workers prioritize tables that are furthest beyond their
thresholds (e.g., a table nearing transaction ID wraparound should
be vacuumed first). This prioritization scheme is certainly far
from perfect; there are simply too many possibilities for any
scoring technique to work across all workloads, and the situation
might change significantly between the time we calculate the score
and the time that autovacuum processes it. However, we have
attempted to develop something that is expected to work for a large
portion of workloads with reasonable parameter settings.
The score is calculated as the maximum of the ratios of each of the
table's relevant values to its threshold. For example, if the
number of inserted tuples is 100, and the insert threshold for the
table is 80, the insert score is 1.25. If all other scores are
below that value, the table's score will be 1.25. The other
criteria considered for the score are the table ages (both
relfrozenxid and relminmxid) compared to the corresponding
freeze-max-age setting, the number of updated/deleted tuples
compared to the vacuum threshold, and the number of
inserted/updated/deleted tuples compared to the analyze threshold.
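
The maximum-of-ratios rule can be written down in a few lines. This Python sketch uses the insert numbers from the example above; the other component values and the function names are made up for illustration, and the sketch ignores the wraparound special case.

```python
def component_score(value, threshold):
    """Ratio of a table's current value to its threshold."""
    return value / threshold


def table_score(components):
    """Maximum ratio across all (value, threshold) pairs."""
    return max(component_score(v, t) for v, t in components.values())


example = {
    "insert": (100, 80),     # from the text: 100 / 80 = 1.25
    "vacuum": (30, 50),      # hypothetical values
    "analyze": (40, 100),    # hypothetical values
}
score = table_score(example)   # 1.25, the insert component wins
```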
One exception to the previous paragraph is for tables nearing
wraparound, i.e., those that have surpassed the effective failsafe
ages. In that case, the relfrozenxid/relminmxid-based score is
scaled aggressively so that the table has a decent chance of
sorting to the front of the list.
To adjust how strongly each component contributes to the score, the
following parameters can be adjusted from their default of 1.0 to
anywhere between 0.0 and 10.0 (inclusive). Setting all of these to
0.0 restores pre-v19 prioritization behavior:
autovacuum_freeze_score_weight
autovacuum_multixact_freeze_score_weight
autovacuum_vacuum_score_weight
autovacuum_vacuum_insert_score_weight
autovacuum_analyze_score_weight
This is intended to be a baby step towards smarter autovacuum
workers. Possible future improvements include, but are not limited
to, periodic reprioritization, automatic cost limit adjustments,
and better observability (e.g., a system view that shows current
scores). While we do not expect this commit to produce any
earth-shattering improvements, it is arguably a prerequisite for
the aforementioned follow-up changes.
Reviewed-by: Sami Imseih <samimseih@gmail.com>
Reviewed-by: David Rowley <dgrowleyml@gmail.com>
Reviewed-by: Robert Haas <robertmhaas@gmail.com>
Reviewed-by: wenhui qiu <qiuwenhuifx@gmail.com>
Reviewed-by: Greg Burd <greg@burd.me>
Reviewed-by: Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com>
Discussion: https://postgr.es/m/aOaAuXREwnPZVISO%40nathan
These tests were intended to be aligned with each other, but
additional tests for virtual generated columns disrupted that
alignment. The test confirming that user-defined types are not
allowed in virtual generated columns has also been moved to the
generated_virtual.sql-specific section.
Author: Yugo Nagata <nagata@sraoss.co.jp>
Reviewed-by: Paul A Jungwirth <pj@illuminatedcomputing.com>
Reviewed-by: Mutaamba Maasha <maasha@gmail.com>
Reviewed-by: Surya Poondla <s_poondla@apple.com>
Discussion: https://www.postgresql.org/message-id/flat/20250808115142.e9ccb81f35466a9a131a4c55@sraoss.co.jp
The PredicateLockShmemInit function is pretty complicated, and one
source of confusion is that it reuses the same local variable for
sizes of things. Replace the different uses with separate variables
for clarity.
Reviewed-by: Ashutosh Bapat <ashutosh.bapat.oss@gmail.com>
Discussion: https://www.postgresql.org/message-id/113724ab-0028-493f-9605-6e8570f0939f@iki.fi
An element pattern variable may be repeated in the path pattern.
GraphTableParseState maintains a list of all variable names used in
the graph pattern. Add a new variable name to that list only when it
is not present already. This isn't a problem right now, but it could
be in the future.
Author: Ashutosh Bapat <ashutosh.bapat.oss@gmail.com>
Discussion: https://www.postgresql.org/message-id/CAExHW5tR4O0vjeqTCPr2VB5pYjNYbJgbCBEQf63NtU5Pz1MiOQ%40mail.gmail.com
Adding an implicit empty vertex pattern when a path pattern starts or
ends with an edge pattern or when two consecutive edge patterns appear
in the pattern is not supported right now. Prohibit such path
patterns.
Author: Ashutosh Bapat <ashutosh.bapat.oss@gmail.com>
Reported-by: Alexander Lakhin <exclusion@gmail.com>
Reviewed-by: Henson Choi <assam258@gmail.com>
Discussion: https://www.postgresql.org/message-id/flat/72a23702-6d96-4103-a54b-057c2352e885%2540eisentraut.org
Previously we reused the shmem allocator's ShmemLock to also protect
lwlock.c's shared memory structures. Introduce a separate spinlock for
lwlock.c for the sake of modularity. Now that lwlock.c has its own
shared memory struct (LWLockTranches), this is easy to do.
Reviewed-by: Nathan Bossart <nathandbossart@gmail.com>
Discussion: https://www.postgresql.org/message-id/47aaf57e-1b7b-4e12-bda2-0316081ff50e@iki.fi
Merge the LWLockTranches and NamedLWLockTrancheRequest data structures
in shared memory into one array of user-defined tranches. The
NamedLWLockTrancheRequest list is now only used in postmaster, to hold
the requests until shared memory is initialized.
Introduce a C struct, LWLockTranches, to hold all the different fields
kept in shared memory. This gives an easier overview of everything
kept in shared memory. Previously, we had separate pointers
for LWLockTrancheNames, LWLockCounter and the (shared memory copy of)
NamedLWLockTrancheRequestArray.
Reviewed-by: Nathan Bossart <nathandbossart@gmail.com>
Discussion: https://www.postgresql.org/message-id/47aaf57e-1b7b-4e12-bda2-0316081ff50e@iki.fi
The "named tranches" term is a little confusing. In most places it
refers to tranches requested with RequestNamedLWLockTranche(), even
though all built-in tranches and tranches allocated with
LWLockNewTrancheId() also have a name. But in MAX_NAMED_TRANCHES, it
refers to tranches requested with either RequestNamedLWLockTranche()
or LWLockNewTrancheId(), as it's the maximum of all of those in total.
The "user defined" term is already used in
LWTRANCHE_FIRST_USER_DEFINED, so let's standardize on that to mean
tranches allocated with either RequestNamedLWLockTranche() or
LWLockNewTrancheId().
Reviewed-by: Nathan Bossart <nathandbossart@gmail.com>
Reviewed-by: Sami Imseih <samimseih@gmail.com>
Discussion: https://www.postgresql.org/message-id/47aaf57e-1b7b-4e12-bda2-0316081ff50e@iki.fi
Factor out the "persistence mode" and storage/compression parts
of the syntax synopsis to reduce line lengths and increase
readability. Also add an introductory para about the persistence
modes so that the Description section still lines up with the
synopsis.
Author: David G. Johnston <david.g.johnston@gmail.com>
Reviewed-by: Laurenz Albe <laurenz.albe@cybertec.at>
Reviewed-by: Jian He <jian.universality@gmail.com>
Discussion: https://postgr.es/m/CAKFQuwYfMV-2SdrP-umr5SVNSqTn378BUvHsebetp5=DhT494w@mail.gmail.com
The premise of src/test/modules/test_plan_advice is that if we plan
a query once, generate plan advice, and then replan it using that
same advice, all of that advice should apply cleanly, since the
settings and everything else are the same. Unfortunately, that's
not the case: the test suite is the main regression tests, and
concurrent activity can change the statistics on tables involved
in the query, especially system catalogs. That's OK as long as it
only affects costing, but in a few cases, it affects which relations
appear in the final plan at all.
In the buildfarm failures observed to date, this happens because
we consider alternative subplans for the same portion of the query;
in theory, MinMaxAggPath is vulnerable to a similar hazard. In both
cases, the planner clones an entire subquery, and the clone has a
different plan name, and therefore different range table identifiers,
than the original. If a cost change results in flipping between one
of these plans and the other, the test_plan_advice tests will fail,
because the range table identifiers to which advice was applied won't
even be present in the output of the second planning cycle.
To fix, invent a new DO_NOT_SCAN advice tag. When generating advice,
emit it for relations that should not appear in the final plan at
all, because some alternative version of that relation was used
instead. When DO_NOT_SCAN is supplied, disable all scan methods for
that relation.
To make this work, we reuse a bunch of the machinery that previously
existed for the purpose of ensuring that we build the same set of
relation identifiers during planning as we do from the final
PlannedStmt. In the process, this commit slightly weakens the
cross-check mechanism: before this commit, it would fire whenever
the pg_plan_advice module was loaded, even if pg_plan_advice wasn't
actually doing anything; now, it will only engage when we have some
other reason to create a pgpa_planner_state. The old way was complex
and didn't add much useful test coverage, so this seems like an
acceptable sacrifice.
Discussion: http://postgr.es/m/CA+TgmoYuWmN-00Ec5pY7zAcpSFQUQLbgAdVWGR9kOR-HM-fHrA@mail.gmail.com
Reviewed-by: Lukas Fittl <lukas@fittl.com>
Typically, we have only one PlannerInfo for any given subquery, but
when we are considering a MinMaxAggPath or a hashed subplan, we end
up creating a second PlannerInfo for the same portion of the query,
with a clone of the original range table. In fact, in the MinMaxAggPath
case, we might end up creating several clones, one per aggregate.
At present, there's no easy way for a plugin, such as pg_plan_advice,
to understand the relationships between the original range table and
the copies of it that are created in these cases. To fix, add an
alternative_plan_name field to PlannerInfo. For a hashed subplan, this
is the plan name for the non-hashed alternative; for minmax aggregates,
this is the plan_name from the parent PlannerInfo; otherwise, it's the
same as plan_name.
Discussion: http://postgr.es/m/CA+TgmoYuWmN-00Ec5pY7zAcpSFQUQLbgAdVWGR9kOR-HM-fHrA@mail.gmail.com
Reviewed-by: Lukas Fittl <lukas@fittl.com>
The COMMIT command handles an aborted transaction in the same
manner as the ROLLBACK command, but this wasn't explained in
its official reference page. Also mention that behavior in
the tutorial's material on transactions.
Also add a comment mentioning that we don't raise an exception
for COMMIT within an aborted transaction, as the SQL standard
would have us do.
Hyperlink a couple of cross-references while we're at it.
Author: David G. Johnston <david.g.johnston@gmail.com>
Reviewed-by: Gurjeet Singh <gurjeet@singh.im>
Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/CAKFQuwYgYR3rWt6vFXw=ZWZ__bv7PqvdOnHujG+UyqE11f+3sg@mail.gmail.com
Restructure AsyncReadBuffers() to use early return when the head buffer is
already valid, instead of using a did_start_io flag and if/else branches. Also
move around a bit of the code to be located closer to where it is used. This
is a refactor only.
Author: Melanie Plageman <melanieplageman@gmail.com>
Reviewed-by: Andres Freund <andres@anarazel.de>
Discussion: https://postgr.es/m/zljergweqti7x67lg5ije2rzjusie37nslsnkjkkby4laqqbfw@3p3zu522yykv
Two places already count buffer hits, requiring quite a few lines of
code since we do accounting in so many places. Future commits will add
more locations, so refactor into a helper.
Author: Melanie Plageman <melanieplageman@gmail.com>
Reviewed-by: Nazir Bilal Yavuz <byavuz81@gmail.com>
Discussion: https://postgr.es/m/zljergweqti7x67lg5ije2rzjusie37nslsnkjkkby4laqqbfw@3p3zu522yykv
PinBufferForBlock() is always_inline and called in a loop in
StartReadBuffersImpl(). Previously it computed io_context and io_object
internally, which required calling IOContextForStrategy() -- a non-inline
function the compiler cannot prove is side-effect-free. This could potentially
cause redundant function calls.
Compute io_context and io_object in the callers instead, allowing
StartReadBuffersImpl() to do so once before entering the loop.
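
The effect of moving the computation into the caller can be sketched as follows. The Python functions below only borrow the names from the text; their bodies, the call counter and the context strings are hypothetical and exist to make the number of calls observable.

```python
calls = {"io_context_for_strategy": 0}


def io_context_for_strategy(strategy):
    """Stand-in for the non-inline function; counts how often it runs."""
    calls["io_context_for_strategy"] += 1
    return "bulkread" if strategy else "normal"   # hypothetical values


def pin_buffer_for_block(block, io_context):
    """The per-block helper now receives io_context from its caller."""
    return (block, io_context)


def start_read_buffers(blocks, strategy):
    # Computed once, before the loop, instead of once per block.
    io_context = io_context_for_strategy(strategy)
    return [pin_buffer_for_block(b, io_context) for b in blocks]


pinned = start_read_buffers([10, 11, 12], strategy=None)
```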
Author: Melanie Plageman <melanieplageman@gmail.com>
Suggested-by: Andres Freund <andres@anarazel.de>
Discussion: https://postgr.es/m/zljergweqti7x67lg5ije2rzjusie37nslsnkjkkby4laqqbfw@3p3zu522yykv
pg_plan_advice tracks two pieces of per-PlannerInfo data: (1) for each
RTI, the corresponding relation identifier, for purposes of
cross-checking those calculations against the final plan; and (2) the
set of semijoins seen during planning for which the strategy of making
one side unique was considered. The former is tracked using a hash
table that uses <plan_name, RTI> as the key, and the latter is
tracked using a List of <plan_name, relids>.
It seems better to track both of these things in the same way and
to try to reuse some code instead of having everything be completely
separate, so invent pgpa_planner_info; we'll create one every time we
see a new PlannerInfo and need to associate some data with it, and
we'll use the plan_name field to distinguish between PlannerInfo
objects, as it should always be unique. Then, refactor the two
systems mentioned above to use this new infrastructure.
(Note that the adjustment in pgpa_plan_walker is necessary in order
to avoid spuriously triggering the sanity check in that function,
in the case where a pgpa_planner_info is created for a purpose not
related to sj_unique_rels.)
Discussion: https://postgr.es/m/CA+TgmoaK=4w7-qknUo3QhUJ53pXZq=c=KgZmRyD+k7ytqfmgSg@mail.gmail.com
Reviewed-by: Lukas Fittl <lukas@fittl.com>
We recommend looking at psql's "-E" output to help understand the
system catalogs, but in some cases (particularly table displays)
there's a bunch of rather impenetrable SQL there. As a small
improvement, label each query issued by describe.c with a short
description of its purpose. The code is arranged so that the
labels also appear as SQL comments in the server log, if the
server is logging these commands.
We could expand this policy to every use of PSQLexec(), but most of
the ones outside describe.c are issuing simple commands like "BEGIN"
or "COMMIT", which don't seem to need such glosses. I did add
labels to the commands issued by \sf, \sv and friends.
Also, make the -E and log output for hidden queries say
"INTERNAL QUERY" not just "QUERY", to distinguish them from
user-written queries.
Author: Greg Sabino Mullane <htamfids@gmail.com>
Co-authored-by: David Christensen <david+pg@pgguru.net>
Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/CAKAnmmJz8Hh=8Ru8jgzySPWmLBhnv4=oc_0KRiz-UORJ0Dex+w@mail.gmail.com
AsyncReadBuffers()'s no-IO needed path passed
TRACE_POSTGRESQL_BUFFER_READ_DONE the wrong block number because it had
already incremented operation->nblocks_done. Fix by folding the
nblocks_done offset into the blocknum local variable at initialization.
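
One plausible shape of the bug and its fix, reduced to plain arithmetic in Python. Both functions are hypothetical reconstructions based only on the description above, not the actual C code.

```python
def traced_block_buggy(first_block, nblocks_done):
    """Counter is advanced first, so the probe sees the next block."""
    nblocks_done += 1
    return first_block + nblocks_done


def traced_block_fixed(first_block, nblocks_done):
    """Offset is folded into blocknum before the counter advances."""
    blocknum = first_block + nblocks_done
    nblocks_done += 1
    return blocknum
```

For a read starting at block 100 with two blocks already done, the block being handled is 102; the first variant reports 103.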
Author: Melanie Plageman <melanieplageman@gmail.com>
Reviewed-by: Andres Freund <andres@anarazel.de>
Discussion: https://postgr.es/m/u73un3xeljr4fiidzwi4ikcr6vm7oqugn4fo5vqpstjio6anl2%40hph6fvdiiria
Backpatch-through: 18
In a future commit more AIO related tests are due to be introduced. However
001_aio.pl already is fairly large.
This commit introduces a new TestAio package with helpers for writing AIO
related tests. Then it uses the new helpers to simplify the existing
001_aio.pl by iterating over all supported io_methods. This will be
particularly helpful because additional methods already have been submitted.
Additionally this commit splits out testing of initdb using a non-default
method into its own test. While that test is somewhat important, it's fairly
slow and doesn't break that often. For development velocity it's helpful for
001_aio.pl to be faster.
While the latter in particular could benefit from being its own commit, splitting
it out seems to introduce more back-and-forth than it's worth.
Author: Andres Freund <andres@anarazel.de>
Reviewed-by: Nazir Bilal Yavuz <byavuz81@gmail.com>
Reviewed-by: Melanie Plageman <melanieplageman@gmail.com>
Discussion: https://postgr.es/m/zljergweqti7x67lg5ije2rzjusie37nslsnkjkkby4laqqbfw@3p3zu522yykv
When my commit e222534679 added the
concept of disabled_nodes, it failed to add a disabled_nodes field
to SubPlan. This is a regression: before that commit, when
fix_alternative_subplan compared the costs of two plans, the number
of disabled nodes affected the result, because it was just a
component of the total cost. After that commit, it no longer did,
making it possible for a disabled path to win on cost over one that
is not disabled. Fix that.
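
To illustrate why the count has to take part in the comparison, here is a small Python sketch that prefers fewer disabled nodes and only then lower cost. The dict keys and the ordering rule are assumptions for this example; the planner's real comparison logic is more involved.

```python
def preferred_plan(a, b):
    """Pick the plan with fewer disabled nodes; break ties on total cost."""
    return min(a, b, key=lambda p: (p["disabled_nodes"], p["total_cost"]))


def preferred_plan_cost_only(a, b):
    """The regressed behavior: the disabled-node count is ignored."""
    return min(a, b, key=lambda p: p["total_cost"])


hashed = {"name": "hashed", "disabled_nodes": 1, "total_cost": 10.0}
plain = {"name": "plain", "disabled_nodes": 0, "total_cost": 50.0}
```

Comparing on cost alone picks the disabled plan here, while the tuple comparison picks the plan that is not disabled.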
As usual for planner fixes that might destabilize plan choices,
no back-patch.
Discussion: https://postgr.es/m/CA+TgmoaK=4w7-qknUo3QhUJ53pXZq=c=KgZmRyD+k7ytqfmgSg@mail.gmail.com
Reviewed-by: Lukas Fittl <lukas@fittl.com>
This dials back a couple of the qualifiers added by commit
7724cb9935. Specifically, in match_boolean_partition_clause() the
call to negate_clause() casts away the const, so we shouldn't make the
input argument const.
Previously, when the startup process applied WAL and requested walreceiver
to send an apply notification to the primary, walreceiver sent a status reply
unconditionally, even if the WAL locations had not advanced since
the previous update.
As a result, the standby could send two consecutive status reply messages
with identical WAL locations even though wal_receiver_status_interval had
not yet elapsed. This could unexpectedly reset the reported replication lag,
making it difficult for users to monitor lag. The second message was also
unnecessary because it reported no progress.
This commit updates walreceiver to send a reply only when the apply location
has advanced since the last status update, even when the startup process
requests a notification.
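
The new rule amounts to remembering the last reported apply location and skipping the reply when nothing has moved. The Python class below is a hypothetical model with integer LSNs, not walreceiver code.

```python
class StatusReporter:
    """Toy model of the reply decision for apply notifications."""

    def __init__(self):
        self.last_reported_apply = None
        self.sent = []

    def on_apply_notification(self, apply_lsn):
        """Send a reply only if the apply location has advanced."""
        if (self.last_reported_apply is not None
                and apply_lsn <= self.last_reported_apply):
            return False                 # no progress, suppress the reply
        self.sent.append(apply_lsn)
        self.last_reported_apply = apply_lsn
        return True
```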
Author: Fujii Masao <masao.fujii@gmail.com>
Reviewed-by: Shinya Kato <shinya11.kato@gmail.com>
Reviewed-by: Chao Li <li.evan.chao@gmail.com>
Discussion: https://postgr.es/m/CAOzEurTzcUrEzrH97DD7+Yz=HGPU81kzWQonKZvqBwYhx2G9_A@mail.gmail.com
pg_stat_replication is documented to keep the last measured lag values for
a short time after the standby catches up, and then set them to NULL when
there is no WAL activity. However, previously lag values could become NULL
prematurely even while WAL activity was ongoing, especially in logical
replication.
This happened because the code cleared lag when two consecutive reply messages
indicated that the apply location had caught up with the send location.
It did not verify that the reported positions were unchanged, so lag could be
cleared even when positions had advanced between messages. In logical
replication, where the apply location often quickly catches up, this issue was
more likely to occur.
This commit fixes the issue by clearing lag only when the standby reports that
it has fully replayed WAL (i.e., both flush and apply locations have caught up
with the send location) and the write/flush/apply positions remain unchanged
across two consecutive reply messages.
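
The clearing condition can be sketched as a predicate over two consecutive replies. The dict keys and integer positions in this Python example are assumptions for illustration, and "caught up" is modeled as equality with the send location.

```python
def should_clear_lag(prev, cur):
    """Clear lag only if fully replayed and nothing moved since prev."""
    fully_replayed = (cur["flush"] == cur["sent"]
                      and cur["apply"] == cur["sent"])
    unchanged = all(prev[k] == cur[k] for k in ("write", "flush", "apply"))
    return fully_replayed and unchanged


first = {"sent": 200, "write": 200, "flush": 200, "apply": 200}
same = dict(first)                                  # idle: identical reply
moved = {"sent": 300, "write": 300, "flush": 300, "apply": 300}
```

Here should_clear_lag(first, same) holds, while should_clear_lag(first, moved) does not, even though the second reply is also fully caught up.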
The second message with unchanged positions typically results from
wal_receiver_status_interval, so lag values are cleared after that interval
when there is no activity. This avoids showing stale lag data while preventing
premature NULL values.
Even with this fix, lag may rarely become NULL during activity if identical
position reports are sent repeatedly. Eliminating such duplicate messages
would address this fully, but that change is considered too invasive for stable
branches and will be handled later, in master only.
Backpatch to all supported branches.
Author: Shinya Kato <shinya11.kato@gmail.com>
Reviewed-by: Chao Li <li.evan.chao@gmail.com>
Reviewed-by: Fujii Masao <masao.fujii@gmail.com>
Discussion: https://postgr.es/m/CAOzEurTzcUrEzrH97DD7+Yz=HGPU81kzWQonKZvqBwYhx2G9_A@mail.gmail.com
Backpatch-through: 14
The MSVC warning option /w24777 added by commit 2307cfe316 was a
typo; it should have been /w24477. But this option is already enabled
by default in level 1, so we don't need to add it explicitly. So just
remove it.
Compound literals, as used in pg_list.h for list_makeN(), are not a
C++ feature. MSVC doesn't accept these. (GCC and Clang accept them,
but they would warn in -pedantic mode.) Replace with equivalent
inline functions. (These are the only instances of compound literals
used in PostgreSQL header files.)
Author: Jelte Fennema-Nio <postgres@jeltef.nl>
Discussion: https://www.postgresql.org/message-id/flat/CAGECzQR21OnnKiZO_1rLWO0-16kg1JBxnVq-wymYW0-_1cUNtg%40mail.gmail.com
Reorder the validation checks in replorigin_session_setup() to provide a
more logical flow. This makes the function easier to follow and ensures
that basic state checks are performed consistently.
Additionally, update an error message to align its phrasing with similar
diagnostics in the replication origin subsystem, improving overall
consistency.
Author: Heikki Linnakangas <hlinnaka@iki.fi>
Reviewed-by: shveta malik <shveta.malik@gmail.com>
Reviewed-by: Amit Kapila <amit.kapila16@gmail.com>
Discussion: https://postgr.es/m/e0508305-bc6a-417c-b969-36564d632f9e@iki.fi
Commit 497c1170cb added base32hex encoding support, but its
regression test for UUIDs failed on buildfarm members hippopotamus and
jay using natural language locales (such as cs_CZ). This happened
because those collations may sort characters differently, which breaks
the strict byte-wise lexicographical ordering expected by base32hex
encoding.
This commit fixes the regression tests by explicitly using the C
collation.
Per buildfarm members hippopotamus and jay.
Analyzed-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/682417.1774482047@sss.pgh.pa.us