postgresql/src
Tomas Vondra fb2f8e534a Fix ordering of XIDs in ProcArrayApplyRecoveryInfo
Commit 8431e296ea reworked ProcArrayApplyRecoveryInfo to sort XIDs
before adding them to KnownAssignedXids. But the XIDs are sorted using
xidComparator, which compares the XIDs simply as uint32 values, not
logically. KnownAssignedXidsAdd() however expects XIDs in logical order,
and calls TransactionIdFollowsOrEquals() to enforce that. If there are
XIDs for which the two orderings disagree, an error is raised and the
recovery fails/restarts.

Hitting this issue is fairly easy - you just need two transactions, one
started before the 4B limit (e.g. XID 4294967290), the other sometime
after it (e.g. XID 1000). Logically (4294967290 <= 1000) but when
compared using xidComparator we try to add them in the opposite order.
Which makes KnownAssignedXidsAdd() fail with an error like this:

  ERROR: out-of-order XID insertion in KnownAssignedXids

This only happens during replica startup, while processing RUNNING_XACTS
records to build the snapshot. Once we reach STANDBY_SNAPSHOT_READY, we
skip these records. So this does not affect already running replicas,
but if you restart (or create) a replica while there are transactions
with XIDs for which the two orderings disagree, you may hit this.

Long-running transactions and frequent replica restarts increase the
likelihood of hitting this issue. Once the replica gets into this state,
it can't be started (even if the old transactions are terminated).

Fixed by sorting the XIDs logically - this is fine because we're dealing
with normal XIDs (because it's XIDs assigned to backends) and from the
same wraparound epoch (otherwise the backends could not be running at
the same time on the primary node). So there are no problems with the
triangle inequality, which is why xidComparator compares raw values.

Investigation and root cause analysis by Abhijit Menon-Sen. Patch by me.

This issue is present in all releases since 9.4, however releases up to
9.6 are EOL already so backpatch to 10 only.

Reviewed-by: Abhijit Menon-Sen
Reviewed-by: Alvaro Herrera
Backpatch-through: 10
Discussion: https://postgr.es/m/36b8a501-5d73-277c-4972-f58a4dce088a%40enterprisedb.com
2022-01-27 20:15:37 +01:00
..
backend Fix ordering of XIDs in ProcArrayApplyRecoveryInfo 2022-01-27 20:15:37 +01:00
bin pg_dump: avoid useless query in binary_upgrade_set_type_oids_by_type_oid 2022-01-23 13:54:24 -05:00
common Revert error handling improvements for cryptohashes 2022-01-14 11:25:39 +09:00
fe_utils Avoid calling gettext() in signal handlers. 2022-01-17 13:30:04 -05:00
include Fix ordering of XIDs in ProcArrayApplyRecoveryInfo 2022-01-27 20:15:37 +01:00
interfaces Fix race condition in gettext() initialization in libpq and ecpglib. 2022-01-21 15:36:28 -05:00
makefiles Add NO_INSTALL option to pgxs 2021-05-27 13:58:29 +02:00
pl Remove unneeded Python includes 2021-11-25 14:30:12 +01:00
port Fix compatibility thinko for fstat() on standard streams in win32stat.c 2021-11-30 09:55:56 +09:00
template Further tweaking of PG_SYSROOT heuristics for macOS. 2021-01-20 12:07:23 -05:00
test Improve msys2 detection for TAP tests 2022-01-27 08:26:28 -05:00
timezone Update time zone data files to tzdata release 2021e. 2021-10-29 11:38:32 -04:00
tools Allow clean.bat to be run from anywhere 2022-01-20 10:20:40 -05:00
tutorial doc: Prefer explicit JOIN syntax over old implicit syntax in tutorial 2021-04-08 10:51:26 +02:00
.gitignore Convert cvsignore to gitignore, and add .gitignore for build targets. 2010-09-22 12:57:04 +02:00
DEVELOPERS Replace a couple of references to files that no longer exist in the source 2009-05-04 08:08:47 +00:00
Makefile Remove the option to build thread_test.c outside configure. 2020-10-21 12:08:48 -04:00
Makefile.global.in Add module build directory to the PATH for TAP tests 2021-10-22 09:50:16 -04:00
Makefile.shlib AIX: Fix missing libpq symbols by respecting SHLIB_EXPORTS. 2021-09-06 11:28:02 -07:00
nls-global.mk Add errhint_plural() function and make use of it 2021-03-31 09:16:25 +02:00