postgresql/src
Heikki Linnakangas a075c84f2c Fix race conditions in newly-added test.
The buildfarm has been failing sporadically on the new test.  I was able to
reproduce this by adding a random 0-10 s delay in the walreceiver, just
before it connects to the primary. There's a race condition where node_3
is promoted before it has fully caught up with node_1, leading to diverged
timelines. When node_1 is later reconfigured as standby following node_3,
it fails to catch up:

LOG:  primary server contains no more WAL on requested timeline 1
LOG:  new timeline 2 forked off current database system timeline 1 before current recovery point 0/30000A0

That's the situation where you'd need to use pg_rewind, but in this case
it already happens while we are still setting up the actual pg_rewind
scenario we want to test. Change the test so that it waits until node_3
is connected and fully caught up before promoting it, so that you get a
clean, controlled failover.
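
As a rough illustration of the "wait until caught up, then promote" idea
(not the actual test code, which is a Perl TAP script), the check can be
sketched as a polling loop over WAL positions. The function names, the
timeout, and the two getter callbacks are hypothetical; only the LSN text
format (two hexadecimal halves separated by a slash, as in 0/30000A0
above) is taken from PostgreSQL.

```python
import time
from typing import Callable


def parse_lsn(lsn: str) -> int:
    """Convert an LSN such as '0/30000A0' into a comparable integer."""
    high, low = lsn.split("/")
    return (int(high, 16) << 32) | int(low, 16)


def wait_for_catchup(
    get_primary_lsn: Callable[[], str],   # hypothetical: returns the primary's current WAL position
    get_standby_lsn: Callable[[], str],   # hypothetical: returns the standby's replayed WAL position
    timeout: float = 180.0,
    interval: float = 0.1,
) -> bool:
    """Poll until the standby has replayed up to the primary's position.

    Returns True if the standby caught up before the timeout, else False.
    The target is sampled once, before polling starts.
    """
    target = parse_lsn(get_primary_lsn())
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        if parse_lsn(get_standby_lsn()) >= target:
            return True
        time.sleep(interval)
    return False
```

A caller would promote the standby only when wait_for_catchup() returns
True, and treat a False result as a test failure instead of promoting a
node that is still behind.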

Also rewrite some of the comments, for clarity. The existing comments
detailed what each step in the test did, but didn't give a good overview
of the situation the steps were trying to create.

For reasons I don't understand, the test setup had to be written slightly
differently in 9.6 and 9.5 than in later versions. The 9.5/9.6 version
needed node 1 to be reinitialized from backup, whereas in later versions
it could be shut down and reconfigured to be a standby. But even 9.5 should
support "clean switchover", where the primary makes sure that pending WAL is
replicated to the standby on shutdown. It would be nice to figure out what's
going on there, but that's independent of pg_rewind and the scenario that
this test tests.

Discussion: https://www.postgresql.org/message-id/b0a3b95b-82d2-6089-6892-40570f8c5e60%40iki.fi
2020-12-04 18:25:45 +02:00
backend Ensure that expandTableLikeClause() re-examines the same table. 2020-12-01 14:02:28 -05:00
bin Fix race conditions in newly-added test. 2020-12-04 18:25:45 +02:00
common Replace use of sys_siglist[] with strsignal(). 2020-07-15 22:05:13 -04:00
fe_utils Fix translation of special characters in psql's LaTeX output modes. 2018-11-26 17:32:51 -05:00
include Ensure that expandTableLikeClause() re-examines the same table. 2020-12-01 14:02:28 -05:00
interfaces Stamp 9.6.20. 2020-11-09 17:32:22 -05:00
makefiles Select CFLAGS_SL at configure time, not in platform-specific Makefiles. 2019-10-21 12:32:36 -04:00
pl Translation updates 2020-11-09 12:47:52 +01:00
port Stamp 9.6.20. 2020-11-09 17:32:22 -05:00
template On macOS, use -isysroot in link steps as well as compile steps. 2020-11-20 00:58:26 -05:00
test Ensure that expandTableLikeClause() re-examines the same table. 2020-12-01 14:02:28 -05:00
timezone Update time zone data files to tzdata release 2020d. 2020-10-22 21:24:23 -04:00
tools Sync our copy of the timezone library with IANA release tzcode2020c. 2020-10-16 21:40:16 -04:00
tutorial Update copyright for 2016 2016-01-02 13:33:40 -05:00
.gitignore
bcc32.mak Autoconfiscate selection of 64-bit int type for 64-bit large object API. 2012-10-07 21:52:43 -04:00
DEVELOPERS
Makefile Install TAP test infrastructure so it's available for extension testing. 2016-09-23 15:50:00 -04:00
Makefile.global.in Select CFLAGS_SL at configure time, not in platform-specific Makefiles. 2019-10-21 12:32:36 -04:00
Makefile.shlib Ensure static libraries have correct mod time even if ranlib messes it up. 2018-11-29 15:53:44 -05:00
nls-global.mk nls-global.mk: search build dir for source files, too 2016-06-07 18:55:18 -04:00
win32.mak Autoconfiscate selection of 64-bit int type for 64-bit large object API. 2012-10-07 21:52:43 -04:00