postgresql/src/include/storage
Fujii Masao e837718804 Detect the deadlocks between backends and the startup process.
Deadlocks involving a recovery conflict on a lock can happen between
hot-standby backends and the startup process.
If a backend takes an access exclusive lock on a table and that
finally triggers the deadlock, the deadlock is detected as expected.
Previously, however, if the startup process took an access exclusive
lock and that finally triggered the deadlock, the deadlock could not
be detected and could remain even after deadlock_timeout had passed.
This is a bug.

The cause of this bug was that the code for handling the recovery
conflict on lock did not handle the deadlock case at all. It assumed
that deadlocks involving the startup process and backends would be
detected by the deadlock detector invoked within backends.
But this assumption was incorrect. The startup process should also
have invoked the deadlock detector when necessary.

To fix this bug, this commit makes the startup process invoke
the deadlock detector if deadlock_timeout is reached while handling
the recovery conflict on lock. Specifically, in that case, the startup
process requests all the backends holding the conflicting locks to
check themselves for deadlocks.
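
The behavior described above can be illustrated with a minimal,
self-contained sketch. It is not the code from this commit: the
SketchBackend struct, the sketch_request_deadlock_checks() function and
the millisecond parameters are hypothetical names made up for
illustration, and real signaling between processes is reduced to setting
a flag.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a hot-standby backend (not a PostgreSQL type). */
typedef struct SketchBackend
{
    int  pid;
    bool holds_conflicting_lock;    /* blocks the lock the startup process wants */
    bool deadlock_check_requested;  /* set when asked to run a deadlock check */
} SketchBackend;

/*
 * Simulates what the startup process does each time it wakes up while
 * waiting on a conflicting lock. waited_ms is the time waited so far.
 * Returns the number of backends asked to check themselves for deadlocks.
 */
int
sketch_request_deadlock_checks(SketchBackend *backends, size_t n,
                               int waited_ms, int deadlock_timeout_ms)
{
    int requested = 0;

    assert(backends != NULL || n == 0);

    /* Until deadlock_timeout has elapsed, keep waiting without any check. */
    if (waited_ms < deadlock_timeout_ms)
        return 0;

    /* Timeout reached: ask every holder of a conflicting lock to check. */
    for (size_t i = 0; i < n; i++)
    {
        if (backends[i].holds_conflicting_lock)
        {
            backends[i].deadlock_check_requested = true;
            requested++;
        }
    }
    return requested;
}
```

In this sketch only backends that hold a conflicting lock are flagged;
backends unrelated to the conflict are left alone.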

Back-patch to v9.6. v9.5 also has this bug, but per discussion we decided
not to back-patch the fix to v9.5, because v9.5 lacks some infrastructure
code (e.g., 37c54863cf) that this bug fix depends on. We could back-patch
that code as well, but since the next minor release is the final one for
v9.5, doing so is risky: if the back-patch unexpectedly introduced a new
bug into v9.5, there would be no chance to fix it. We determined that
back-patching to v9.5 would carry more risk than gain.

Author: Fujii Masao
Reviewed-by: Bertrand Drouvot, Masahiko Sawada, Kyotaro Horiguchi
Discussion: https://postgr.es/m/4041d6b6-cf24-a120-36fa-1294220f8243@oss.nttdata.com
2021-01-06 12:31:55 +09:00
.gitignore When trace_lwlocks is used, identify individual lwlocks by name. 2015-09-11 14:01:39 -04:00
backendid.h Update copyright for 2018 2018-01-02 23:30:12 -05:00
barrier.h Update copyright for 2018 2018-01-02 23:30:12 -05:00
block.h Assorted minor doc/comment fixes. 2018-04-28 11:46:15 -04:00
buf.h Update copyright for 2018 2018-01-02 23:30:12 -05:00
buf_internals.h C comment: correct use of 64-"byte" cache line size 2020-09-04 13:27:52 -04:00
buffile.h Use 64 bit type for BufFileSize(). 2018-11-15 13:40:06 +13:00
bufmgr.h Revert "Skip WAL for new relfilenodes, under wal_level=minimal." 2020-03-22 09:24:13 -07:00
bufpage.h Extend PageIsVerified() to handle more custom options 2020-11-02 10:41:34 +09:00
checksum.h Revert "Allow on-line enabling and disabling of data checksums" 2018-04-09 19:03:42 +02:00
checksum_impl.h Make checksum_impl.h safe to compile with -fstrict-aliasing. 2018-08-31 12:26:37 -04:00
condition_variable.h Allow ConditionVariable[PrepareTo]Sleep to auto-switch between CVs. 2018-01-09 11:39:10 -05:00
copydir.h Update copyright for 2018 2018-01-02 23:30:12 -05:00
dsm.h Update copyright for 2018 2018-01-02 23:30:12 -05:00
dsm_impl.h Update copyright for 2018 2018-01-02 23:30:12 -05:00
fd.h PANIC on fsync() failure. 2018-11-19 13:37:59 +13:00
freespace.h Remove UpdateFreeSpaceMap(), use FreeSpaceMapVacuumRange() instead. 2018-03-29 12:22:44 -04:00
fsm_internals.h Update copyright for 2018 2018-01-02 23:30:12 -05:00
indexfsm.h Update copyright for 2018 2018-01-02 23:30:12 -05:00
ipc.h Consistently test for in-use shared memory. 2019-04-12 22:36:42 -07:00
item.h Update copyright for 2018 2018-01-02 23:30:12 -05:00
itemid.h Update copyright for 2018 2018-01-02 23:30:12 -05:00
itemptr.h Improve representation of 'moved partitions' indicator on deleted tuples. 2018-05-01 13:30:12 -07:00
large_object.h Update copyright for 2018 2018-01-02 23:30:12 -05:00
latch.h Fix misc typos, mostly in comments. 2018-07-18 16:17:42 +03:00
lmgr.h Prevent concurrent SimpleLruTruncate() for any given SLRU. 2020-08-15 10:15:57 -07:00
lock.h Move new LOCKTAG_DATABASE_FROZEN_IDS to end of enum LockTagType. 2020-08-15 16:16:15 -07:00
lockdefs.h Update copyright for 2018 2018-01-02 23:30:12 -05:00
lwlock.h Update copyright for 2018 2018-01-02 23:30:12 -05:00
off.h Update copyright for 2018 2018-01-02 23:30:12 -05:00
pg_sema.h Update copyright for 2018 2018-01-02 23:30:12 -05:00
pg_shmem.h Consistently test for in-use shared memory. 2019-04-12 22:36:42 -07:00
pmsignal.h Update copyright for 2018 2018-01-02 23:30:12 -05:00
predicate.h Update copyright for 2018 2018-01-02 23:30:12 -05:00
predicate_internals.h Update copyright for 2018 2018-01-02 23:30:12 -05:00
proc.h Add GUC variables for stat tracking and timeout as PGDLLIMPORT 2020-01-21 13:47:01 +09:00
procarray.h Detect the deadlocks between backends and the startup process. 2021-01-06 12:31:55 +09:00
proclist.h Improve error detection capability in proclists. 2018-01-08 18:07:04 -05:00
proclist_types.h Improve error detection capability in proclists. 2018-01-08 18:07:04 -05:00
procsignal.h Update copyright for 2018 2018-01-02 23:30:12 -05:00
reinit.h Post-feature-freeze pgindent run. 2018-04-26 14:47:16 -04:00
relfilenode.h Update copyright for 2018 2018-01-02 23:30:12 -05:00
s_lock.h For PowerPC instruction "addi", use constraint "b". 2019-10-18 20:20:32 -07:00
sharedfileset.h Update copyright for 2018 2018-01-02 23:30:12 -05:00
shm_mq.h Update copyright for 2018 2018-01-02 23:30:12 -05:00
shm_toc.h Update copyright for 2018 2018-01-02 23:30:12 -05:00
shmem.h Update copyright for 2018 2018-01-02 23:30:12 -05:00
sinval.h Update copyright for 2018 2018-01-02 23:30:12 -05:00
sinvaladt.h Update copyright for 2018 2018-01-02 23:30:12 -05:00
smgr.h Revert "Skip WAL for new relfilenodes, under wal_level=minimal." 2020-03-22 09:24:13 -07:00
spin.h Update copyright for 2018 2018-01-02 23:30:12 -05:00
standby.h Remove AELs from subxids correctly on standby 2018-06-16 14:03:29 +01:00
standbydefs.h Fix bugs in vacuum of shared rels, by keeping their relcache entries current. 2018-06-12 11:13:21 -07:00