Avoid stale slot access after dropping obsolete synced slots.

drop_local_obsolete_slots() continued to dereference local_slot after
calling ReplicationSlotDropAcquired().  Once the slot is dropped, its
entry in the slot array can be reused by another backend, so later reads
of local_slot->data could observe a different slot's name or database
OID, leading to an incorrect unlock and log message.

Save the slot name and database OID before performing the drop, and use
the saved values for the subsequent UnlockSharedObject() call and the log
message.  While at it, emit the "dropped replication slot" message only
when a slot was actually dropped, rather than unconditionally.
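
The hazard and the fix can be illustrated outside PostgreSQL with a toy
model.  This is only a sketch under stated assumptions: ToySlot,
toy_create(), toy_drop() and drop_and_report() are hypothetical names, not
PostgreSQL APIs, and the concurrent reuse of the freed entry is simulated
inline rather than by a second backend.

```c
#include <stdio.h>

#define MAX_SLOTS 2

/* Hypothetical stand-in for one entry of a shared slot array. */
typedef struct ToySlot
{
    int          in_use;
    char         name[64];
    unsigned int database;
} ToySlot;

static ToySlot slots[MAX_SLOTS];

/* Free the entry; from here on any other user may claim it. */
static void
toy_drop(ToySlot *slot)
{
    slot->in_use = 0;
}

/* Claim the first free entry, as another backend might do. */
static ToySlot *
toy_create(const char *name, unsigned int database)
{
    for (int i = 0; i < MAX_SLOTS; i++)
    {
        if (!slots[i].in_use)
        {
            slots[i].in_use = 1;
            snprintf(slots[i].name, sizeof(slots[i].name), "%s", name);
            slots[i].database = database;
            return &slots[i];
        }
    }
    return NULL;
}

/* Copy the identifying fields first, drop, then use only the copies. */
static void
drop_and_report(ToySlot *slot, char *out, size_t outlen)
{
    char         saved_name[64];
    unsigned int saved_db = slot->database;

    snprintf(saved_name, sizeof(saved_name), "%s", slot->name);

    toy_drop(slot);
    /* Simulated concurrent reuse: the freed entry now holds other data. */
    toy_create("other_slot", 99);

    snprintf(out, outlen,
             "dropped replication slot \"%s\" of database with OID %u",
             saved_name, saved_db);
}
```

Building the message from slot->name and slot->database after toy_drop()
would report the reused entry's values instead, which is the kind of
misreport this commit avoids.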

Author: Xuneng Zhou <xunengzhou@gmail.com>
Reviewed-by: Zhijie Hou <houzj.fnst@fujitsu.com>
Reviewed-by: Amit Kapila <amit.kapila16@gmail.com>
Reviewed-by: Fujii Masao <masao.fujii@gmail.com>
Backpatch-through: 17, where it was introduced
Discussion: https://postgr.es/m/TY4PR01MB177184FF9EE916F577E1F554194082@TY4PR01MB17718.jpnprd01.prod.outlook.com
Amit Kapila 2026-06-18 09:50:33 +05:30
parent 850b9218c8
commit bdae2c20e8

@@ -541,6 +541,7 @@ drop_local_obsolete_slots(List *remote_slot_list)
 		/* Drop the local slot if it is not required to be retained. */
 		if (!local_sync_slot_required(local_slot, remote_slot_list))
 		{
+			Oid slot_database = local_slot->data.database;
 			bool synced_slot;
 
 			/*
@@ -548,8 +549,8 @@ drop_local_obsolete_slots(List *remote_slot_list)
 			 * ReplicationSlotsDropDBSlots(), trying to drop the same slot
 			 * during a drop-database operation.
 			 */
-			LockSharedObject(DatabaseRelationId, local_slot->data.database,
-			                 0, AccessShareLock);
+			LockSharedObject(DatabaseRelationId, slot_database, 0,
+			                 AccessShareLock);
 
 			/*
 			 * In the small window between getting the slot to drop and
@@ -566,23 +567,25 @@ drop_local_obsolete_slots(List *remote_slot_list)
 
 			if (synced_slot)
 			{
+				NameData slot_name = local_slot->data.name;
+
 				/*
 				 * Now acquire and drop the slot. Note we purposely don't
 				 * request logical decoding to be disabled here: since this is
 				 * a standby, which derives its logical decoding state from
 				 * the primary, it would be wrong to do so.
 				 */
-				ReplicationSlotAcquire(NameStr(local_slot->data.name), true, false);
+				ReplicationSlotAcquire(NameStr(slot_name), true, false);
 				ReplicationSlotDropAcquired(false);
+
+				ereport(LOG,
+				        errmsg("dropped replication slot \"%s\" of database with OID %u",
+				               NameStr(slot_name),
+				               slot_database));
 			}
 
-			UnlockSharedObject(DatabaseRelationId, local_slot->data.database,
-			                   0, AccessShareLock);
-
-			ereport(LOG,
-			        errmsg("dropped replication slot \"%s\" of database with OID %u",
-			               NameStr(local_slot->data.name),
-			               local_slot->data.database));
+			UnlockSharedObject(DatabaseRelationId, slot_database, 0,
+			                   AccessShareLock);
 		}
 	}
 }