postgresql/src
Tom Lane 787cd2b7d5 Don't treat EINVAL from semget() as a hard failure.
It turns out that on some platforms (at least current macOS, NetBSD,
OpenBSD) semget(2) will return EINVAL if there is a pre-existing
semaphore set with the same key and too few semaphores.  Our code
expects EEXIST in that case and treats EINVAL as a hard failure,
resulting in failure during initdb or postmaster start.

POSIX does document EINVAL for too-few-semaphores-in-set, and is
silent on its priority relative to EEXIST, so this behavior arguably
conforms to spec.  Nonetheless it's quite problematic because EINVAL
is also documented to mean that nsems is greater than the system's
limit on the number of semaphores per set (SEMMSL).  If that is
where the problem lies, retrying would just become an infinite loop.

To resolve this contradiction, retry after EINVAL, but also install a
loop limit that will make us give up regardless of the specific errno
after trying 1000 different keys.  (1000 is a pretty arbitrary number,
but it seems like it should be sufficient.)  I like this better than
the previous infinite-looping behavior, since it will also keep us out
of trouble if (say) we get EACCES due to a system-level permissions
problem rather than anything to do with a specific semaphore set.

This problem has only been observed in the field in PG 17, which uses
a higher nsems value than other branches (cf. 38da05346, 810a8b1c8).
That makes it possible to get the failure if a new v17 postmaster
has a key collision with an existing postmaster of another branch.
In principle though, we might see such a collision against a semaphore
set created by some other application, in which case all branches are
vulnerable on these platforms.  Hence, backpatch.

Reported-by: Gavin Panella <gavinpanella@gmail.com>
Author: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/CALL7chmzY3eXHA7zHnODUVGZLSvK3wYCSP0RmcDFHJY8f28Q3g@mail.gmail.com
Backpatch-through: 13
2025-08-13 11:59:47 -04:00
..
backend Don't treat EINVAL from semget() as a hard failure. 2025-08-13 11:59:47 -04:00
bin Restrict psql meta-commands in plain-text dumps. 2025-08-11 09:00:00 -05:00
common Don't put library-supplied -L/-I switches before user-supplied ones. 2025-07-29 15:17:40 -04:00
fe_utils Fix bug in archive streamer with LZ4 decompression 2025-07-02 13:48:41 +09:00
include Fix security checks in selectivity estimation functions. 2025-08-11 09:07:36 +01:00
interfaces Translation updates 2025-08-11 14:38:06 +02:00
makefiles pgxs.mk: remove unreachable rule for deleting regress.def. 2025-06-20 12:12:29 -04:00
pl Translation updates 2025-08-11 14:38:06 +02:00
port Fix indentation in pg_numa code 2025-07-01 15:24:19 +02:00
template thread-safety: gmtime_r(), localtime_r() 2024-08-23 07:43:04 +02:00
test Restrict psql meta-commands in plain-text dumps. 2025-08-11 09:00:00 -05:00
timezone Update time zone data files to tzdata release 2025b. 2025-04-30 11:13:49 -04:00
tools aio: Regularize IO worker internal naming. 2025-07-12 14:45:34 +12:00
tutorial Doc: simplify the tutorial's window-function examples. 2025-01-21 14:43:21 -05:00
.gitignore
DEVELOPERS
Makefile Remove distprep 2023-11-06 15:18:04 +01:00
Makefile.global.in Don't put library-supplied -L/-I switches before user-supplied ones. 2025-07-29 15:17:40 -04:00
Makefile.shlib Use exported symbols list on macOS for loadable modules as well 2025-06-10 07:04:43 +02:00
meson.build Update copyright for 2025 2025-01-01 11:21:55 -05:00
nls-global.mk Remove distprep 2023-11-06 15:18:04 +01:00