We allow INSERT and UPDATE commands to assign to the same column more than
once, as long as the assignments are to subfields or elements rather than
the whole column. However, this failed when the target column was a domain
over array rather than plain array. Fix by teaching process_matched_tle()
to look through CoerceToDomain nodes, and add relevant test cases.
Also add a group of test cases exercising domains over array of composite.
It's doubtless accidental that CREATE DOMAIN allows this case while not
allowing straight domain over composite; but it does, so we'd better make
sure we don't break it. (I could not find any documentation mentioning
either side of that, so no doc changes.)
It's been like this for a long time, so back-patch to all supported
branches.
Discussion: https://postgr.es/m/4206.1499798337@sss.pgh.pa.us
We've heard occasional reports of backend launch failing because
pgwin32_ReserveSharedMemoryRegion() fails, indicating that something
has already used that address space in the child process. It's not
very clear what, given that we disable ASLR in Windows builds, but
suspicion falls on antivirus products. It'd be better if we didn't
have to disable ASLR, anyway. So let's try to ameliorate the problem
by retrying the process launch after such a failure, up to 100 times.
Patch by me, based on previous work by Amit Kapila and others.
This is a longstanding issue, so back-patch to all supported branches.
Discussion: https://postgr.es/m/CAA4eK1+R6hSx6t_yvwtx+NRzneVp+MRqXAdGJZChcau8Uij-8g@mail.gmail.com
Buildfarm evidence shows that TCP_KEEPALIVE_THRESHOLD doesn't exist
after all on Solaris < 11. This means we need to take positive action to
prevent the TCP_KEEPALIVE code path from being taken on that platform.
I've chosen to limit it with "&& defined(__darwin__)", since it's unclear
that anyone else would follow Apple's precedent of spelling the symbol
that way.
Also, follow a suggestion from Michael Paquier of eliminating code
duplication by defining a couple of intermediate symbols for the
socket option.
In passing, make some effort to reduce the number of translatable messages
by replacing "setsockopt(foo) failed" with "setsockopt(%s) failed", etc,
throughout the affected files. And update relevant documentation so
that it doesn't claim to provide an exhaustive list of the possible
socket option names.
Like the previous commit (f0256c774), back-patch to all supported branches.
Discussion: https://postgr.es/m/20170627163757.25161.528@wrigleys.postgresql.org
Turns out that the socket option for this is named TCP_KEEPALIVE_THRESHOLD,
at least according to the tcp(7P) man page for Solaris 11. (But since that
text refers to "SunOS", it's likely pretty ancient.) It appears that the
symbol TCP_KEEPALIVE does get defined on that platform, but it doesn't
seem to represent a valid protocol-level socket option. This leads to
bleats in the postmaster log, and no tcp_keepalives_idle functionality.
Per bug #14720 from Andrey Lizenko, as well as an earlier report from
Dhiraj Chawla that nobody had followed up on. The issue's been there
since we added the TCP_KEEPALIVE code path in commit 5acd417c8, so
back-patch to all supported branches.
Discussion: https://postgr.es/m/20170627163757.25161.528@wrigleys.postgresql.org
check_agg_arguments_walker threw an error upon seeing a SRF or window
function, but that is too aggressive: if the function is within a
sub-select then it's perfectly fine. I broke the SRF case in commit
0436f6bde by copying the logic for window functions ... but that was
broken too, and had been since commit eaccfded9.
Repair both cases in HEAD, and the window function case back to 9.3.
9.2 gets this right.
When a walreceiver dies, the startup process will notice that and send
a PMSIGNAL_START_WALRECEIVER signal to the postmaster, asking for a new
walreceiver to be launched. There's a race condition, which at least
in HEAD is very easy to hit, whereby the postmaster might see that
signal before it processes the SIGCHLD from the walreceiver process.
In that situation, sigusr1_handler() just dropped the start request
on the floor, reasoning that it must be redundant. Eventually, after
10 seconds (WALRCV_STARTUP_TIMEOUT), the startup process would make a
fresh request --- but that's a long time if the connection could have
been re-established almost immediately.
Fix it by setting a state flag inside the postmaster that we won't
clear until we do launch a walreceiver. In cases where that results
in an extra walreceiver launch, it's up to the walreceiver to realize
it's unwanted and go away --- but we have, and need, that logic anyway
for the opposite race case.
I came across this through investigating unexpected delays in the
src/test/recovery TAP tests: it manifests there in test cases where
a master server is stopped and restarted while leaving streaming
slaves active.
This logic has been broken all along, so back-patch to all supported
branches.
Discussion: https://postgr.es/m/21344.1498494720@sss.pgh.pa.us
The stats collector disregards inquiry messages that bear a cutoff_time
before when it last wrote the relevant stats file. That's fine, but at
startup when it reads the "permanent" stats files, it absorbed their
timestamps as if they were the times at which the corresponding temporary
stats files had been written. In reality, of course, there's no data
out there at all. This led to disregarding inquiry messages soon after
startup if the postmaster had been shut down and restarted within less
than PGSTAT_STAT_INTERVAL; which is a pretty common scenario, both for
testing and in the field. Requesting backends would hang for 10 seconds
and then report failure to read statistics, unless they got bailed out
by some other backend coming along and making a newer request within
that interval.
I came across this through investigating unexpected delays in the
src/test/recovery TAP tests: it manifests there because the autovacuum
launcher hangs for 10 seconds when it can't get statistics at startup,
thus preventing a second shutdown from occurring promptly. We might
want to do some things in the autovac code to make it less prone to
getting stuck that way, but this change is a good bug fix regardless.
In passing, also fix pgstat_read_statsfiles() to ensure that it
re-zeroes its global stats variables if they are corrupted by a
short read from the stats file. (Other reads in that function
go into temp variables, so that the issue doesn't arise.)
This has been broken since we created the separation between permanent
and temporary stats files in 8.4, so back-patch to all supported branches.
Discussion: https://postgr.es/m/16860.1498442626@sss.pgh.pa.us
When promoting a standby just after a XLOG_SWITCH record was replayed,
and next segment(s) are already are locally available (via walsender,
restore_command + trigger/recovery target), that segment could
accidentally be recycled onto the past of the new timeline. Later
checkpointer would create a .ready file for it, assuming there was an
error during creation, and it would get archived. That causes trouble
if another standby is later brought up from a basebackup from before
the timeline creation, because it would try to read the
segment, because XLogFileReadAnyTLI just tries all possible timelines,
which doesn't have valid contents. Thus replay would fail.
The problem, if already occurred, can be fixed by removing the segment
and/or having restore_command filter it out.
The reason for the creation of such "phantom" segments was, that after
an XLOG_SWITCH record the EndOfLog variable points to the beginning of
the next segment, and RemoveXlogFile() used XLByteToPrevSeg().
Normally RemoveXlogFile() doing so is harmless, because the last
segment will still exist preventing InstallXLogFileSegment() from
causing harm, but just after promotion there's no previous segment on
the new timeline.
Fix that by using XLByteToSeg() instead of XLByteToPrevSeg().
Author: Andres Freund
Reported-By: Greg Burek
Discussion: https://postgr.es/m/20170619073026.zcwpe6mydsaz5ygd@alap3.anarazel.de
Backpatch: 9.2-, bug older than all supported versions
When a new base type is created using the old-style procedure of first
creating the input/output functions with "opaque" in place of the base
type, the "opaque" argument/return type is changed to the final base type,
on CREATE TYPE. However, we did not create a pg_depend record when doing
that, so the functions were left not depending on the type.
Fixes bug #14706, reported by Karen Huddleston.
Discussion: https://www.postgresql.org/message-id/20170614232259.1424.82774@wrigleys.postgresql.org
We had three occurrences of essentially the same coding pattern
wherein we tried to retrieve a query result from a libpq connection
without blocking. In the case where PQconsumeInput failed (typically
indicating a lost connection), all three loops simply gave up and
returned, forgetting to clear any previously-collected PGresult
object. Since those are malloc'd not palloc'd, the oversight results
in a process-lifespan memory leak.
One instance, in libpqwalreceiver, is of little significance because
the walreceiver process would just quit anyway if its connection fails.
But we might as well fix it.
The other two instances, in postgres_fdw, are somewhat more worrisome
because at least in principle the scenario could be repeated, allowing
the amount of memory leaked to build up to something worth worrying
about. Moreover, in these cases the loops contain CHECK_FOR_INTERRUPTS
calls, as well as other calls that could potentially elog(ERROR),
providing another way to exit without having cleared the PGresult.
Here we need to add PG_TRY logic similar to what exists in quite a
few other places in postgres_fdw.
Coverity noted the libpqwalreceiver bug; I found the other two cases
by checking all calls of PQconsumeInput.
Back-patch to all supported versions as appropriate (9.2 lacks
postgres_fdw, so this is really quite unexciting for that branch).
Discussion: https://postgr.es/m/22620.1497486981@sss.pgh.pa.us
Because walsender and normal backends share the same main loop it's
problematic to have two different flag variables, set in signal
handlers, indicating a pending configuration reload. Only certain
walsender commands reach code paths checking for the
variable (START_[LOGICAL_]REPLICATION, CREATE_REPLICATION_SLOT
... LOGICAL, notably not base backups).
This is a bug present since the introduction of walsender, but has
gotten worse in releases since then which allow walsender to do more.
A later patch, not slated for v10, will similarly unify SIGHUP
handling in other types of processes as well.
Author: Petr Jelinek, Andres Freund
Reviewed-By: Michael Paquier
Discussion: https://postgr.es/m/20170423235941.qosiuoyqprq4nu7v@alap3.anarazel.de
Backpatch: 9.2-, bug is present since 9.0
The NumericOnly grammar production accepted ICONST, + ICONST, - ICONST,
FCONST, and - FCONST, but for some reason not + FCONST. This led to
strange inconsistencies like
regression=# set random_page_cost = +4;
SET
regression=# set random_page_cost = 4000000000;
SET
regression=# set random_page_cost = +4000000000;
ERROR: syntax error at or near "4000000000"
(because 4000000000 is too large to be an ICONST). While there's
no actual functional reason to need to write a "+", if we allow
it for integers it seems like we should allow it for numerics too.
It's been like that forever, so back-patch to all supported branches.
Discussion: https://postgr.es/m/30908.1496006184@sss.pgh.pa.us
Commit 9aa3c782c added code to allow CREATE TABLE/CREATE TYPE to not fail
when the desired type name conflicts with an autogenerated array type, by
dint of renaming the array type out of the way. But I (tgl) overlooked
that the same case arises in ALTER TABLE/TYPE RENAME. Fix that too.
Back-patch to all supported branches.
Report and patch by Vik Fearing, modified a bit by me
Discussion: https://postgr.es/m/0f4ade49-4f0b-a9a3-c120-7589f01d1eb8@2ndquadrant.com
This patch replaces isspace() calls with scanner_isspace() in functions
that are likely to be presented with non-ASCII input. isspace() has
the small advantage that it will correctly recognize no-break space
in single-byte encodings (such as LATIN1); but it cannot work successfully
for any multibyte character, and depending on platform it might return
false positive results for some fragments of multibyte characters. That's
disastrous for functions that are trying to discard whitespace between
valid strings, as noted in bug #14662 from Justin Muise. Even treating
no-break space as whitespace is pretty questionable for the usages touched
here, because the core scanner would think it is an identifier character.
Affected functions are parse_ident(), parseNameAndArgTypes (underlying
regprocedurein() and siblings), SplitIdentifierString (used for parsing
GUCs and options that are qualified names or lists of names), and
SplitDirectoriesString (used for parsing GUCs that are lists of
directories).
All the functions adjusted here are parsing SQL identifiers and similar
constructs, so it's reasonable to insist that their definition of
whitespace match the core scanner. So we can hope that this won't cause
many backwards-compatibility problems. I've left alone isspace() calls
in places that aren't really expecting any non-ASCII input characters,
such as float8in().
Back-patch to all supported branches.
Discussion: https://postgr.es/m/10129.1495302480@sss.pgh.pa.us
The cash_div_intX functions applied rint() to the result of the division.
That's not merely useless (because the result is already an integer) but
it causes precision loss for values larger than 2^52 or so, because of
the forced conversion to float8.
On the other hand, the cash_mul_fltX functions neglected to apply rint() to
their multiplication results, thus possibly causing off-by-one outputs.
Per C standard, arithmetic between any integral value and a float value is
performed in float format. Thus, cash_mul_flt4 and cash_div_flt4 produced
answers good to only about six digits, even when the float value is exact.
We can improve matters noticeably by widening the float inputs to double.
(It's tempting to consider using "long double" arithmetic if available,
but that's probably too much of a stretch for a back-patched fix.)
Also, document that cash_div_intX operators truncate rather than round.
Per bug #14663 from Richard Pistole. Back-patch to all supported branches.
Discussion: https://postgr.es/m/22403.1495223615@sss.pgh.pa.us
Values in a STATISTIC_KIND_RANGE_LENGTH_HISTOGRAM slot are float8,
not of the type of the column the statistics are for.
This bug is at least partly the fault of sloppy specification comments
for get_attstatsslot()/free_attstatsslot(): the type OID they want is that
of the stavalues entries, not of the underlying column. (I double-checked
other callers and they seem to get this right.) Adjust the comments to be
more correct.
Per buildfarm.
Security: CVE-2017-7484
Both views replace the umoptions field with NULL when the user does not
meet qualifications to see it. They used different qualifications, and
pg_user_mappings documented qualifications did not match its implemented
qualifications. Make its documentation and implementation match those
of user_mapping_options. One might argue for stronger qualifications,
but these have long, documented tenure. pg_user_mappings has always
exhibited this problem, so back-patch to 9.2 (all supported versions).
Michael Paquier and Feike Steenbergen. Reviewed by Jeff Janes.
Reported by Andrew Wheelwright.
Security: CVE-2017-7486
Some selectivity estimation functions run user-supplied operators over
data obtained from pg_statistic without security checks, which allows
those operators to leak pg_statistic data without having privileges on
the underlying tables. Fix by checking that one of the following is
satisfied: (1) the user has table or column privileges on the table
underlying the pg_statistic data, or (2) the function implementing the
user-supplied operator is leak-proof. If neither is satisfied, planning
will proceed as if there are no statistics available.
At least one of these is satisfied in most cases in practice. The only
situations that are negatively impacted are user-defined or
not-leak-proof operators on a security-barrier view.
Reported-by: Robert Haas <robertmhaas@gmail.com>
Author: Peter Eisentraut <peter_e@gmx.net>
Author: Tom Lane <tgl@sss.pgh.pa.us>
Security: CVE-2017-7484
It only produced <row> elements but no wrapping <table> element.
By contrast, cursor_to_xmlschema produced a schema that is now correct
but did not previously match the XML data produced by cursor_to_xml.
In passing, also fix a minor misunderstanding about moving cursors in
the tests related to this.
Reported-by: filip@jirsak.org
Based-on-patch-by: Thomas Munro <thomas.munro@enterprisedb.com>
Due to a missing CommandCounterIncrement() call, parsing of a non-utility
command in an extension script would not see the effects of the immediately
preceding DDL command, unless that command's execution ends with
CommandCounterIncrement() internally ... which some do but many don't.
Report by Philippe Beaudoin, diagnosis by Julien Rouhaud.
Rather remarkably, this bug has evaded detection since extensions were
invented, so back-patch to all supported branches.
Discussion: https://postgr.es/m/2cf7941e-4e41-7714-3de8-37b1a8f74dff@free.fr
Currently, trying to validate a NO INHERIT constraint on the parent will
search for the constraint in child tables (where it is not supposed to
exist), wrongly causing a "constraint does not exist" error.
Amit Langote, per a report from Hans Buschmann.
Discussion: http://postgr.es/m/20170421184012.24362.19@wrigleys.postgresql.org
ProcessTwoPhaseBuffer (formerly StandbyRecoverPreparedTransactions)
mixed up the parent and child XIDs when calling SubTransSetParent to
record the transactions' relationship in pg_subtrans.
Remarkably, analysis by Simon Riggs suggests that this doesn't lead to
visible problems (at least, not in non-Assert builds). That might
explain why we'd not noticed it before. Nonetheless, it's surely wrong.
This code was born broken, so back-patch to all supported branches.
Discussion: https://postgr.es/m/20110.1492905318@sss.pgh.pa.us
The POSIX standard does not say that the success return value for
fcntl(F_SETFD) and fcntl(F_SETFL) is zero; it says only that it's not -1.
We had several calls that were making the stronger assumption. Adjust
them to test specifically for -1 for strict spec compliance.
The standard further leaves open the possibility that the O_NONBLOCK
flag bit is not the only active one in F_SETFL's argument. Formally,
therefore, one ought to get the current flags with F_GETFL and store
them back with only the O_NONBLOCK bit changed when trying to change
the nonblock state. In port/noblock.c, we were doing the full pushup
in pg_set_block but not in pg_set_noblock, which is just weird. Make
both of them do it properly, since they have little business making
any assumptions about the socket they're handed. The other places
where we're issuing F_SETFL are working with FDs we just got from
pipe(2), so it's reasonable to assume the FDs' properties are all
default, so I didn't bother adding F_GETFL steps there.
Also, while pg_set_block deserves some points for trying to do things
right, somebody had decided that it'd be even better to cast fcntl's
third argument to "long". Which is completely loony, because POSIX
clearly says the third argument for an F_SETFL call is "int".
Given the lack of field complaints, these missteps apparently are not
of significance on any common platforms. But they're still wrong,
so back-patch to all supported branches.
Discussion: https://postgr.es/m/30882.1492800880@sss.pgh.pa.us
This back-patches 9.4 commits 31cf1a1a4, 86029b31e, and 36a3be654 into
the prior branches, along with relevant bits of b1aebbb6a and 7ce2a45ae.
We had foreseen doing this once the code was proven, but that never did
happen, probably because we got sufficiently fed up with renegotiation
to disable it by default. However, we have to do something now because
the prior code doesn't even compile against OpenSSL 1.1. Per discussion,
the best solution seems to be to make the older branches look like 9.4.
Discussion: https://postgr.es/m/20047.1492305247@sss.pgh.pa.us
In standard non-Windows builds, there's no particular reason to care what
address the kernel chooses to map the shared memory segment at. However,
when building with EXEC_BACKEND, there's a risk that the chosen address
won't be available in all child processes. Linux with ASLR enabled (which
it is by default) seems particularly at risk because it puts shmem segments
into the same area where it maps shared libraries. We can work around
that by specifying a mapping address that's outside the range where
shared libraries could get mapped. On x86_64 Linux, 0x7e0000000000
seems to work well.
This is only meant for testing/debugging purposes, so it doesn't seem
necessary to go as far as providing a GUC (or any user-visible
documentation, though we might change that later). Instead, it's just
controlled by setting an environment variable PG_SHMEM_ADDR to the
desired attach address.
Back-patch to all supported branches, since the point here is to
remove intermittent buildfarm failures on EXEC_BACKEND animals.
Owners of affected animals will need to add a suitable setting of
PG_SHMEM_ADDR to their build_env configuration.
Discussion: https://postgr.es/m/7036.1492231361@sss.pgh.pa.us
regexport.c thought it could just ignore LACON arcs, but the correct
behavior is to treat them as satisfiable while consuming zero input
(rather reminiscently of commit 9f1e642d5). Otherwise, the emitted
simplified-NFA representation may contain no paths leading from initial
to final state, which unsurprisingly confuses pg_trgm, as seen in
bug #14623 from Jeff Janes.
Since regexport's output representation has no concept of an arc that
consumes zero input, recurse internally to find the next normal arc(s)
after any LACON transitions. We'd be forced into changing that
representation if a LACON could be the last arc reaching the final
state, but fortunately the regex library never builds NFAs with such
a configuration, so there always is a next normal arc.
Back-patch to 9.3 where this logic was introduced.
Discussion: https://postgr.es/m/20170413180503.25948.94871@wrigleys.postgresql.org
HandleFunctionRequest() is no longer responsible for reading the protocol
message from the client, since commit 2b3a8b20c2. Fix the outdated
comments.
HandleFunctionRequest() now always returns 0, because the code that used
to return EOF was moved in 2b3a8b20c2. Therefore, the caller no longer
needs to check the return value.
Reported by Andres Freund. Backpatch to all supported versions, even though
this doesn't have any user-visible effect, to make backporting future
patches in this area easier.
Discussion: https://www.postgresql.org/message-id/20170405010525.rt5azbya5fkbhvrx@alap3.anarazel.de
The compiler is entitled to store a char[] local variable with no
particular alignment requirement. Our RADIUS code cavalierly took such
a local variable and cast its address to a struct type that does have
alignment requirements. On an alignment-picky machine this would lead
to bus errors. To fix, declare the local variable honestly, and then
cast its address to char * for use in the I/O calls.
Given the lack of field complaints, there must be very few if any
people affected; but nonetheless this is a clear portability issue,
so back-patch to all supported branches.
Noted while looking at a Coverity complaint in the same code.
This reverts commit 38bdba54a6, "Fix and
simplify check for whether we're running as Windows service". It turns out
that older versions of MinGW - like that on buildfarm member narwhal - do
not support the CheckTokenMembership() function. This replaces the
refactoring with a much smaller fix, to add a check for SE_GROUP_ENABLED to
pgwin32_is_service().
Only apply to back-branches, and keep the refactoring in HEAD. It's
unlikely that anyone is still really using such an old version of MinGW -
aside from narwhal - but let's not change the minimum requirements in
minor releases.
Discussion: https://www.postgresql.org/message-id/16609.1489773427@sss.pgh.pa.us
Patch: https://www.postgresql.org/message-id/CAB7nPqSvfu%3DKpJ%3DNX%2BYAHmgAmQdzA7N5h31BjzXeMgczhGCC%2BQ%40mail.gmail.com
If the process token contains SECURITY_SERVICE_RID, but it has been
disabled by the SE_GROUP_USE_FOR_DENY_ONLY attribute, win32_is_service()
would incorrectly report that we're running as a service. That situation
arises, e.g. if postmaster is launched with a restricted security token,
with the "Log in as Service" privilege explicitly removed.
Replace the broken code with CheckProcessTokenMembership(), which does
this correctly. Also replace similar code in win32_is_admin(), even
though it got this right, for simplicity and consistency.
Per bug #13755, reported by Breen Hagan. Back-patch to all supported
versions. Patch by Takayuki Tsunakawa, reviewed by Michael Paquier.
Discussion: https://www.postgresql.org/message-id/20151104062315.2745.67143%40wrigleys.postgresql.org
presence of page pins, which leads to serious estimation errors in the
planner. This particularly affects small heavily-accessed tables,
especially where locking (e.g. from FK constraints) forces frequent
vacuums for mxid cleanup.
Fix by keeping separate track of pages whose live tuples were actually
counted vs. pages that were only scanned for freezing purposes. Thus,
reltuples can only be set to 0 if all pages of the relation were
actually counted.
Backpatch to all supported versions.
Per bug #14057 from Nicolas Baccelli, analyzed by me.
Discussion: https://postgr.es/m/20160331103739.8956.94469@wrigleys.postgresql.org
This could result in corruption of the init fork of an unlogged index
if the ambuildempty routine for that index used shared buffers to
create the init fork, which was true for gin, gist, and hash indexes.
Patch by me, based on an earlier patch by Michael Paquier, who also
reviewed this one. This also incorporates an idea from Artur
Zakirov.
Discussion: http://postgr.es/m/CACYUyc8yccE4xfxhqxfh_Mh38j7dRFuxfaK1p6dSNAEUakxUyQ@mail.gmail.com
Several places in fd.c had badly-thought-through handling of error returns
from lseek() and close(). The fact that those would seldom fail on valid
FDs is probably the reason we've not noticed this up to now; but if they
did fail, we'd get quite confused.
LruDelete and LruInsert actually just Assert'd that lseek never fails,
which is pretty awful on its face.
In LruDelete, we indeed can't throw an error, because that's likely to get
called during error abort and so throwing an error would probably just lead
to an infinite loop. But by the same token, throwing an error from the
close() right after that was ill-advised, not to mention that it would've
left the LRU state corrupted since we'd already unlinked the VFD from the
list. I also noticed that really, most of the time, we should know the
current seek position and it shouldn't be necessary to do an lseek here at
all. As patched, if we don't have a seek position and an lseek attempt
doesn't give us one, we'll close the file but then subsequent re-open
attempts will fail (except in the somewhat-unlikely case that a
FileSeek(SEEK_SET) call comes between and allows us to re-establish a known
target seek position). This isn't great but it won't result in any state
corruption.
Meanwhile, having an Assert instead of an honest test in LruInsert is
really dangerous: if that lseek failed, a subsequent read or write would
read or write from the start of the file, not where the caller expected,
leading to data corruption.
In both LruDelete and FileClose, if close() fails, just LOG that and mark
the VFD closed anyway. Possibly leaking an FD is preferable to getting
into an infinite loop or corrupting the VFD list. Besides, as far as I can
tell from the POSIX spec, it's unspecified whether or not the file has been
closed, so treating it as still open could be the wrong thing anyhow.
I also fixed a number of other places that were being sloppy about
behaving correctly when the seekPos is unknown.
Also, I changed FileSeek to return -1 with EINVAL for the cases where it
detects a bad offset, rather than throwing a hard elog(ERROR). It seemed
pretty inconsistent that some bad-offset cases would get a failure return
while others got elog(ERROR). It was missing an offset validity check for
the SEEK_CUR case on a closed file, too.
Back-patch to all supported branches, since all this code is fundamentally
identical in all of them.
Discussion: https://postgr.es/m/2982.1487617365@sss.pgh.pa.us
The loops in ExecHashJoinNewBatch(), ExecHashIncreaseNumBatches(), and
ExecHashRemoveNextSkewBucket() are all capable of iterating over many
tuples without ever doing a CHECK_FOR_INTERRUPTS, so that the backend
might fail to respond to SIGINT or SIGTERM for an unreasonably long time.
Fix that. In the case of ExecHashJoinNewBatch(), it seems useful to put
the added CHECK_FOR_INTERRUPTS into ExecHashJoinGetSavedTuple() rather
than directly in the loop, because that will also ensure that both
principal code paths through ExecHashJoinOuterGetTuple() will do a
CHECK_FOR_INTERRUPTS, which seems like a good idea to avoid surprises.
Back-patch to all supported branches.
Tom Lane and Thomas Munro
Discussion: https://postgr.es/m/6044.1487121720@sss.pgh.pa.us
The ALTER TABLE ALTER TYPE implementation can issue DROP INDEX and
CREATE INDEX to refit existing indexes for the new column type. Since
this CREATE INDEX is an implementation detail of an index alteration,
the ensuing DefineIndex() should skip ACL checks specific to index
creation. It already skips the namespace ACL check. Make it skip the
tablespace ACL check, too. Back-patch to 9.2 (all supported versions).
Reviewed by Tom Lane.
The problem with the original coding here is that we might receive (and
clear) a relcache invalidation signal for the target relation down inside
one of the index_open calls we're doing. Since the target is open, we
would not drop the relcache entry, just reset its rd_indexvalid and
rd_indexlist fields. But RelationGetIndexAttrBitmap() kept going, and
would eventually cache and return potentially-obsolete attribute bitmaps.
The case where this matters is where the inval signal was from a CREATE
INDEX CONCURRENTLY telling us about a new index on a formerly-unindexed
column. (In all other cases, the lock we hold on the target rel should
prevent any concurrent change in index state.) Even just returning the
stale attribute bitmap is not such a problem, because it shouldn't matter
during the transaction in which we receive the signal. What hurts is
caching the stale data, because it can survive into later transactions,
breaking CREATE INDEX CONCURRENTLY's expectation that later transactions
will not create new broken HOT chains. The upshot is that there's a window
for building corrupted indexes during CREATE INDEX CONCURRENTLY.
This patch fixes the problem by rechecking that the set of index OIDs
is still the same at the end of RelationGetIndexAttrBitmap() as it was
at the start. If not, we loop back and try again. That's a little
more than is strictly necessary to fix the bug --- in principle, we
could return the stale data but not cache it --- but it seems like a
bad idea on general principles for relcache to return data it knows
is stale.
There might be more hazards of the same ilk, or there might be a better
way to fix this one, but this patch definitely improves matters and seems
unlikely to make anything worse. So let's push it into today's releases
even as we continue to study the problem.
Pavan Deolasee and myself
Discussion: https://postgr.es/m/CABOikdM2MUq9cyZJi1KyLmmkCereyGp5JQ4fuwKoyKEde_mzkQ@mail.gmail.com