Back in 2003 we had a discussion about how to decide which casts to dump.
At the time pg_dump really only considered an object's containing schema
to decide what to dump (ie, dump whatever's not in pg_catalog), and so
we chose a complicated idea involving whether the underlying types were to
be dumped (cf commit a6790ce857). But users
are allowed to create casts between built-in types, and we failed to dump
such casts. Let's get rid of that heuristic, which has accreted even more
ugliness since then, in favor of just looking at the cast's OID to decide
if it's a built-in cast or not.
In passing, also fix some really ancient code that supposed that it had to
manufacture a dependency for the cast on its cast function; that's only
true when dumping from a pre-7.3 server. This just resulted in some wasted
cycles and duplicate dependency-list entries with newer servers, but we
might as well improve it.
Per gripes from a number of people, most recently Greg Sabino Mullane.
Back-patch to all supported branches.
Since 9.3, when the --jobs option was introduced, using it together
with the --serializable-deferrable option generated multiple
errors. We can get correct behavior by allowing the connection
which acquires the snapshot to use SERIALIZABLE, READ ONLY,
DEFERRABLE and pass that to the workers running the other
connections using REPEATABLE READ, READ ONLY. This is a bit of a
kluge since the SERIALIZABLE behavior is achieved by running some
of the participating connections at a different isolation level,
but it is a simple and safe change, suitable for back-patching.
This will be followed by a proposal for a more invasive fix with
some slight behavioral changes on just the master branch, based on
suggestions from Andres Freund, but the kluge will be applied to
master until something is agreed along those lines.
Back-patched to 9.3, where the --jobs option was added.
Based on report from Alexander Korotkov
pg_dump.c:dumDatabase() called ArchiveEntry() with the results of a a
query that was PQclear()ed a couple lines earlier.
Backpatch to 9.2 where security labels for shared objects where
introduced.
This never worked, I think. Per report from Marc Munro.
In passing, fix funny spacing in the COMMENT ON command as a result of
excess space in the "label" string.
In passing, also make some debugging elog's in pgstat.c a bit more
consistently worded.
Back-patch as far as applicable (9.3 or 9.4; none of these mistakes are
really old).
Mark Dilger identified and patched the type violations; the message
rewordings are mine.
pg_dump/parallel.c was using realloc() directly with no error check.
While the odds of an actual failure here seem pretty low, Coverity
complains about it, so fix by using pg_realloc() instead.
While looking for other instances, I noticed a couple of places in
psql that hadn't gotten the memo about the availability of pg_realloc.
These aren't bugs, since they did have error checks, but verbosely
inconsistent code is not a good thing.
Back-patch as far as 9.3. 9.2 did not have pg_dump/parallel.c, nor
did it have pg_realloc available in all frontend code.
Fix breakage induced by commits d8d3d2a4f3
and 463f2625a5: pg_dumpall has crashed when
attempting to dump from pre-8.1 servers since then, due to faulty
construction of the query used for dumping roles from older servers.
The query was erroneous as of the earlier commit, but it wasn't exposed
unless you tried to use --binary-upgrade, which you presumably wouldn't
with a pre-8.1 server. However commit 463f2625a made it fail always.
In HEAD, also fix additional breakage induced in the same query by
commit 491c029dbc, which evidently wasn't
tested against pre-8.1 servers either.
The bug is only latent in 9.1 because 463f2625a hadn't landed yet, but
it seems best to back-patch all branches containing the faulty query.
Gilles Darold
This reverts nearly all of commit 28f6cab61a
in favor of just using the typrelid we already have in pg_dump's TypeInfo
struct for the composite type. As coded, it'd crash if the composite type
had no attributes, since then the query would return no rows.
Back-patch to all supported versions. It seems to not really be a problem
in 9.0 because that version rejects the syntax "create type t as ()", but
we might as well keep the logic similar in all affected branches.
Report and fix by Rushabh Lathia.
Without this fix, parallel restore of a schema-only dump can deadlock,
because when the dump is schema-only, the dependency will still be
pointing at the TABLE item rather than the TABLE DATA item.
Robert Haas and Tom Lane
findDependencyLoops() was not bright about cases where there are multiple
dependency paths between the same two dumpable objects. In most scenarios
this did not hurt us too badly; but since the introduction of section
boundary pseudo-objects in commit a1ef01fe16,
it was possible for this code to take unreasonable amounts of time (tens
of seconds on a database with a couple thousand objects), as reported in
bug #11033 from Joe Van Dyk. Joe's particular problem scenario involved
"pg_dump -a" mode with long chains of foreign key constraints, but I think
that similar problems could arise with other situations as long as there
were enough objects. To fix, add a flag array that lets us notice when we
arrive at the same object again while searching from a given start object.
This simple change seems to be enough to eliminate the performance problem.
Back-patch to 9.1, like the patch that introduced section boundary objects.
Prior to 9.0, pg_dump handled comments on large objects by dumping a bunch
of COMMENT commands into a single BLOB COMMENTS archive object. With
sufficiently many such comments, some of the commands would likely get
split across bufferloads when restoring, causing failures in
direct-to-database restores (though no problem would be evident in text
output). This is the same type of issue we have with table data dumped as
INSERT commands, and it can be fixed in the same way, by using a mini SQL
lexer to figure out where the command boundaries are. Fortunately, the
COMMENT commands are no more complex to lex than INSERTs, so we can just
re-use the existing lexer for INSERTs.
Per bug #10611 from Jacek Zalewski. Back-patch to all active branches.
This was not changed in HEAD, but will be done later as part of a
pgindent run. Future pgindent runs will also do this.
Report by Tom Lane
Backpatch through all supported branches, but not HEAD
According to the Single Unix Spec and assorted man pages, you're supposed
to use the constants named AF_xxx when setting ai_family for a getaddrinfo
call. In a few places we were using PF_xxx instead. Use of PF_xxx
appears to be an ancient BSD convention that was not adopted by later
standardization. On BSD and most later Unixen, it doesn't matter much
because those constants have equivalent values anyway; but nonetheless
this code is not per spec.
In the same vein, replace PF_INET by AF_INET in one socket() call, which
wasn't even consistent with the other socket() call in the same function
let alone the remainder of our code.
Per investigation of a Cygwin trouble report from Marco Atzeri. It's
probably a long shot that this will fix his issue, but it's wrong in
any case.
It is possible for a view or materialized view to depend on a table's
primary key, if the view query relies on functional dependency to
abbreviate a GROUP BY list. This is problematic for pg_dump since we
ordinarily want to dump view definitions in the pre-data section but
indexes in post-data. pg_dump knows how to deal with this situation for
regular views, by breaking the view's ON SELECT rule apart from the view
proper. But it had not been taught what to do about materialized views,
and in fact mistakenly dumped them as regular views in such cases, as
seen in bug #9616 from Jesse Denardo.
If we had CREATE OR REPLACE MATERIALIZED VIEW, we could fix this in a
manner analogous to what's done for regular views; but we don't yet,
and we'd not back-patch such a thing into 9.3 anyway. As a hopefully-
temporary workaround, break the circularity by postponing the matview
into post-data altogether when this case occurs.
Clear errno before calling readdir() and handle old MinGW errno bug
while adding full test coverage for readdir/closedir failures.
Backpatch through 8.4.
During parallel pg_dump, a worker process closing the connection caused
a minor memory leak (particularly minor as we are likely about to exit
anyway). Instead, free the memory in this case prior to returning NULL
to indicate connection closed.
Spotting by the Coverity scanner.
Back patch to 9.3 where this was introduced.
CREATE EVENT TRIGGER forgot to mark the event trigger as a member of its
extension, and pg_dump didn't pay any attention anyway when deciding
whether to dump the event trigger. Per report from Moshe Jacobson.
Given the obvious lack of testing here, it's rather astonishing that
ALTER EXTENSION ADD/DROP EVENT TRIGGER work, but they seem to.
There was an apparent attempt to limit the target database for
pg_restore to version 7.1.0 or later. Due to a leading zero this
was interpreted as an octal number, which allowed targets with
version numbers down to 2.87.36. The lowest actual release above
that was 6.0.0, so that was effectively the limit.
Since the success of the restore attempt will depend primarily on
on what statements were generated by the dump run, we don't want
pg_restore trying to guess whether a given target should be allowed
based on version number. Allow a connection to any version. Since
it is very unlikely that anyone would be using a recent version of
pg_restore to restore to a pre-6.0 database, this has little to no
practical impact, but it makes the code less confusing to read.
Issue reported and initial patch suggestion from Joel Jacobson
based on an article by Andrey Karpov reporting on issues found by
PVS-Studio static code analyzer. Final patch based on analysis by
Tom Lane. Back-patch to all supported branches.
pg_dumpall's charter is to be able to recreate a database cluster's
contents in a virgin installation, but it was failing to honor that
contract if the cluster had any ALTER DATABASE SET
default_transaction_read_only settings. By including a SET command
for the connection for each connection opened by pg_dumpall output,
errors are avoided and the source cluster is successfully
recreated.
There was discussion of whether to also set this for the connection
applying pg_dump output, but it was felt that it was both less
appropriate in that context, and far easier to work around.
Backpatch to all supported branches.
In pg_dump.c:getEventTriggers, check what major version we are on
before calling createPQExpBuffer() to avoid leaking that bit of
memory.
Leak discovered by the Coverity scanner.
Back-patch to 9.3 where support for dumping event triggers was
added.
The command strings read by the child processes during parallel
pg_dump, after being read and handled, were not being free'd.
This patch corrects this relatively minor memory leak.
Leak found by the Coverity scanner.
Back patch to 9.3 where parallel pg_dump was introduced.
When there's a comment on an index that was created with UNIQUE or PRIMARY
KEY constraint syntax, we need to label the comment as depending on the
constraint not the index, since only the constraint object actually appears
in the dump. This incorrect dependency can lead to parallel pg_restore
trying to restore the comment before the index has been created, per bug
#8257 from Lloyd Albin.
This patch fixes pg_dump to produce the right dependency in dumps made
in the future. Usually we also try to hack pg_restore to work around
bogus dependencies, so that existing (wrong) dumps can still be restored in
parallel mode; but that doesn't seem practical here since there's no easy
way to relate the constraint dump entry to the comment after the fact.
Andres Freund
In binary upgrade mode, we need to recreate and then drop dropped
columns so that all the columns get the right attribute number. This is
true for foreign tables as well as for native tables. For foreign
tables we have been getting the first part right but not the second,
leading to bogus columns in the upgraded database. Fix this all the way
back to 9.1, where foreign tables were introduced.
getSchemaData() must identify extension member objects and mark them
as not to be dumped. This must happen after reading all objects that can be
direct members of extensions, but before we begin to process table subsidiary
objects. Both rules and event triggers were wrong in this regard.
Backport rules portion of patch to 9.1 -- event triggers do not exist prior to 9.3.
Suggested fix by Tom Lane, initial complaint and patch by me.
Previously this state was represented by whether the view's disk file had
zero or nonzero size, which is problematic for numerous reasons, since it's
breaking a fundamental assumption about heap storage. This was done to
allow unlogged matviews to revert to unpopulated status after a crash
despite our lack of any ability to update catalog entries post-crash.
However, this poses enough risk of future problems that it seems better to
not support unlogged matviews until we can find another way. Accordingly,
revert that choice as well as a number of existing kluges forced by it
in favor of creating a pg_class.relispopulated flag column.
The intent was that being populated would, long term, be just one
of the conditions which could affect whether a matview was
scannable; being populated should be necessary but not always
sufficient to scan the relation. Since only CREATE and REFRESH
currently determine the scannability, names and comments
accidentally conflated these concepts, leading to confusion.
Also add missing locking for the SQL function which allows a
test for scannability, and fix a modularity violatiion.
Per complaints from Tom Lane, although its not clear that these
will satisfy his concerns. Hopefully this will at least better
frame the discussion.
9.2 uses a kluge representation of "indislive"; we have to account for
that when examining pg_index. Simplest solution is to check indisready
for 9.0 and 9.1 as well; that's harmless though unnecessary, so it's
not worth making a version distinction for.
Fixes oversight in commit 683abc73df,
as noted by Andres Freund.
Move functions used only by pg_dump and pg_restore from dumputils.c to a new
file, pg_backup_utils.c. dumputils.c is linked into psql and some programs
in bin/scripts, so it seems good to keep it slim. The parallel functionality
is moved to parallel.c, as is exit_horribly, because the interesting code in
exit_horribly is parallel-related.
This refactoring gets rid of the on_exit_msg_func function pointer. It was
problematic, because a modern gcc version with -Wmissing-format-attribute
complained if it wasn't marked with PF_PRINTF_ATTRIBUTE, but the ancient gcc
version that Tom Lane's old HP-UX box has didn't accept that attribute on a
function pointer, and gave an error. We still use a similar function pointer
trick for getLocalPQBuffer() function, to use a thread-local version of that
in parallel mode on Windows, but that dodges the problem because it doesn't
take printf-like arguments.