Commit graph

24397 commits

Author SHA1 Message Date
Tom Lane
86dab9c8ad Retry after buffer locking failure during SPGiST index creation.
The original coding thought this case was impossible, but it can happen
if the bgwriter or checkpointer processes decide to write out an index
page while creation is still proceeding, leading to a bogus "unexpected
spgdoinsert() failure" error.  Problem reported by Jonathan S. Katz.

Teodor Sigaev
2013-11-02 16:45:52 -04:00
Tom Lane
14d4548f1c Ensure all files created for a single BufFile have the same resource owner.
Callers expect that they only have to set the right resource owner when
creating a BufFile, not during subsequent operations on it.  While we could
insist this be fixed at the caller level, it seems more sensible for the
BufFile to take care of it.  Without this, some temp files belonging to
a BufFile can go away too soon, eg at the end of a subtransaction,
leading to errors or crashes.

Reported and fixed by Andres Freund.  Back-patch to all active branches.
2013-11-01 16:09:52 -04:00
Tom Lane
2650c5cf4b Fix some odd behaviors when using a SQL-style simple GMT offset timezone.
Formerly, when using a SQL-spec timezone setting with a fixed GMT offset
(called a "brute force" timezone in the code), the session_timezone
variable was not updated to match the nominal timezone; rather, all code
was expected to ignore session_timezone if HasCTZSet was true.  This is
of course obviously fragile, though a search of the code finds only
timeofday() failing to honor the rule.  A bigger problem was that
DetermineTimeZoneOffset() supposed that if its pg_tz parameter was
pointer-equal to session_timezone, then HasCTZSet should override the
parameter.  This would cause datetime input containing an explicit zone
name to be treated as referencing the brute-force zone instead, if the
zone name happened to match the session timezone that had prevailed
before installing the brute-force zone setting (as reported in bug #8572).
The same malady could affect AT TIME ZONE operators.

To fix, set up session_timezone so that it matches the brute-force zone
specification, which we can do using the POSIX timezone definition syntax
"<abbrev>offset", and get rid of the bogus lookaside check in
DetermineTimeZoneOffset().  Aside from fixing the erroneous behavior in
datetime parsing and AT TIME ZONE, this will cause the timeofday() function
to print its result in the user-requested time zone rather than some
previously-set zone.  It might also affect results in third-party
extensions, if there are any that make use of session_timezone without
considering HasCTZSet, but in all cases the new behavior should be saner
than before.

Back-patch to all supported branches.
2013-11-01 12:13:23 -04:00
Tom Lane
27deb0480c Prevent using strncpy with src == dest in TupleDescInitEntry.
The C and POSIX standards state that strncpy's behavior is undefined when
source and destination areas overlap.  While it remains dubious whether any
implementations really misbehave when the pointers are exactly equal, some
platforms are now starting to force the issue by complaining when an
undefined call occurs.  (In particular OS X 10.9 has been seen to dump core
here, though the exact set of circumstances needed to trigger that remain
elusive.  Similar behavior can be expected to be optional on Linux and
other platforms in the near future.)  So tweak the code to explicitly do
nothing when nothing need be done.

Back-patch to all active branches.  In HEAD, this also lets us get rid of
an exception in valgrind.supp.

Per discussion of a report from Matthias Schmitt.
2013-10-28 20:49:28 -04:00
Tom Lane
01c1b1aa25 Improve documentation about usage of FDW validator functions.
SGML documentation, as well as code comments, failed to note that an FDW's
validator will be applied to foreign-table options for foreign tables using
the FDW.

Etsuro Fujita
2013-10-28 10:30:10 -04:00
Heikki Linnakangas
bb604d03ab Plug memory leak when reloading config file.
The absolute path to config file was not pfreed. There are probably more
small leaks here and there in the config file reload code and assign hooks,
and in practice no-one reloads the config files frequently enough for it to
be a problem, but this one is trivial enough that might as well fix it.

Backpatch to 9.3 where the leak was introduced.
2013-10-24 15:28:15 +03:00
Heikki Linnakangas
b0aa17706e Fix memory leak when an empty ident file is reloaded.
Hari Babu
2013-10-24 14:05:34 +03:00
Heikki Linnakangas
80eba5981e Fix typos in comments. 2013-10-24 11:50:41 +03:00
Heikki Linnakangas
f90d7426ed Fix two bugs in setting the vm bit of empty pages.
Use a critical section when setting the all-visible flag on an empty page,
and WAL-logging it. log_newpage_buffer() contains an assertion that it
must be called inside a critical section, and it's the right thing to do
when modifying a buffer anyway.

Also, the page should be marked dirty before calling log_newpage_buffer(),
per the comment in log_newpage_buffer() and src/backend/access/transam/README.

Patch by Andres Freund, in response to my report. Backpatch to 9.2, like
the patch that introduced these bugs (a6370fd9).
2013-10-23 14:25:43 +03:00
Peter Eisentraut
627f216566 Add libpgcommon to backend gettext source files
This ought to have been done when libpgcommon was split off from
libpgport.
2013-10-21 06:20:05 -04:00
Peter Eisentraut
b7f59e6d3e Stamp 9.3.1. 2013-10-07 23:17:38 -04:00
Peter Eisentraut
9fba0cbee3 Revert "Backpatch pgxs vpath build and installation fixes."
This reverts commit f8110c5f66.

pending resolution of
http://www.postgresql.org/message-id/1381193255.25702.4.camel@vanquo.pezone.net
2013-10-07 22:32:04 -04:00
Peter Eisentraut
4b41460675 Revert "Ensure installation dirs are built before contents are installed (v2)"
This reverts commit 7f165f2587.

pending resolution of
http://www.postgresql.org/message-id/1381193255.25702.4.camel@vanquo.pezone.net
2013-10-07 22:31:31 -04:00
Heikki Linnakangas
cc736ed91f Fix bugs in SSI tuple locking.
1. In heap_hot_search_buffer(), the PredicateLockTuple() call is passed
wrong offset number. heapTuple->t_self is set to the tid of the first
tuple in the chain that's visited, not the one actually being read.

2. CheckForSerializableConflictIn() uses the tuple's t_ctid field
instead of t_self to check for exiting predicate locks on the tuple. If
the tuple was updated, but the updater rolled back, t_ctid points to the
aborted dead tuple.

Reported by Hannu Krosing. Backpatch to 9.1.
2013-10-08 00:03:24 +03:00
Peter Eisentraut
4750eae350 Translation updates 2013-10-07 16:27:04 -04:00
Kevin Grittner
ef388b65a8 Eliminate xmin from hash tag for predicate locks on heap tuples.
If a tuple was frozen while its predicate locks mattered,
read-write dependencies could be missed, resulting in failure to
detect conflicts which could lead to anomalies in committed
serializable transactions.

This field was added to the tag when we still thought that it was
necessary to carry locks forward to a new version of an updated
row.  That was later proven to be unnecessary, which allowed
simplification of the code, but elimination of xmin from the tag
was missed at the time.

Per report and analysis by Heikki Linnakangas.
Backpatch to 9.1.
2013-10-07 14:26:54 -05:00
Alvaro Herrera
3078f21431 add multixact-no-deadlock to schedule 2013-10-04 15:59:22 -03:00
Alvaro Herrera
0ac659d4a5 Make some isolationtester specs more complete
Also, make sure they pass on all transaction isolation levels.
2013-10-04 15:59:22 -03:00
Alvaro Herrera
eb8a811c5a isolationtester: Allow tuples to be returned in more places
Previously, isolationtester would forbid returning tuples in
session-specific teardown (but not global teardown), as well as in
global setup.  Allow these places to return tuples, too.
2013-10-04 15:59:21 -03:00
Andrew Dunstan
7f165f2587 Ensure installation dirs are built before contents are installed (v2)
Push dependency on installdirs down to individual targets.

Christoph Berg
2013-09-30 10:19:56 -04:00
Heikki Linnakangas
f609d07438 Fix snapshot leak if lo_open called on non-existent object.
lo_open registers the currently active snapshot, and checks if the
large object exists after that. Normally, snapshots registered by lo_open
are unregistered at end of transaction when the lo descriptor is closed, but
if we error out before the lo descriptor is added to the list of open
descriptors, it is leaked. Fix by moving the snapshot registration to after
checking if the large object exists.

Reported by Pavel Stehule. Backpatch to 8.4. The snapshot registration
system was introduced in 8.4, so prior versions are not affected (and not
supported, anyway).
2013-09-30 12:53:56 +03:00
Andrew Dunstan
f8110c5f66 Backpatch pgxs vpath build and installation fixes.
This is a backpatch of commits d942f9d9, 82b01026, and 6697aa2bc, back
to release 9.1 where we introduced extensions which make heavy use of
the PGXS infrastructure.
2013-09-29 17:28:16 -04:00
Heikki Linnakangas
5bdf02cd29 Fix spurious warning after vacuuming a page on a table with no indexes.
There is a rare race condition, when a transaction that inserted a tuple
aborts while vacuum is processing the page containing the inserted tuple.
Vacuum prunes the page first, which normally removes any dead tuples, but
if the inserting transaction aborts right after that, the loop after
pruning will see a dead tuple and remove it instead. That's OK, but if the
page is on a table with no indexes, and the page becomes completely empty
after removing the dead tuple (or tuples) on it, it will be immediately
marked as all-visible. That's OK, but the sanity check in vacuum would
throw a warning because it thinks that the page contains dead tuples and
was nevertheless marked as all-visible, even though it just vacuumed away
the dead tuples and so it doesn't actually contain any.

Spotted this while reading the code. It's difficult to hit the race
condition otherwise, but can be done by putting a breakpoint after the
heap_page_prune() call.

Backpatch all the way to 8.4, where this code first appeared.
2013-09-26 11:38:15 +03:00
Heikki Linnakangas
3393675282 Plug memory leak in range_cmp function.
B-tree operators are not allowed to leak memory into the current memory
context. Range_cmp leaked detoasted copies of the arguments. That caused
a quick out-of-memory error when creating an index on a range column.

Reported by Marian Krucina, bug #8468.
2013-09-25 16:05:24 +03:00
Alvaro Herrera
ea9a2bcea3 Fix pgindent comment breakage 2013-09-24 18:21:28 -03:00
Noah Misch
98a746c1ea Use @libdir@ in both of regress/{input,output}/security_label.source
Though @libdir@ almost always matches @abs_builddir@ in this context,
the test could only fail if they differed.  Back-patch to 9.1, where the
test was introduced.

Hamid Quddus Akhtar
2013-09-23 16:00:47 -04:00
Stephen Frost
73c4e527a4 Fix SSL deadlock risk in libpq
In libpq, we set up and pass to OpenSSL callback routines to handle
locking.  When we run out of SSL connections, we try to clean things
up by de-registering the hooks.  Unfortunately, we had a few calls
into the OpenSSL library after these hooks were de-registered during
SSL cleanup which lead to deadlocking.  This moves the thread callback
cleanup to be after all SSL-cleanup related OpenSSL library calls.
I've been unable to reproduce the deadlock with this fix.

In passing, also move the close_SSL call to be after unlocking our
ssl_config mutex when in a failure state.  While it looks pretty
unlikely to be an issue, it could have resulted in deadlocks if we
ended up in this code path due to something other than SSL_new
failing.  Thanks to Heikki for pointing this out.

Back-patch to all supported versions; note that the close_SSL issue
only goes back to 9.0, so that hunk isn't included in the 8.4 patch.

Initially found and reported by Vesa-Matti J Kari; many thanks to
both Heikki and Andres for their help running down the specific
issue and reviewing the patch.
2013-09-23 08:42:37 -04:00
Heikki Linnakangas
62ff6556ab Fix two timeline handling bugs in pg_receivexlog.
When a timeline history file is fetched from server, it is initially created
with a temporary file name, and renamed to place. However, the temporary
file name was constructed using an uninitialized buffer. Usually that meant
that the file was created in current directory instead of the target, which
usually goes unnoticed, but if the target is on a different filesystem than
the current dir, the rename() would fail. Fix that.

The second issue is that pg_receivexlog would not take .partial files into
account when determining when scanning the target directory for existing
WAL files. If the timeline has switched in the server several times in the
last WAL segment, and pg_receivexlog is restarted, it would choose a too
old starting point. That's not a problem as long as the old WAL segment
exists in the server and can be streamed over, but will cause a failure if
it's not.

Backpatch to 9.3, where this timeline handling code was written.

Analysed by Andrew Gierth, bug #8453, based on a bug report on IRC.
2013-09-23 10:40:25 +03:00
Alvaro Herrera
3451faaec8 Rename various "freeze multixact" variables
It seems to make more sense to use "cutoff multixact" terminology
throughout the backend code; "freeze" is associated with replacing of an
Xid with FrozenTransactionId, which is not what we do for MultiXactIds.

Andres Freund
Some adjustments by Álvaro Herrera
2013-09-16 15:56:11 -03:00
Noah Misch
374652fb6d Ignore interrupts during quickdie().
Once the administrator has called for an immediate shutdown or a backend
crash has triggered a reinitialization, no mere SIGINT or SIGTERM should
change that course.  Such derailment remains possible when the signal
arrives before quickdie() blocks signals.  That being a narrow race
affecting most PostgreSQL signal handlers in some way, leave it for
another patch.  Back-patch this to all supported versions.
2013-09-11 20:14:07 -04:00
Michael Meskes
1eea0ebddc Return error if allocation of new element was not possible.
Found by Coverity.
2013-09-08 13:13:03 +02:00
Michael Meskes
3560dbcaac Close file to no leak file descriptor memory. Found by Coverity. 2013-09-08 13:13:03 +02:00
Tom Lane
69876085d6 Don't fail for bad GUCs in CREATE FUNCTION with check_function_bodies off.
The previous coding attempted to activate all the GUC settings specified
in SET clauses, so that the function validator could operate in the GUC
environment expected by the function body.  However, this is problematic
when restoring a dump, since the SET clauses might refer to database
objects that don't exist yet.  We already have the parameter
check_function_bodies that's meant to prevent forward references in
function definitions from breaking dumps, so let's change CREATE FUNCTION
to not install the SET values if check_function_bodies is off.

Authors of function validators were already advised not to make any
"context sensitive" checks when check_function_bodies is off, if indeed
they're checking anything at all in that mode.  But extend the
documentation to point out the GUC issue in particular.

(Note that we still check the SET clauses to some extent; the behavior
with !check_function_bodies is now approximately equivalent to what ALTER
DATABASE/ROLE have been doing for awhile with context-dependent GUCs.)

This problem can be demonstrated in all active branches, so back-patch
all the way.
2013-09-03 18:32:23 -04:00
Alvaro Herrera
30b5b3b917 Update obsolete comment 2013-09-03 16:54:03 -04:00
Tom Lane
da645b3a73 Stamp 9.3.0. 2013-09-02 16:53:17 -04:00
Tom Lane
e3d02a10ec Update time zone data files to tzdata release 2013d.
DST law changes in Israel, Morocco, Palestine, Paraguay.
Historical corrections for Macquarie Island.
2013-09-02 15:06:30 -04:00
Peter Eisentraut
c7ef895f69 Translation updates 2013-09-02 02:28:21 -04:00
Tom Lane
3c2a425da7 Improve regression test for #8410.
The previous version of the query disregarded the result of the MergeAppend
instead of checking its results.

Andres Freund
2013-08-30 21:40:49 -04:00
Tom Lane
ce58aad2ba Add test case for bug #8410.
Per Andres Freund.
2013-08-30 19:27:47 -04:00
Tom Lane
16e8e36ceb Reset the binary heap in MergeAppend rescans.
Failing to do so can cause queries to return wrong data, error out or crash.
This requires adding a new binaryheap_reset() method to binaryheap.c,
but that probably should have been there anyway.

Per bug #8410 from Terje Elde.  Diagnosis and patch by Andres Freund.
2013-08-30 19:15:32 -04:00
Alvaro Herrera
dfed97b744 Make error wording more consistent 2013-08-29 13:11:08 -04:00
Andrew Dunstan
e536d47ab7 Unconditionally use the WSA equivalents of Socket error constants.
This change will only apply to mingw compilers, and has been found
necessary by late versions of the mingw-w64 compiler. It's the same as
what is done elsewhere for the Microsoft compilers.

Backpatch of commit 73838b5251.

Problem reported by Michael Cronenworth, although not his patch.
2013-08-26 14:58:14 -04:00
Tom Lane
a5f11e24a4 Account better for planning cost when choosing whether to use custom plans.
The previous coding in plancache.c essentially used 10% of the estimated
runtime as its cost estimate for planning.  This can be pretty bogus,
especially when the estimated runtime is very small, such as in a simple
expression plan created by plpgsql, or a simple INSERT ... VALUES.

While we don't have a really good handle on how planning time compares
to runtime, it seems reasonable to use an estimate based on the number of
relations referenced in the query, with a rather large multiplier.  This
patch uses 1000 * cpu_operator_cost * (nrelations + 1), so that even a
trivial query will be charged 1000 * cpu_operator_cost for planning.
This should address the problem reported by Marc Cousin and others that
9.2 and up prefer custom plans in cases where the planning time greatly
exceeds what can be saved.
2013-08-24 15:14:21 -04:00
Magnus Hagander
3cf89057b5 Don't crash when pg_xlog is empty and pg_basebackup -x is used
The backup will not work (without a logarchive, and that's the whole
point of -x) in this case, this patch just changes it to throw an
error instead of crashing when this happens.

Noticed and diagnosed by TAKATSUKA Haruka
2013-08-24 17:14:18 +02:00
Tom Lane
c19617d535 In locate_grouping_columns(), don't expect an exact match of Var typmods.
It's possible that inlining of SQL functions (or perhaps other changes?)
has exposed typmod information not known at parse time.  In such cases,
Vars generated by query_planner might have valid typmod values while the
original grouping columns only have typmod -1.  This isn't a semantic
problem since the behavior of grouping only depends on type not typmod,
but it breaks locate_grouping_columns' use of tlist_member to locate the
matching entry in query_planner's result tlist.

We can fix this without an excessive amount of new code or complexity by
relying on the fact that locate_grouping_columns only gets called when
make_subplanTargetList has set need_tlist_eval == false, and that can only
happen if all the grouping columns are simple Vars.  Therefore we only need
to search the sub_tlist for a matching Var, and we can reasonably define a
"match" as being a match of the Var identity fields
varno/varattno/varlevelsup.  The code still Asserts that vartype matches,
but ignores vartypmod.

Per bug #8393 from Evan Martin.  The added regression test case is
basically the same as his example.  This has been broken for a very long
time, so back-patch to all supported branches.
2013-08-23 17:30:56 -04:00
Tom Lane
e6d3f5b35e Fix hash table size estimation error in choose_hashed_distinct().
We should account for the per-group hashtable entry overhead when
considering whether to use a hash aggregate to implement DISTINCT.  The
comparable logic in choose_hashed_grouping() gets this right, but I think
I omitted it here in the mistaken belief that there would be no overhead
if there were no aggregate functions to be evaluated.  This can result in
more than 2X underestimate of the hash table size, if the tuples being
aggregated aren't very wide.  Per report from Tomas Vondra.

This bug is of long standing, but per discussion we'll only back-patch into
9.3.  Changing the estimation behavior in stable branches seems to carry too
much risk of destabilizing plan choices for already-tuned applications.
2013-08-21 13:38:38 -04:00
Tom Lane
ce52c6fe24 Stamp 9.3rc1. 2013-08-19 19:45:10 -04:00
Tom Lane
59bc4a43ec Be more wary of unwanted whitespace in pgstat_reset_remove_files().
sscanf isn't the easiest thing to use for exact pattern checks ...
also, don't use strncmp where strcmp will do.
2013-08-19 19:36:06 -04:00
Alvaro Herrera
db5b49cdd4 Fix removal of files in pgstats directories
Instead of deleting all files in stats_temp_directory and the permanent
directory on a crash, only remove those files that match the pattern of
files we actually write in them, to avoid possibly clobbering existing
unrelated contents of the temporary directory.  Per complaint from Jeff
Janes, and subsequent discussion, starting at message
CAMkU=1z9+7RsDODnT4=cDFBRBp8wYQbd_qsLcMtKEf-oFwuOdQ@mail.gmail.com

Also, fix a bug in the same routine to avoid removing files from the
permanent directory twice (instead of once from that directory and then
from the temporary directory), also per report from Jeff Janes, in
message
CAMkU=1wbk947=-pAosDMX5VC+sQw9W4ttq6RM9rXu=MjNeEQKA@mail.gmail.com
2013-08-19 17:52:52 -04:00
Heikki Linnakangas
38c69237c2 Rename the "fast_promote" file to just "promote".
This keeps the usual trigger file name unchanged from 9.2, avoiding nasty
issues if you use a pre-9.3 pg_ctl binary with a 9.3 server or vice versa.
The fallback behavior of creating a full checkpoint before starting up is now
triggered by a file called "fallback_promote". That can be useful for
debugging purposes, but we don't expect any users to have to resort to that
and we might want to remove that in the future, which is why the fallback
mechanism is undocumented.
2013-08-19 21:01:14 +03:00