Commit graph

25901 commits

Author SHA1 Message Date
Fujii Masao
a1b395b6a2 Add GUC and storage parameter to set the maximum size of GIN pending list.
Previously the maximum size of GIN pending list was controlled only by
work_mem. But the reasonable value of work_mem and the reasonable size
of the list are basically not the same, so it was not appropriate to
control both of them by only one GUC, i.e., work_mem. This commit
separates new GUC, pending_list_cleanup_size, from work_mem to allow
users to control only the size of the list.

Also this commit adds pending_list_cleanup_size as new storage parameter
to allow users to specify the size of the list per index. This is useful,
for example, when users want to increase the size of the list only for
the GIN index which can be updated heavily, and decrease it otherwise.

Reviewed by Etsuro Fujita.
2014-11-11 21:08:21 +09:00
Heikki Linnakangas
ae667f778d Really fix compilation failure on MIPS.
I missed an additional colon in previous patch. Oops. to make that mistake
less likely in the future, add comments as placeholders for unused inputs
and outputs in inline assembly.
2014-11-11 10:25:22 +02:00
Heikki Linnakangas
baf7b3a503 Fix compilation failure on MIPS.
Rémi Zara
2014-11-11 01:06:06 +02:00
Alvaro Herrera
a590f266e4 BRIN: fix bug in xlog backup block counting
The code that generates the BRIN_XLOG_UPDATE removes the buffer
reference when the page that's target for the updated tuple is freshly
initialized.  This is a pretty usual optimization, but was breaking the
case where the revmap buffer, which is referenced in the same WAL
record, is getting a backup block: the replay code was using backup
block index 1, which is not valid when the update target buffer gets
pruned; the revmap buffer gets assigned 0 instead.  Make sure to use the
correct backup block index for revmap when replaying.

Bug reported by Fujii Masao.
2014-11-10 18:13:49 -03:00
Robert Haas
c8df9477f8 Fix potential NULL-pointer dereference.
Commit 2781b4bea7 arranged to defer
the setup of after-trigger-related data structures, but
AfterTriggerPendingOnRel didn't get the memo.
2014-11-10 15:22:46 -05:00
Tom Lane
bf7ca15875 Ensure that RowExprs and whole-row Vars produce the expected column names.
At one time it wasn't terribly important what column names were associated
with the fields of a composite Datum, but since the introduction of
operations like row_to_json(), it's important that looking up the rowtype
ID embedded in the Datum returns the column names that users would expect.
That did not work terribly well before this patch: you could get the column
names of the underlying table, or column aliases from any level of the
query, depending on minor details of the plan tree.  You could even get
totally empty field names, which is disastrous for cases like row_to_json().

To fix this for whole-row Vars, look to the RTE referenced by the Var, and
make sure its column aliases are applied to the rowtype associated with
the result Datums.  This is a tad scary because we might have to return
a transient RECORD type even though the Var is declared as having some
named rowtype.  In principle it should be all right because the record
type will still be physically compatible with the named rowtype; but
I had to weaken one Assert in ExecEvalConvertRowtype, and there might be
third-party code containing similar assumptions.

Similarly, RowExprs have to be willing to override the column names coming
from a named composite result type and produce a RECORD when the column
aliases visible at the site of the RowExpr differ from the underlying
table's column names.

In passing, revert the decision made in commit 398f70ec07 to add
an alias-list argument to ExecTypeFromExprList: better to provide that
functionality in a separate function.  This also reverts most of the code
changes in d685814835, which we don't need because we're no longer
depending on the tupdesc found in the child plan node's result slot to be
blessed.

Back-patch to 9.4, but not earlier, since this solution changes the results
in some cases that users might not have realized were buggy.  We'll apply a
more restricted form of this patch in older branches.
2014-11-10 15:21:09 -05:00
Alvaro Herrera
1e0b4365c2 Further code and wording tweaks in BRIN
Besides a couple of typo fixes, per David Rowley, Thom Brown, and Amit
Langote, and mentions of BRIN in the general CREATE INDEX page again per
David, this includes silencing MSVC compiler warnings (thanks Microsoft)
and an additional variable initialization per Coverity scanner.
2014-11-10 15:56:08 -03:00
Kevin Grittner
96a73fcdac Fix compiler warning for non-assert builds.
Reported by Peter Geoghegan
David Rowley
2014-11-10 09:42:46 -06:00
Robert Haas
095d40123c Tab complete second argument to \c with role names.
Ian Barwick
2014-11-10 08:15:17 -05:00
Bruce Momjian
67067f9ae3 C comment: mention 1500-02-29 as an invalid date
It is invalid because the Gregorian calendar is used for all years.
2014-11-09 20:50:15 -05:00
Alvaro Herrera
b89ee54e20 Fix some coding issues in BRIN
Reported by David Rowley: variadic macros are a problem.  Get rid of
them using a trick suggested by Tom Lane: add extra parentheses where
needed.  In the future we might decide we don't need the calls at all
and remove them, but it seems appropriate to keep them while this code
is still new.

Also from David Rowley: brininsert() was trying to use a variable before
initializing it.  Fix by moving the brin_form_tuple call (which
initializes the variable) to within the locked section.

Reported by Peter Eisentraut: can't use "new" as a struct member name,
because C++ compilers will choke on it, as reported by cpluspluscheck.
2014-11-08 00:31:03 -03:00
Peter Eisentraut
926f5cea47 pg_basebackup: Adjust tests for long file name issues
Work around accidental test failures because the working directory path
is too long by creating a temporary directory in the (hopefully shorter)
system location, symlinking that to the working directory, and creating
the tablespaces using the shorter path.
2014-11-07 20:47:38 -05:00
Robert Haas
0b03e5951b Introduce custom path and scan providers.
This allows extension modules to define their own methods for
scanning a relation, and get the core code to use them.  It's
unclear as yet how much use this capability will find, but we
won't find out if we never commit it.

KaiGai Kohei, reviewed at various times and in various levels
of detail by Shigeru Hanada, Tom Lane, Andres Freund, Álvaro
Herrera, and myself.
2014-11-07 17:34:36 -05:00
Heikki Linnakangas
7250d8535b Fix building with WAL_DEBUG.
Now that the backup blocks are appended to the WAL record in xloginsert.c,
XLogInsert doesn't see them anymore and cannot remove them from the version
reconstructed for xlog_outdesc. This makes running with wal_debug=on more
expensive, as we now make (unnecessary) temporary copies of the backup
blocks, but it doesn't seem worth convoluting the code to keep that
optimization.

Reported by Alvaro Herrera.
2014-11-07 23:09:31 +02:00
Robert Haas
5ea86e6e65 Use the sortsupport infrastructure in more cases.
This removes some fmgr overhead from cases such as btree index builds.

Peter Geoghegan, reviewed by Andreas Karlsson and me.
2014-11-07 15:50:55 -05:00
Alvaro Herrera
0e892e04ef Fix serial schedule
Test misc depends on brin, but it was earlier in the serial schedule
file.  I didn't notice this because I only run the parallel schedule,
but the buildfarm exposed my folly ...
2014-11-07 17:08:16 -03:00
Alvaro Herrera
7516f52594 BRIN: Block Range Indexes
BRIN is a new index access method intended to accelerate scans of very
large tables, without the maintenance overhead of btrees or other
traditional indexes.  They work by maintaining "summary" data about
block ranges.  Bitmap index scans work by reading each summary tuple and
comparing them with the query quals; all pages in the range are returned
in a lossy TID bitmap if the quals are consistent with the values in the
summary tuple, otherwise not.  Normal index scans are not supported
because these indexes do not store TIDs.

As new tuples are added into the index, the summary information is
updated (if the block range in which the tuple is added is already
summarized) or not; in the latter case, a subsequent pass of VACUUM or
the brin_summarize_new_values() function will create the summary
information.

For data types with natural 1-D sort orders, the summary info consists
of the maximum and the minimum values of each indexed column within each
page range.  This type of operator class we call "Minmax", and we
supply a bunch of them for most data types with B-tree opclasses.
Since the BRIN code is generalized, other approaches are possible for
things such as arrays, geometric types, ranges, etc; even for things
such as enum types we could do something different than minmax with
better results.  In this commit I only include minmax.

Catalog version bumped due to new builtin catalog entries.

There's more that could be done here, but this is a good step forwards.

Loosely based on ideas from Simon Riggs; code mostly by Álvaro Herrera,
with contribution by Heikki Linnakangas.

Patch reviewed by: Amit Kapila, Heikki Linnakangas, Robert Haas.
Testing help from Jeff Janes, Erik Rijkers, Emanuel Calvo.

PS:
  The research leading to these results has received funding from the
  European Union's Seventh Framework Programme (FP7/2007-2013) under
  grant agreement n° 318633.
2014-11-07 16:38:14 -03:00
Heikki Linnakangas
1961b1c131 Fix generation of SP-GiST vacuum WAL records.
I broke these in 8776faa81c. Backpatch to
9.4, where that was done.
2014-11-07 21:17:46 +02:00
Heikki Linnakangas
2effb72e68 Remove obsolete cases from GiST update redo code.
The code that generated a record to clear the F_TUPLES_DELETED flag hasn't
existed since we got rid of old-style VACUUM FULL. I kept the code that sets
the flag, although it's not used for anything anymore, because it might
still be interesting information for debugging purposes that some tuples
have been deleted from a page.

Likewise, the code to turn the root page from non-leaf to leaf page was
removed when we got rid of old-style VACUUM FULL. Remove the code to replay
that action, too.
2014-11-07 15:13:02 +02:00
Tom Lane
d6e37b35cd Cope with more than 64K phrases in a thesaurus dictionary.
dict_thesaurus stored phrase IDs in uint16 fields, so it would get confused
and even crash if there were more than 64K entries in the configuration
file.  It turns out to be basically free to widen the phrase IDs to uint32,
so let's just do so.

This was complained of some time ago by David Boutin (in bug #7793);
he later submitted an informal patch but it was never acted on.
We now have another complaint (bug #11901 from Luc Ouellette) so it's
time to make something happen.

This is basically Boutin's patch, but for future-proofing I also added a
defense against too many words per phrase.  Note that we don't need any
explicit defense against overflow of the uint32 counters, since before that
happens we'd hit array allocation sizes that repalloc rejects.

Back-patch to all supported branches because of the crash risk.
2014-11-06 20:52:40 -05:00
Tom Lane
4875931938 Fix normalization of numeric values in JSONB GIN indexes.
The default JSONB GIN opclass (jsonb_ops) converts numeric data values
to strings for storage in the index.  It must ensure that numeric values
that would compare equal (such as 12 and 12.00) produce identical strings,
else index searches would have behavior different from regular JSONB
comparisons.  Unfortunately the function charged with doing this was
completely wrong: it could reduce distinct numeric values to the same
string, or reduce equivalent numeric values to different strings.  The
former type of error would only lead to search inefficiency, but the
latter type of error would cause index entries that should be found by
a search to not be found.

Repairing this bug therefore means that it will be necessary for 9.4 beta
testers to reindex GIN jsonb_ops indexes, if they care about getting
correct results from index searches involving numeric data values within
the comparison JSONB object.

Per report from Thomas Fanghaenel.
2014-11-06 11:41:06 -05:00
Fujii Masao
5332b8cec5 Prevent the unnecessary creation of .ready file for the timeline history file.
Previously .ready file was created for the timeline history file at the end
of an archive recovery even when WAL archiving was not enabled.
This creation is unnecessary and causes .ready file to remain infinitely.

This commit changes an archive recovery so that it creates .ready file for
the timeline history file only when WAL archiving is enabled.

Backpatch to all supported versions.
2014-11-06 21:24:40 +09:00
Heikki Linnakangas
2076db2aea Move the backup-block logic from XLogInsert to a new file, xloginsert.c.
xlog.c is huge, this makes it a little bit smaller, which is nice. Functions
related to putting together the WAL record are in xloginsert.c, and the
lower level stuff for managing WAL buffers and such are in xlog.c.

Also move the definition of XLogRecord to a separate header file. This
causes churn in the #includes of all the files that write WAL records, and
redo routines, but it avoids pulling in xlog.h into most places.

Reviewed by Michael Paquier, Alvaro Herrera, Andres Freund and Amit Kapila.
2014-11-06 13:55:36 +02:00
Fujii Masao
d2b8a2c7ec Fix typo in comment.
Etsuro Fujita
2014-11-06 20:04:11 +09:00
Fujii Masao
08309aaf74 Implement IF NOT EXIST for CREATE INDEX.
Fabrízio de Royes Mello, reviewed by Marti Raudsepp, Adam Brightwell and me.
2014-11-06 18:48:33 +09:00
Bruce Momjian
171c377a0a C comment: mention why the Gregorian calendar is used pre-1582 2014-11-06 02:33:05 -05:00
Tom Lane
525a489915 Remove the last vestige of server-side autocommit.
Long ago we briefly had an "autocommit" GUC that turned server-side
autocommit on and off.  That behavior was removed in 7.4 after concluding
that it broke far too much client-side logic, and making clients cope with
both behaviors was impractical.  But the GUC variable was left behind, so
as not to break any client code that might be trying to read its value.
Enough time has now passed that we should remove the GUC completely.
Whatever vestigial backwards-compatibility benefit it had is outweighed by
the risk of confusion for newbies who assume it ought to do something,
as per a recent complaint from Wolfgang Wilhelm.

In passing, adjust what seemed to me a rather confusing documentation
reference to libpq's autocommit behavior.  libpq as such knows nothing
about autocommit, so psql is probably what was meant.
2014-11-05 19:35:23 -05:00
Robert Haas
c30be9787b Fix thinko in commit 2bd9e412f9.
Obviously, every translation unit should not be declaring this
separately.  It needs to be PGDLLIMPORT as well, to avoid breaking
third-party code that uses any of the functions that the commit
mentioned above changed to macros.
2014-11-05 17:12:23 -05:00
Tom Lane
465d7e1882 Make CREATE TYPE print warnings if a datatype's I/O functions are volatile.
This is a followup to commit 43ac12c6e6,
which added regression tests checking that I/O functions of built-in
types are not marked volatile.  Complaining in CREATE TYPE should push
developers of add-on types to fix any misdeclared functions in their
types.  It's just a warning not an error, to avoid creating upgrade
problems for what might be just cosmetic mis-markings.

Aside from adding the warning code, fix a number of types that were
sloppily created in the regression tests.
2014-11-05 11:44:06 -05:00
Tom Lane
33f80f8480 Drop no-longer-needed buffers during ALTER DATABASE SET TABLESPACE.
The previous coding assumed that we could just let buffers for the
database's old tablespace age out of the buffer arena naturally.
The folly of that is exposed by bug #11867 from Marc Munro: the user could
later move the database back to its original tablespace, after which any
still-surviving buffers would match lookups again and appear to contain
valid data.  But they'd be missing any changes applied while the database
was in the new tablespace.

This has been broken since ALTER SET TABLESPACE was introduced, so
back-patch to all supported branches.
2014-11-04 13:24:06 -05:00
Heikki Linnakangas
5028f22f6e Switch to CRC-32C in WAL and other places.
The old algorithm was found to not be the usual CRC-32 algorithm, used by
Ethernet et al. We were using a non-reflected lookup table with code meant
for a reflected lookup table. That's a strange combination that AFAICS does
not correspond to any bit-wise CRC calculation, which makes it difficult to
reason about its properties. Although it has worked well in practice, seems
safer to use a well-known algorithm.

Since we're changing the algorithm anyway, we might as well choose a
different polynomial. The Castagnoli polynomial has better error-correcting
properties than the traditional CRC-32 polynomial, even if we had
implemented it correctly. Another reason for picking that is that some new
CPUs have hardware support for calculating CRC-32C, but not CRC-32, let
alone our strange variant of it. This patch doesn't add any support for such
hardware, but a future patch could now do that.

The old algorithm is kept around for tsquery and pg_trgm, which use the
values in indexes that need to remain compatible so that pg_upgrade works.
While we're at it, share the old lookup table for CRC-32 calculation
between hstore, ltree and core. They all use the same table, so might as
well.
2014-11-04 11:39:48 +02:00
Heikki Linnakangas
404bc51cde Remove support for 64-bit CRC.
It hasn't been used for anything for a long time.
2014-11-04 11:33:08 +02:00
Robert Haas
585e0b9b27 pqmq.h needs to include something that defines StringInfo.
Reported by Peter Eisentraut.
2014-11-03 12:24:42 -05:00
Noah Misch
c40212baf6 Clarify .def file comments. 2014-11-02 21:43:33 -05:00
Noah Misch
00c07e497f Re-remove dependency on the DLL of pythonxx.def file.
The reasons behind commit 0d147e43ad still
stand, so this reverts the non-cosmetic portion of commit
a7983e989d.  Back-patch to 9.4, where the
latter commit first appeared.
2014-11-02 21:43:30 -05:00
Noah Misch
67a4120494 Make ECPG test programs depend on "ecpg$(X)", not "ecpg".
Cygwin builds require this of dependencies pertaining to pattern rules.
On Cygwin, stat("foo") in the absence of a file with that exact name can
locate foo.exe.  While GNU make uses stat() for dependencies of ordinary
rules, it uses readdir() to assess dependencies of pattern rules.
Therefore, a pattern rule dependency should match any underlying file
name exactly.  Back-patch to 9.4, where the dependency was introduced.
2014-11-02 21:43:25 -05:00
Noah Misch
8463195217 Fix win32setlocale.c const-related warnings.
Back-patch to 9.2, like commit db29620d4d.
2014-11-02 21:43:20 -05:00
Peter Eisentraut
a409b464f9 Add configure --enable-tap-tests option
Don't skip the TAP tests anymore when IPC::Run is not found.  This will
fail normally now.
2014-11-02 09:17:26 -05:00
Robert Haas
2bd9e412f9 Support frontend-backend protocol communication using a shm_mq.
A background worker can use pq_redirect_to_shm_mq() to direct protocol
that would normally be sent to the frontend to a shm_mq so that another
process may read them.

The receiving process may use pq_parse_errornotice() to parse an
ErrorResponse or NoticeResponse from the background worker and, if
it wishes, ThrowErrorData() to propagate the error (with or without
further modification).

Patch by me.  Review by Andres Freund.
2014-10-31 12:02:40 -04:00
Robert Haas
f7102b0463 Extend dsm API with a new function dsm_unpin_mapping.
This reassociates a dynamic shared memory handle previous passed to
dsm_pin_mapping with the current resource owner, so that it will be
cleaned up at the end of the current query.

Patch by me.  Review of the function name by Andres Freund, Amit
Kapila, Jim Nasby, Petr Jelinek, and Álvaro Herrera.
2014-10-30 14:55:23 -04:00
Tom Lane
fd0f651a86 Test IsInTransactionChain, not IsTransactionBlock, in vac_update_relstats.
As noted by Noah Misch, my initial cut at fixing bug #11638 didn't cover
all cases where ANALYZE might be invoked in an unsafe context.  We need to
test the result of IsInTransactionChain not IsTransactionBlock; which is
notationally a pain because IsInTransactionChain requires an isTopLevel
flag, which would have to be passed down through several levels of callers.
I chose to pass in_outer_xact (ie, the result of IsInTransactionChain)
rather than isTopLevel per se, as that seemed marginally more apropos
for the intermediate functions to know about.
2014-10-30 13:04:06 -04:00
Robert Haas
6057c212f3 "Pin", rather than "keep", dynamic shared memory mappings and segments.
Nobody seemed concerned about this naming when it originally went in,
but there's a pending patch that implements the opposite of
dsm_keep_mapping, and the term "unkeep" was judged unpalatable.
"unpin" has existing precedent in the PostgreSQL code base, and the
English language, so use this terminology instead.

Per discussion, back-patch to 9.4.
2014-10-30 11:35:55 -04:00
Peter Eisentraut
7912f9b7dc Remove use of TAP subtests
They turned out to be too much of a portability headache, because they
need a fairly new version of Test::More to work properly.
2014-10-29 19:41:19 -04:00
Tom Lane
e0722d9cb5 Avoid corrupting tables when ANALYZE inside a transaction is rolled back.
VACUUM and ANALYZE update the target table's pg_class row in-place, that is
nontransactionally.  This is OK, more or less, for the statistical columns,
which are mostly nontransactional anyhow.  It's not so OK for the DDL hint
flags (relhasindex etc), which might get changed in response to
transactional changes that could still be rolled back.  This isn't a
problem for VACUUM, since it can't be run inside a transaction block nor
in parallel with DDL on the table.  However, we allow ANALYZE inside a
transaction block, so if the transaction had earlier removed the last
index, rule, or trigger from the table, and then we roll back the
transaction after ANALYZE, the table would be left in a corrupted state
with the hint flags not set though they should be.

To fix, suppress the hint-flag updates if we are InTransactionBlock().
This is safe enough because it's always OK to postpone hint maintenance
some more; the worst-case consequence is a few extra searches of pg_index
et al.  There was discussion of instead using a transactional update,
but that would change the behavior in ways that are not all desirable:
in most scenarios we're better off keeping ANALYZE's statistical values
even if the ANALYZE itself rolls back.  In any case we probably don't want
to change this behavior in back branches.

Per bug #11638 from Casey Shobe.  This has been broken for a good long
time, so back-patch to all supported branches.

Tom Lane and Michael Paquier, initial diagnosis by Andres Freund
2014-10-29 18:12:02 -04:00
Robert Haas
6cb4afff33 Avoid setup work for invalidation messages at start-of-(sub)xact.
Instead of initializing a new TransInvalidationInfo for every
transaction or subtransaction, we can just do it for those
transactions or subtransactions that actually need to queue
invalidation messages.  That also avoids needing to free those
entries at the end of a transaction or subtransaction that does
not generate any invalidation messages, which is by far the
common case.

Patch by me.  Review by Simon Riggs and Andres Freund.
2014-10-29 12:35:19 -04:00
Heikki Linnakangas
8f8314b560 Reset error message at PQreset()
If you call PQreset() repeatedly, and the connection cannot be
re-established, the error messages from the failed connection attempts
kept accumulating in the error string.

Fixes bug #11455 reported by Caleb Epstein. Backpatch to all supported
versions.
2014-10-29 14:34:43 +02:00
Tom Lane
a00d468e65 Remove obsolete commentary.
Since we got rid of non-MVCC catalog scans, the fourth reason given for
using a non-transactional update in index_update_stats() is obsolete.
The other three are still good, so we're not going to change the code,
but fix the comment.
2014-10-28 18:36:02 -04:00
Heikki Linnakangas
18f158ef69 Remove unnecessary assignment.
Reported by MauMau.
2014-10-28 20:26:20 +02:00
Noah Misch
c0e190365b MinGW: Include .dll extension in .def file LIBRARY commands.
Newer toolchains append the extension implicitly if missing, but
buildfarm member narwhal (gcc 3.4.2, ld 2.15.91 20040904) does not.
This affects most core libraries having an exports.txt file, namely
libpq and the ECPG support libraries.  On Windows Server 2003, Windows
API functions that load and unload DLLs internally will mistakenly
unload a libpq whose DLL header reports "LIBPQ" instead of "LIBPQ.dll".
When, subsequently, control would return to libpq, the backend crashes.
Back-patch to 9.4, like commit 846e91e022.
Before that commit, we used a different linking technique that yielded
"libpq.dll" in the DLL header.

Commit 53566fc094 worked around this by
eliminating a call to a function that loads and unloads DLLs internally.
That commit is no longer necessary for correctness, but its improving
consistency with the MSVC build remains valid.
2014-10-27 19:59:39 -04:00
Heikki Linnakangas
22926e00f7 Fix two bugs in tsquery @> operator.
1. The comparison for matching terms used only the CRC to decide if there's
a match. Two different terms with the same CRC gave a match.

2. It assumed that if the second operand has more terms than the first, it's
never a match. That assumption is bogus, because there can be duplicate
terms in either operand.

Rewrite the implementation in a way that doesn't have those bugs.

Backpatch to all supported versions.
2014-10-27 10:50:41 +02:00