plpgsql likes to cache query plans and simple-expression execution state
trees across calls. This is a considerable win for multiple executions
of the same function. However, it's useless for DO blocks, since by
definition those are executed only once and discarded. Nonetheless,
we were allowing a DO block's expression execution trees to survive
until end of transaction, resulting in a significant intra-transaction
memory leak, as reported by Yeb Havinga. Worse, if the DO block exited
with an error, the compiled form of the block's code was leaked till
end of session --- along with subsidiary plancache entries.
To fix, make DO blocks keep their expression execution trees in a private
EState that's deleted at exit from the block, and add a PG_TRY block
to plpgsql_inline_handler to make sure that memory cleanup happens
even on error exits. Also add a regression test covering error handling
in a DO block, because my first try at this broke that. (The test is
not meant to prove that we don't leak memory anymore, though it could
be used for that with a much larger loop count.)
Ideally we'd back-patch this into all versions supporting DO blocks;
but the patch needs to add a field to struct PLpgSQL_execstate, and that
would break ABI compatibility for third-party plugins such as the plpgsql
debugger. Given the small number of complaints so far, fixing this in
HEAD only seems like an acceptable choice.
This option provides more detailed error messages when STRICT is used
and the number of rows returned is not one.
Marko Tiikkaja, reviewed by Ian Lawrence Barwick
As pointed out by Tom Lane, we can allow other users of the error
handler callbacks to provide their own memory context by adding
the context to use to ErrorData and using that instead of explicitly
using ErrorContext.
This then allows GetErrorContextStack() to be called from inside
exception handlers, so modify plpgsql to take advantage of that and
add an associated regression test for it.
plpgsql often just remembers SPI-result tuple tables in local variables,
and has no mechanism for freeing them if an ereport(ERROR) causes an escape
out of the execution function whose local variable it is. In the original
coding, that wasn't a problem because the tuple table would be cleaned up
when the function's SPI context went away during transaction abort.
However, once plpgsql grew the ability to trap exceptions, repeated
trapping of errors within a function could result in significant
intra-function-call memory leakage, as illustrated in bug #8279 from
Chad Wagner.
We could fix this locally in plpgsql with a bunch of PG_TRY/PG_CATCH
coding, but that would be tedious, probably slow, and prone to bugs of
omission; moreover it would do nothing for similar risks elsewhere.
What seems like a better plan is to make SPI itself responsible for
freeing tuple tables at subtransaction abort. This patch attacks the
problem that way, keeping a list of live tuple tables within each SPI
function context. Currently, such freeing is automatic for tuple tables
made within the failed subtransaction. We might later add a SPI call to
mark a tuple table as not to be freed this way, allowing callers to opt
out; but until someone exhibits a clear use-case for such behavior, it
doesn't seem worth bothering.
A very useful side-effect of this change is that SPI_freetuptable() can
now defend itself against bad calls, such as duplicate free requests;
this should make things more robust in many places. (In particular,
this reduces the risks involved if a third-party extension contains
now-redundant SPI_freetuptable() calls in error cleanup code.)
Even though the leakage problem is of long standing, it seems imprudent
to back-patch this into stable branches, since it does represent an API
semantics change for SPI users. We'll patch this in 9.3, but live with
the leakage in older branches.
This adds the ability to get the call stack as a string from within a
PL/PgSQL function, which can be handy for logging to a table, or to
include in a useful message to an end-user.
Pavel Stehule, reviewed by Rushabh Lathia and rather heavily whacked
around by Stephen Frost.
Specifically, permit attaching them to the error in RAISE and retrieving
them from a caught error in GET STACKED DIAGNOSTICS. RAISE enforces
nothing about the content of the fields; for its purposes, they are just
additional string fields. Consequently, clarify in the protocol and
libpq documentation that the usual relationships between error fields,
like a schema name appearing wherever a table name appears, are not
universal. This freedom has other applications; consider a FDW
propagating an error from an RDBMS having no schema support.
Back-patch to 9.3, where core support for the error fields was
introduced. This prevents the confusion of having a release where libpq
exposes the fields and PL/pgSQL does not.
Pavel Stehule, lexical revisions by Noah Misch.
A materialized view has a rule just like a view and a heap and
other physical properties like a table. The rule is only used to
populate the table, references in queries refer to the
materialized data.
This is a minimal implementation, but should still be useful in
many cases. Currently data is only populated "on demand" by the
CREATE MATERIALIZED VIEW and REFRESH MATERIALIZED VIEW statements.
It is expected that future releases will add incremental updates
with various timings, and that a more refined concept of defining
what is "fresh" data will be developed. At some point it may even
be possible to have queries use a materialized in place of
references to underlying tables, but that requires the other
above-mentioned features to be working first.
Much of the documentation work by Robert Haas.
Review by Noah Misch, Thom Brown, Robert Haas, Marko Tiikkaja
Security review by KaiGai Kohei, with a decision on how best to
implement sepgsql still pending.
Currently it's only possible for loadable modules to get control during
post-commit cleanup of a transaction. That doesn't work too well if they
want to do something that could throw an error; for example, an FDW might
need to issue a remote commit, which could well fail. To improve matters,
extend the existing APIs for XactCallback and SubXactCallback functions
to provide new pre-commit events for this purpose.
The release notes will need to mention that existing callback functions
should be checked to make sure they don't do something unwanted when one
of the new event types occurs. In the examples within our source tree,
contrib/sepgsql was fine but plpgsql had been a bit too cute.
exec_simple_check_plan and exec_eval_simple_expr attempted to call
GetCachedPlan directly. This meant that if an error was thrown during
planning, the resulting context traceback would not include the line
normally contributed by _SPI_error_callback. This is already inconsistent,
but just to be really odd, a re-execution of the very same expression
*would* show the additional context line, because we'd already have cached
the plan and marked the expression as non-simple.
The problem is easy to demonstrate in 9.2 and HEAD because planning of a
cached plan doesn't occur at all until GetCachedPlan is done. In earlier
versions, it could only be an issue if initial planning had succeeded, then
a replan was forced (already somewhat improbable for a simple expression),
and the replan attempt failed. Since the issue is mainly cosmetic in older
branches anyway, it doesn't seem worth the risk of trying to fix it there.
It is worth fixing in 9.2 since the instability of the context printout can
affect the results of GET STACKED DIAGNOSTICS, as per a recent discussion
on pgsql-novice.
To fix, introduce a SPI function that wraps GetCachedPlan while installing
the correct callback function. Use this instead of calling GetCachedPlan
directly from plpgsql.
Also introduce a wrapper function for extracting a SPI plan's
CachedPlanSource list. This lets us stop including spi_priv.h in
pl_exec.c, which was never a very good idea from a modularity standpoint.
In passing, fix a similar inconsistency that could occur in SPI_cursor_open,
which was also calling GetCachedPlan without setting up a context callback.
For some reason lost in the mists of prehistory, RETURN was only coded to
allow a simple reference to a composite variable when the function's return
type is composite. Allow an expression instead, while preserving the
efficiency of the original code path in the case where the expression is
indeed just a composite variable's name. Likewise for RETURN NEXT.
As is true in various other places, the supplied expression must yield
exactly the number and data types of the required columns. There was some
discussion of relaxing that for pl/pgsql, but no consensus yet, so this
patch doesn't address that.
Asif Rehman, reviewed by Pavel Stehule
Numerous flex and bison make rules have appeared in the source tree
over time, and they are all virtually identical, so we can replace
them by pattern rules with some variables for customization.
Users of pgxs will also be able to benefit from this.
This makes the naming inside plpgsql consistent and distinguishes the
file from the backend's gram.y file. It will also allow easier
refactoring of the bison make rules later on.
There were assorted places where unreserved keywords were not treated the
same as T_WORD (that is, a random unrecognized identifier). Fix them.
It might not always be possible to allow this, but it is in all these
places, so I don't see any downside.
Per gripe from Jim Wilson. Arguably this is a bug fix, but given the lack
of other complaints and the ease of working around it (just quote the
word), I won't risk back-patching.
This reduces unnecessary exposure of other headers through htup.h, which
is very widely included by many files.
I have chosen to move the function prototypes to the new file as well,
because that means htup.h no longer needs to include tupdesc.h. In
itself this doesn't have much effect in indirect inclusion of tupdesc.h
throughout the tree, because it's also required by execnodes.h; but it's
something to explore in the future, and it seemed best to do the htup.h
change now while I'm busy with it.
Antique versions of gcc complain about vars that are initialized outside
PG_TRY and then modified within it. Rather than marking the var volatile,
expend one more line of code.
Commit 3a0e4d36eb arranged to
reference stack-allocated variables after they were out of scope.
That's no good, so let's arrange to not do that after all.
Commit 3855968f32 added syntax, pg_dump,
psql support, and documentation, but the triggers didn't actually fire.
With this commit, they now do. This is still a pretty basic facility
overall because event triggers do not get a whole lot of information
about what the user is trying to do unless you write them in C; and
there's still no option to fire them anywhere except at the very
beginning of the execution sequence, but it's better than nothing,
and a good building block for future work.
Along the way, add a regression test for ALTER LARGE OBJECT, since
testing of event triggers reveals that we haven't got one.
Dimitri Fontaine and Robert Haas
The Solaris Studio compiler warns about these instances, unlike more
mainstream compilers such as gcc. But manual inspection showed that
the code is clearly not reachable, and we hope no worthy compiler will
complain about removing this code.
The header file is needed by any module that wants to use the PL/pgSQL
instrumentation plugin interface. Most notably, the pldebugger plugin needs
this. With this patch, it can be built using pgxs, without having the full
server source tree available.
The parser got confused if a cursor parameter had the same name as
a plpgsql variable. Reported and diagnosed by Yeb Havinga, though
this isn't exactly his proposed fix.
Also, some mostly-but-not-entirely-cosmetic adjustments to the original
named-cursor-parameter patch, for code readability and better error
diagnostics.
An incorrect and entirely unnecessary "safety check" in exec_stmt_getdiag()
caused the code to treat an assignment to a variable with dno zero as a
no-op. Unfortunately, that's a perfectly valid dno. This has been broken
since GET DIAGNOSTICS was invented. It's not terribly surprising that the
bug went unnoticed for so long, since in most cases you probably wouldn't
use the function's first-created variable (normally its first parameter)
as a GET DIAGNOSTICS target. Nonetheless, it's broken. Per bug #6551
from Adam Buraczewski.
Making this operation look like a utility statement seems generally a good
idea, and particularly so in light of the desire to provide command
triggers for utility statements. The original choice of representing it as
SELECT with an IntoClause appendage had metastasized into rather a lot of
places, unfortunately, so that this patch is a great deal more complicated
than one might at first expect.
In particular, keeping EXPLAIN working for SELECT INTO and CREATE TABLE AS
subcommands required restructuring some EXPLAIN-related APIs. Add-on code
that calls ExplainOnePlan or ExplainOneUtility, or uses
ExplainOneQuery_hook, will need adjustment.
Also, the cases PREPARE ... SELECT INTO and CREATE RULE ... SELECT INTO,
which formerly were accepted though undocumented, are no longer accepted.
The PREPARE case can be replaced with use of CREATE TABLE AS EXECUTE.
The CREATE RULE case doesn't seem to have much real-world use (since the
rule would work only once before failing with "table already exists"),
so we'll not bother with that one.
Both SELECT INTO and CREATE TABLE AS still return a command tag of
"SELECT nnnn". There was some discussion of returning "CREATE TABLE nnnn",
but for the moment backwards compatibility wins the day.
Andres Freund and Tom Lane
Datatype I/O functions are allowed to leak memory in CurrentMemoryContext,
since they are generally called in short-lived contexts. However, plpgsql
calls such functions for purposes of type conversion, and was calling them
in its procedure context. Therefore, any leaked memory would not be
recovered until the end of the plpgsql function. If such a conversion
was done within a loop, quite a bit of memory could get consumed. Fix by
calling such functions in the transient "eval_econtext", and adjust other
logic to match. Back-patch to all supported versions.
Andres Freund, Jan Urbański, Tom Lane
Don't quote the output of format_procedure(); it's already quoted quite
enough. Remove the fn_name field, which was now just dead weight. Fix
remaining expected-output files.
In the previous coding, callers were faced with an awkward choice:
look up the name, do permissions checks, and then lock the table; or
look up the name, lock the table, and then do permissions checks.
The first choice was wrong because the results of the name lookup
and permissions checks might be out-of-date by the time the table
lock was acquired, while the second allowed a user with no privileges
to interfere with access to a table by users who do have privileges
(e.g. if a malicious backend queues up for an AccessExclusiveLock on
a table on which AccessShareLock is already held, further attempts
to access the table will be blocked until the AccessExclusiveLock
is obtained and the malicious backend's transaction rolls back).
To fix, allow callers of RangeVarGetRelid() to pass a callback which
gets executed after performing the name lookup but before acquiring
the relation lock. If the name lookup is retried (because
invalidation messages are received), the callback will be re-executed
as well, so we get the best of both worlds. RangeVarGetRelid() is
renamed to RangeVarGetRelidExtended(); callers not wishing to supply
a callback can continue to invoke it as RangeVarGetRelid(), which is
now a macro. Since the only one caller that uses nowait = true now
passes a callback anyway, the RangeVarGetRelid() macro defaults nowait
as well. The callback can also be used for supplemental locking - for
example, REINDEX INDEX needs to acquire the table lock before the index
lock to reduce deadlock possibilities.
There's a lot more work to be done here to fix all the cases where this
can be a problem, but this commit provides the general infrastructure
and fixes the following specific cases: REINDEX INDEX, REINDEX TABLE,
LOCK TABLE, and and DROP TABLE/INDEX/SEQUENCE/VIEW/FOREIGN TABLE.
Per discussion with Noah Misch and Alvaro Herrera.
The original coding was
var->value = (Datum) state;
which is bogus, and then in commit 2f0f7b4bce
it was "corrected" to
var->value = PointerGetDatum(state);
which is a faithful translation but still wrong.
This seems purely cosmetic, though, so no need for a back-patch.
Pavel Stehule
This seems to have been just an oversight in previous foreign-table work.
A quick grep didn't turn up any other places where RELKIND_FOREIGN_TABLE
was obviously omitted.
One change noted by Alexander Soudakov, the other by me.
Back-patch to 9.1.
The original implementation of ELSIF in plpgsql converted the construct
into nested simple IF statements. This was prone to stack overflow with
long ELSIF lists, in two different ways. First, it's difficult to generate
the parsetree without using right-recursion in the bison grammar, and
that's prone to parser stack overflow since nothing can be reduced until
the whole list has been read. Second, we'd recurse during execution, thus
creating an unnecessary risk of execution-time stack overflow. Rewrite
so that the ELSIF list is represented as a flat list, scanned via iteration
not recursion, and generated through left-recursion in the grammar.
Per a gripe from Håvard Kongsgård.
plpgsql's exec_stmt_execsql was Assert'ing that a CachedPlanSource was
is_valid immediately after exec_prepare_plan. The risk factor in this case
is that after building the prepared statement, exec_prepare_plan calls
exec_simple_check_plan, which might try to generate a generic plan --- and
with CLOBBER_CACHE_ALWAYS or other unusual causes of invalidation, that
could result in an invalidation. However, that path could only be taken
for a SELECT query, for which we need not set mod_stmt. So in this case
I think it's best to just remove the Assert; it's okay to look at a
slightly-stale querytree for what we need here. Per buildfarm testing.
Now that a NULL ParamListInfo pointer causes significantly different
behavior in plancache.c, be sure to pass it that way when the expression
is known not to reference any plpgsql variables. Saves a few setup
cycles anyway.
Rewrite plancache.c so that a "cached plan" (which is rather a misnomer
at this point) can support generation of custom, parameter-value-dependent
plans, and can make an intelligent choice between using custom plans and
the traditional generic-plan approach. The specific choice algorithm
implemented here can probably be improved in future, but this commit is
all about getting the mechanism in place, not the policy.
In addition, restructure the API to greatly reduce the amount of extraneous
data copying needed. The main compromise needed to make that possible was
to split the initial creation of a CachedPlanSource into two steps. It's
worth noting in particular that SPI_saveplan is now deprecated in favor of
SPI_keepplan, which accomplishes the same end result with zero data
copying, and no need to then spend even more cycles throwing away the
original SPIPlan. The risk of long-term memory leaks while manipulating
SPIPlans has also been greatly reduced. Most of this improvement is based
on use of the recently-added MemoryContextSetParent primitive.
walsender.h should depend on xlog.h, not vice versa. (Actually, the
inclusion was circular until a couple hours ago, which was even sillier;
but Bruce broke it in the expedient rather than logically correct
direction.) Because of that poor decision, plus blind application of
pgrminclude, we had a situation where half the system was depending on
xlog.h to include such unrelated stuff as array.h and guc.h. Clean up
the header inclusion, and manually revert a lot of what pgrminclude had
done so things build again.
This episode reinforces my feeling that pgrminclude should not be run
without adult supervision. Inclusion changes in header files in particular
need to be reviewed with great care. More generally, it'd be good if we
had a clearer notion of module layering to dictate which headers can sanely
include which others ... but that's a big task for another day.
This is more SQL-spec-compliant, more easily extensible, and better
performing than the old method of inventing special variables.
Pavel Stehule, reviewed by Shigeru Hanada and David Wheeler
There may be some other places where we should use errdetail_internal,
but they'll have to be evaluated case-by-case. This commit just hits
a bunch of places where invoking gettext is obviously a waste of cycles.
In the previous coding, we would look up a relation in RangeVarGetRelid,
lock the resulting OID, and then AcceptInvalidationMessages(). While
this was sufficient to ensure that we noticed any changes to the
relation definition before building the relcache entry, it didn't
handle the possibility that the name we looked up no longer referenced
the same OID. This was particularly problematic in the case where a
table had been dropped and recreated: we'd latch on to the entry for
the old relation and fail later on. Now, we acquire the relation lock
inside RangeVarGetRelid, and retry the name lookup if we notice that
invalidation messages have been processed meanwhile. Many operations
that would previously have failed with an error in the presence of
concurrent DDL will now succeed.
There is a good deal of work remaining to be done here: many callers
of RangeVarGetRelid still pass NoLock for one reason or another. In
addition, nothing in this patch guards against the possibility that
the meaning of an unqualified name might change due to the creation
of a relation in a schema earlier in the user's search path than the
one where it was previously found. Furthermore, there's nothing at
all here to guard against similar race conditions for non-relations.
For all that, it's a start.
Noah Misch and Robert Haas
The --flag argument can be used to tell xgettext the arguments of
which functions should be flagged with c-format in the PO files,
instead of guessing based on the presence of format specifiers, which
fails if no format specifiers are present but the translation
accidentally introduces one.
Appropriate flag settings have been added for each message catalog.
based on a patch by Christoph Berg for bug #6066
Put gettext trigger words that are common to the backend and backend
modules into a makefile variable to include everywhere, to avoid
error-prone repetitions.
It currently doesn't make a difference, but it's inconsistent with
most other usage, and it might interfere with a future patch, so I'll
change it all in a separate commit.
Also, replace tabs with spaces for alignment.
The core CREATE FUNCTION code only enforces that IN parameter names are
non-duplicate, and that OUT parameter names are separately non-duplicate.
This is because some function languages might not have any confusion
between the two. But in plpgsql, such names are all in the same namespace,
so we'd better disallow it.
Per a recent complaint from Dan S. Not back-patching since this is a small
issue and the change could cause unexpected failures if we started to
enforce it in a minor release.
Historically we didn't do this, even though we had the information, because
plpgsql passed its Params via SPI APIs that only include type OIDs not
typmods. Now that plpgsql uses parser callbacks to create Params, it's
easy to insert the right typmod. This should generally result in lower
surprise factors, because a plpgsql variable that is declared with a typmod
will now work more like a table column with the same typmod. In particular
it's the "right" way to fix bug #6020, in which plpgsql's attempt to return
an anonymous record type is defeated by stricter record-type matching
checks that were added in 9.0. However, it's not impossible that this
could result in subtle behavioral changes that could break somebody's
existing plpgsql code, so I'm afraid to back-patch this change into
released branches. In those branches we'll have to lobotomize the
record-type checks instead.
install-sh can install multiple files at once, so for loops are not
necessary. This was already changed for the rest of the code some
time ago, but pgxs.mk was apparently forgotten, and the obsolete
coding style has now been copied to the PLs as well.
This also fixes the problem that the for loops in question did not
catch errors.
This allows the usual rules for assigning a collation to a local variable
to be overridden. Per discussion, it seems appropriate to support this
rather than forcing all local variables to have the argument-derived
collation.
Since collation is effectively an argument, not a property of the function,
FmgrInfo is really the wrong place for it; and this becomes critical in
cases where a cached FmgrInfo is used for varying purposes that might need
different collation settings. Fix by passing it in FunctionCallInfoData
instead. In particular this allows a clean fix for bug #5970 (record_cmp
not working). This requires touching a bit more code than the original
method, but nobody ever thought that collations would not be an invasive
patch...
This warning is new in gcc 4.6 and part of -Wall. This patch cleans
up most of the noise, but there are some still warnings that are
trickier to remove.
The previous functions of assign hooks are now split between check hooks
and assign hooks, where the former can fail but the latter shouldn't.
Aside from being conceptually clearer, this approach exposes the
"canonicalized" form of the variable value to guc.c without having to do
an actual assignment. And that lets us fix the problem recently noted by
Bernd Helmle that the auto-tune patch for wal_buffers resulted in bogus
log messages about "parameter "wal_buffers" cannot be changed without
restarting the server". There may be some speed advantage too, because
this design lets hook functions avoid re-parsing variable values when
restoring a previous state after a rollback (they can store a pre-parsed
representation of the value instead). This patch also resolves a
longstanding annoyance about custom error messages from variable assign
hooks: they should modify, not appear separately from, guc.c's own message
about "invalid parameter value".
This fixes the gripe I made a few months ago about DO blocks getting
slower with repeated use. At least, it fixes it for the case where
the DO block isn't aborted by an error. We could try running
plpgsql_free_function_memory() even during error exit, but that seems
a bit scary since it makes a lot of presumptions about the data
structures being in good shape. It's probably reasonable to assume
that repeated failures of DO blocks isn't a performance-critical case.
Make plpgsql treat the input collation as a polymorphism variable, so
that we cache separate plans for each input collation that's used in a
particular session, as per recent discussion. Propagate the input
collation to all collatable input parameters.
I chose to also propagate the input collation to all declared variables of
collatable types, which is a bit more debatable but seems to be necessary
for non-astonishing behavior. (Copying a parameter into a separate local
variable shouldn't result in a change of behavior, for example.) There is
enough infrastructure here to support declaring a collation for each local
variable to override that default, but I thought we should wait to see what
the field demand is before adding such a feature.
In passing, remove exec_get_rec_fieldtype(), which wasn't used anywhere.
Documentation patch to follow.
All expression nodes now have an explicit output-collation field, unless
they are known to only return a noncollatable data type (such as boolean
or record). Also, nodes that can invoke collation-aware functions store
a separate field that is the collation value to pass to the function.
This avoids confusion that arises when a function has collatable inputs
and noncollatable output type, or vice versa.
Also, replace the parser's on-the-fly collation assignment method with
a post-pass over the completed expression tree. This allows us to use
a more complex (and hopefully more nearly spec-compliant) assignment
rule without paying for it in extra storage in every expression node.
Fix assorted bugs in the planner's handling of collations by making
collation one of the defining properties of an EquivalenceClass and
by converting CollateExprs into discardable RelabelType nodes during
expression preprocessing.
CollateClause is now used only in raw grammar output, and CollateExpr after
parse analysis. This is for clarity and to avoid carrying collation names
in post-analysis parse trees: that's both wasteful and possibly misleading,
since the collation's name could be changed while the parsetree still
exists.
Also, clean up assorted infelicities and omissions in processing of the
node type.
The initial collations patch treated a COLLATE spec as part of a TypeName,
following what can only be described as brain fade on the part of the SQL
committee. It's a lot more reasonable to treat COLLATE as a syntactically
separate object, so that it can be added in only the productions where it
actually belongs, rather than needing to reject it in a boatload of places
where it doesn't belong (something the original patch mostly failed to do).
In addition this change lets us meet the spec's requirement to allow
COLLATE anywhere in the clauses of a ColumnDef, and it avoids unfriendly
behavior for constructs such as "foo::type COLLATE collation".
To do this, pull collation information out of TypeName and put it in
ColumnDef instead, thus reverting most of the collation-related changes in
parse_type.c's API. I made one additional structural change, which was to
use a ColumnDef as an intermediate node in AT_AlterColumnType AlterTableCmd
nodes. This provides enough room to get rid of the "transform" wart in
AlterTableCmd too, since the ColumnDef can carry the USING expression
easily enough.
Also fix some other minor bugs that have crept in in the same areas,
like failure to copy recently-added fields of ColumnDef in copyfuncs.c.
While at it, document the formerly secret ability to specify a collation
in ALTER TABLE ALTER COLUMN TYPE, ALTER TYPE ADD ATTRIBUTE, and
ALTER TYPE ALTER ATTRIBUTE TYPE; and correct some misstatements about
what the default collation selection will be when COLLATE is omitted.
BTW, the three-parameter form of format_type() should go away too,
since it just contributes to the confusion in this area; but I'll do
that in a separate patch.
This mostly just involves creating control, install, and
update-from-unpackaged scripts for them. However, I had to adjust plperl
and plpython to not share the same support functions between variants,
because we can't put the same function into multiple extensions.
catversion bump forced due to new contents of pg_pltemplate, and because
initdb now installs plpgsql as an extension not a bare language.
Add support for regression testing these as extensions not bare
languages.
Fix a couple of other issues that popped up while testing this: my initial
hack at pg_dump binary-upgrade support didn't work right, and we don't want
an extra schema permissions test after all.
Documentation changes still to come, but I'm committing now to see
whether the MSVC build scripts need work (likely they do).
(I'm not entirely sure that we've finished bikeshedding the syntax details,
but the functionality seems OK.)
Pavel Stehule, reviewed by Stephen Frost and Tom Lane
This adds collation support for columns and domains, a COLLATE clause
to override it per expression, and B-tree index support.
Peter Eisentraut
reviewed by Pavel Stehule, Itagaki Takahiro, Robert Haas, Noah Misch
src/pl/plpgsql/src/plerrcodes.h, src/include/utils/errcodes.h, and a
big chunk of errcodes.sgml are now automatically generated from a single
file, src/backend/utils/errcodes.txt.
Jan Urbański, reviewed by Tom Lane.
Previously reported as ERRCODE_ADMIN_SHUTDOWN, this case is now
reported as ERRCODE_T_R_DATABASE_DROPPED. No message text change.
Unlikely to happen on most servers, so low impact change to allow
session poolers to correctly handle this situation.
Tatsuo Ishii, edits by me, review by Robert Haas
My previous commit, 85cff3ce7f on
2010-12-25, failed to update errcodes.sgml or plerrcodes.h. This patch
corrects that oversight, per a gripe from Tom Lane, and also corrects
a typographical error.
Given a column reference foo.bar, where there is a composite plpgsql
variable foo but it doesn't contain a column bar, the pre-9.0 coding would
immediately throw a "record foo has no field bar" error. In 9.0 the parser
hook instead falls through to let the core parser see if it can resolve the
reference. If not, you get a complaint about "missing FROM-clause entry
for table foo", which while in some sense correct isn't terribly helpful.
Complicate things a bit so that we can throw the old error message if
neither the core parser nor the hook are able to resolve the column
reference, while not changing the behavior in any other case.
Per bug #5757 from Andrey Galkin.
Instead of using ExecPrepareExpr, call ExecInitExpr. The net change here
is that we don't apply expression_planner() to the expression tree. There
is no need to do so, because that tree is extracted from a fully planned
plancache entry, so all the needed work is already done. This reduces
the setup costs by about a factor of 2 according to some simple tests.
Oversight noted while fooling around with the simple-expression code for
previous fix.
In general, expression execution state trees aren't re-entrantly usable,
since functions can store private state information in them.
For efficiency reasons, plpgsql tries to cache and reuse state trees for
"simple" expressions. It can get away with that most of the time, but it
can fail if the state tree is dirty from a previous failed execution (as
in an example from Alvaro) or is being used recursively (as noted by me).
Fix by tracking whether a state tree is in use, and falling back to the
"non-simple" code path if so. This results in a pretty considerable speed
hit when the non-simple path is taken, but the available alternatives seem
even more unpleasant because they add overhead in the simple path. Per
idea from Heikki.
Back-patch to all supported branches.
This patch eliminates various bizarre behaviors caused by sloppy thinking
about the difference between a domain type and its underlying array type.
In particular, the operation of updating one element of such an array
has to be considered as yielding a value of the underlying array type,
*not* a value of the domain, because there's no assurance that the
domain's CHECK constraints are still satisfied. If we're intending to
store the result back into a domain column, we have to re-cast to the
domain type so that constraints are re-checked.
For similar reasons, such a domain can't be blindly matched to an ANYARRAY
polymorphic parameter, because the polymorphic function is likely to apply
array-ish operations that could invalidate the domain constraints. For the
moment, we just forbid such matching. We might later wish to insert an
automatic downcast to the underlying array type, but such a change should
also change matching of domains to ANYELEMENT for consistency.
To ensure that all such logic is rechecked, this patch removes the original
hack of setting a domain's pg_type.typelem field to match its base type;
the typelem will always be zero instead. In those places where it's really
okay to look through the domain type with no other logic changes, use the
newly added get_base_element_type function in place of get_element_type.
catversion bumped due to change in pg_type contents.
Per bug #5717 from Richard Huxton and subsequent discussion.
This patch adds the SQL-standard concept of an INSTEAD OF trigger, which
is fired instead of performing a physical insert/update/delete. The
trigger function is passed the entire old and/or new rows of the view,
and must figure out what to do to the underlying tables to implement
the update. So this feature can be used to implement updatable views
using trigger programming style rather than rule hacking.
In passing, this patch corrects the names of some columns in the
information_schema.triggers view. It seems the SQL committee renamed
them somewhere between SQL:99 and SQL:2003.
Dean Rasheed, reviewed by Bernd Helmle; some additional hacking by me.
Various places were testing TRIGGER_FIRED_BEFORE() where what they really
meant was !TRIGGER_FIRED_AFTER(), or vice versa. This needs to be cleaned
up because there are about to be more than two possible states.
We might want to note this in the 9.1 release notes as something for
trigger authors to double-check.
For consistency's sake I also changed some places that assumed that
TRIGGER_FIRED_FOR_ROW and TRIGGER_FIRED_FOR_STATEMENT are necessarily
mutually exclusive; that's not in immediate danger of breaking, but
it's still sloppier than it should be.
Extracted from Dean Rasheed's patch for triggers on views. I'm committing
this separately since it's an identifiable separate issue, and is the
only reason for the patch to touch most of these particular files.
Aside from being more forgiving, this prevents a rather surprising misbehavior
when the "wrong" order was used: the old code didn't throw a syntax error,
but absorbed the INTO clause into the last USING expression, which then did
strange things downstream.
Intentionally not changing the documentation; we'll continue to advertise
only the "standard" clause order.
Backpatch to 8.4, where the USING clause was added to EXECUTE.
It's not clear if this situation can occur in plpgsql other than via the
EXECUTE USING case Heikki illustrated, which I will shortly close off.
However, ignoring the intoClause if it's there is surely wrong, so let's
patch it for safety.
Backpatch to 8.3, which is as far back as this code has a PlannedStmt
to deal with. There might be another way to make an equivalent test
before that, but since this is just preventing hypothetical bugs,
I'm not going to obsess about it.
pointed out, it would need a 2nd pass after the whole query is processed to
correctly check that an unknown Param is coerced to the same target type
everywhere. Adding the 2nd pass would add a lot more code, which doesn't
seem worth the risk given that there isn't much of a use case for passing
unknown Params in the first place. The code would work without that check,
but it might be confusing and the behavior would be different from the
varparams case.
Instead, just coerce all unknown params in a PL/pgSQL USING clause to text.
That's simple, and is usually what users expect.
Revert the patch in CVS HEAD and master, and backpatch the new solution to
8.4. Unlike the previous solution, this applies easily to 8.4 too.
expressions. We need to deal with this when handling subscripts in an array
assignment, and also when catching an exception. In an Assert-enabled build
these omissions led to Assert failures, but I think in a normal build the
only consequence would be short-term memory leakage; which may explain why
this wasn't reported from the field long ago.
Back-patch to all supported versions. 7.4 doesn't have exceptions, but
otherwise these bugs go all the way back.
Heikki Linnakangas and Tom Lane
can be caught in the same places that could catch an ordinary RAISE ERROR
in the same location. The previous coding insisted on throwing the error
from the block containing the active exception handler; which is arguably
more surprising, and definitely unlike Oracle's behavior.
Not back-patching, since this is a pretty obscure corner case. The risk
of breaking somebody's code in a minor version update seems to outweigh
any possible benefit.
Piyush Newe, reviewed by David Fetter
While this hack arguably has some benefit in terms of making PL/pgsql's
line numbering match the programmer's expectations, it also makes
PL/pgsql inconsistent with the remaining PLs, making it difficult for
clients to reliably determine where the error actually is. On balance,
it seems better to be consistent.
Pavel Stehule
being used in a PL/pgSQL FOR loop is closed was inadequate, as Tom Lane
pointed out. The bug affects FOR statement variants too, because you can
close an implicitly created cursor too by guessing the "<unnamed portal X>"
name created for it.
To fix that, "pin" the portal to prevent it from being dropped while it's
being used in a PL/pgSQL FOR loop. Backpatch all the way to 7.4 which is
the oldest supported version.
of YYSTYPE, and hence returning the wrong answer for cases where a plpgsql
"unreserved keyword" really does conflict with a variable name. Obviously
I didn't test this enough :-(. Per bug #5524 from Peter Gagarinov.
might close the cursor, rendering the Portal pointer to it invalid.
Closing the cursor in the middle of the loop is not a very sensible thing
to do, but we must handle it gracefully and throw an error instead of
crashing.
even when the expression is a query that returns no rows.
So far as I can tell, the only caller that actually fails when a garbage
OID is returned is exec_stmt_case(), which is new in 8.4 --- in all other
cases, we might make a useless trip through casting logic, but we won't
fail since the isnull flag will be set. Hence, backpatch only to 8.4,
just in case there are apps out there that aren't expecting an error to
be thrown if the query returns more or less than one column. (Which seems
unlikely, since the error would be thrown if the query ever did return a
row; but it's possible there's some never-exercised code out there.)
Per report from Mario Splivalo.
section, throw an error message saying explicitly that the label must go
before DECLARE. Per investigation of a recent pgsql-novice question,
this code did not work as intended in any modern PG version, maybe not ever.
Allowing such a thing would only create ambiguity anyway, so it seems better
to remove it than fix it.
Per bug #5352, this helps to provide a useful error message if the user
tries to do something presently unsupported, namely use a rowtype variable
as a member of a multiple-item INTO list.
The purpose of this change is to eliminate the need for every caller
of SearchSysCache, SearchSysCacheCopy, SearchSysCacheExists,
GetSysCacheOid, and SearchSysCacheList to know the maximum number
of allowable keys for a syscache entry (currently 4). This will
make it far easier to increase the maximum number of keys in a
future release should we choose to do so, and it makes the code
shorter, too.
Design and review by Tom Lane.
that happens to be composite itself. Per bug #5314 from Oleg Serov.
Backpatch to 8.0 --- 7.4 has got too many other shortcomings in
composite-type support to make this worth worrying about in that branch.
This is the last EXECUTE-like plpgsql statement that was missing
the capability of inserting parameter values via USING.
Pavel Stehule, reviewed by Itagaki Takahiro
generic syntax error, when seeing "foo := something" and foo isn't recognized.
This buys back most of the helpfulness discarded in my previous patch by not
throwing errors when a qualified name appears to match a row variable but the
last component doesn't match any field of the row. It covers other cases
where our error messages left something to be desired, too.
field references in SQL expressions to have RECFIELD datum-array entries at
parse time. If it turns out that the reference is actually to a SQL column,
the RECFIELD entry is useless, but it costs little. This allows us to get rid
of the previous use of FieldSelect applied to a whole-row Param for the record
variable; which was not only slower than a direct RECFIELD reference, but
failed for references to system columns of a trigger's NEW or OLD record.
Per report and fix suggestion from Dean Rasheed.
PL/pgSQL function within an exception handler. Make sure we use the right
resource owner when we create the tuplestore to hold returned tuples.
Simplify tuplestore API so that the caller doesn't need to be in the right
memory context when calling tuplestore_put* functions. tuplestore.c
automatically switches to the memory context used when the tuplestore was
created. Tuplesort was already modified like this earlier. This patch also
removes the now useless MemoryContextSwitch calls from callers.
Report by Aleksei on pgsql-bugs on Dec 22 2009. Backpatch to 8.1, like
the previous patch that broke this.
support any indexable commutative operator, not just equality. Two rows
violate the exclusion constraint if "row1.col OP row2.col" is TRUE for
each of the columns in the constraint.
Jeff Davis, reviewed by Robert Haas
default be "throw error on conflict", as per discussions. The GUC variable
is plpgsql.variable_conflict, with values "error", "use_variable",
"use_column". The behavior can also be specified per-function by inserting
one of
#variable_conflict error
#variable_conflict use_variable
#variable_conflict use_column
at the start of the function body.
The 8.5 release notes will need to mention using "use_variable" to retain
backward-compatible behavior, although we should encourage people to migrate
to the much less mistake-prone "error" setting.
Update the plpgsql documentation to match this and other recent changes.
directly. This was a lot of trouble, but should be worth it in terms of
not having to keep the plpgsql lexer in step with core anymore. In addition
the handling of keywords is significantly better-structured, allowing us to
de-reserve a number of words that plpgsql formerly treated as reserved.
yytext. This is a necessary change if we're going to have a lexer interface
layer that does lookahead, since yytext won't necessarily be in step with
what the grammar thinks is the current token. yylval and yylloc should
be the only side-variables that we need to manage when doing lookahead.
like the core parser's code. In particular, track locations at the character
rather than line level during parsing, allowing many more parse-time error
conditions to be reported with precise error pointers rather than just
"near line N".
Also, exploit the fact that we no longer need to substitute $N for variable
references by making extracted SQL queries and expressions be exact copies
of subranges of the function text, rather than having random whitespace
changes within them. This makes it possible to directly map parse error
positions from the core parser onto positions in the function text, which
lets us report them without the previous kluge of showing the intermediate
internal-query form. (Later it might be good to do that for core
parse-analysis errors too, but this patch is just touching plpgsql's
lexer/parser, not what happens at runtime.)
In passing, make plpgsql's lexer use palloc not malloc.
These changes make plpgsql's parse-time error reports noticeably nicer
(as illustrated by the regression test changes), and will also simplify
the planned removal of plpgsql's separate lexer by reducing the impedance
mismatch between what it does and what the core lexer does.
* Pull the responsibility for %TYPE and %ROWTYPE out of the scanner,
letting read_datatype manage it instead.
* Avoid unnecessary scanner-driven lookups of plpgsql variables in
places where it's not needed, which is actually most of the time;
we do not need it in DECLARE sections nor in text that is a SQL
query or expression.
* Rationalize the set of token types returned by the scanner:
distinguishing T_SCALAR, T_RECORD, T_ROW seems to complicate the grammar
in more places than it simplifies it, so merge these into one
token type T_DATUM; but split T_ERROR into T_DBLWORD and T_TRIPWORD
for clarity and simplicity of later processing.
Some of this will need to be revisited again when we try to make
plpgsql use the core scanner, but this patch gets some of the bigger
stumbling blocks out of the way.
into SQL expressions, to using the newly added parser callback hooks.
This allows us to do the substitutions in a more semantically-aware way:
a variable reference will only be recognized where it can validly go,
ie, a place where a column value or parameter would be legal, instead of
the former behavior that would replace any textual match including
table names and column aliases (leading to syntax errors later on).
A release-note-worthy fine point is that plpgsql variable names that match
fully-reserved words will now need to be quoted.
This commit preserves the former behavior that variable references take
precedence over any possible match to a column name. The infrastructure
is in place to support the reverse precedence or throwing an error on
ambiguity, but those behaviors aren't accessible yet.
Most of the code changes here are associated with making the namespace
data structure persist so that it can be consulted at runtime, instead
of throwing it away at the end of initial function parsing.
The plpgsql scanner is still doing name lookups, but that behavior is
now irrelevant for SQL expressions. A future commit will deal with
removing unnecessary lookups.
behavior, and is so little used that no one has been interested in fixing it.
To ensure that possible uses are covered, remove the ALIAS declaration's
arbitrary restriction that only $n identifiers can be aliased.
(We could alternatively make RENAME act just like ALIAS, but per discussion
having two different ways to do the same thing is probably more confusing than
helpful.)
As proof of concept, modify plpgsql to use the hooks. plpgsql is still
inserting $n symbols textually, but the "back end" of the parsing process now
goes through the ParamRef hook instead of using a fixed parameter-type array,
and then execution only fetches actually-referenced parameters, using a hook
added to ParamListInfo.
Although there's a lot left to be done in plpgsql, this already cures the
"if (TG_OP = 'INSERT' and NEW.foo ...)" problem, as illustrated by the
changed regression test.