Commit graph

756 commits

Author SHA1 Message Date
Andres Freund
b8d7f053c5 Faster expression evaluation and targetlist projection.
This replaces the old, recursive tree-walk based evaluation, with
non-recursive, opcode dispatch based, expression evaluation.
Projection is now implemented as part of expression evaluation.

This both leads to significant performance improvements, and makes
future just-in-time compilation of expressions easier.

The speed gains primarily come from:
- non-recursive implementation reduces stack usage / overhead
- simple sub-expressions are implemented with a single jump, without
  function calls
- sharing some state between different sub-expressions
- reduced amount of indirect/hard to predict memory accesses by laying
  out operation metadata sequentially; including the avoidance of
  nearly all of the previously used linked lists
- more code has been moved to expression initialization, avoiding
  constant re-checks at evaluation time

Future just-in-time compilation (JIT) has become easier, as
demonstrated by released patches intended to be merged in a later
release, for primarily two reasons: Firstly, due to a stricter split
between expression initialization and evaluation, less code has to be
handled by the JIT. Secondly, due to the non-recursive nature of the
generated "instructions", less performance-critical code-paths can
easily be shared between interpreted and compiled evaluation.

The new framework allows for significant future optimizations. E.g.:
- basic infrastructure for to later reduce the per executor-startup
  overhead of expression evaluation, by caching state in prepared
  statements.  That'd be helpful in OLTPish scenarios where
  initialization overhead is measurable.
- optimizing the generated "code". A number of proposals for potential
  work has already been made.
- optimizing the interpreter. Similarly a number of proposals have
  been made here too.

The move of logic into the expression initialization step leads to some
backward-incompatible changes:
- Function permission checks are now done during expression
  initialization, whereas previously they were done during
  execution. In edge cases this can lead to errors being raised that
  previously wouldn't have been, e.g. a NULL array being coerced to a
  different array type previously didn't perform checks.
- The set of domain constraints to be checked, is now evaluated once
  during expression initialization, previously it was re-built
  every time a domain check was evaluated. For normal queries this
  doesn't change much, but e.g. for plpgsql functions, which caches
  ExprStates, the old set could stick around longer.  The behavior
  around might still change.

Author: Andres Freund, with significant changes by Tom Lane,
	changes by Heikki Linnakangas
Reviewed-By: Tom Lane, Heikki Linnakangas
Discussion: https://postgr.es/m/20161206034955.bh33paeralxbtluv@alap3.anarazel.de
2017-03-25 14:52:06 -07:00
Robert Haas
61c2e1a95f Improve access to parallel query from procedural languages.
In SQL, the ability to use parallel query was previous contingent on
fcache->readonly_func, which is only set for non-volatile functions;
but the volatility of a function has no bearing on whether queries
inside it can use parallelism.  Remove that condition.

SPI_execute and SPI_execute_with_args always run the plan just once,
though not necessarily to completion.  Given the changes in commit
691b8d5928, it's sensible to pass
CURSOR_OPT_PARALLEL_OK here, so do that.  This improves access to
parallelism for any caller that uses these functions to execute
queries.  Such callers include plperl, plpython, pltcl, and plpgsql,
though it's not the case that they all use these functions
exclusively.

In plpgsql, allow parallel query for plain SELECT queries (as
opposed to PERFORM, which already worked) and for plain expressions
(which probably won't go through the executor at all, because they
will likely be simple expressions, but if they do then this helps).

Rafia Sabih and Robert Haas, reviewed by Dilip Kumar and Amit Kapila

Discussion: http://postgr.es/m/CAOGQiiMfJ+4SQwgG=6CVHWoisiU0+7jtXSuiyXBM3y=A=eJzmg@mail.gmail.com
2017-03-24 14:46:33 -04:00
Robert Haas
f120b614e0 plpgsql: Don't generate parallel plans for RETURN QUERY.
Commit 7aea8e4f2d allowed a parallel
plan to be generated when for a RETURN QUERY or RETURN QUERY EXECUTE
statement in a PL/pgsql block, but that's a bad idea because plplgsql
asks the executor for 50 rows at a time.  That means that we'll always
be running serially a plan that was intended for parallel execution,
which is not a good idea.  Fix by not requesting a parallel plan from
the outset.

Per discussion, back-patch to 9.6.  There is a slight risk that, due
to optimizer error, somebody could have a case where the parallel plan
executed serially is actually faster than the supposedly-best serial
plan, but the consensus seems to be that that's not sufficient
justification for leaving 9.6 unpatched.

Discussion: http://postgr.es/m/CA+TgmoZ_ZuH+auEeeWnmtorPsgc_SmP+XWbDsJ+cWvWBSjNwDQ@mail.gmail.com
Discussion: http://postgr.es/m/CA+TgmobXEhvHbJtWDuPZM9bVSLiTj-kShxQJ2uM5GPDze9fRYA@mail.gmail.com
2017-03-24 12:30:39 -04:00
Peter Eisentraut
f97a028d8e Spelling fixes in code comments
From: Josh Soref <jsoref@gmail.com>
2017-03-14 12:58:39 -04:00
Tom Lane
08da52859a Bring plpgsql into line with header inclusion policy.
We have a project policy that every .c file should start by including
postgres.h, postgres_fe.h, or c.h as appropriate; and then there is no
need for any .h file to explicitly include any of these.  (The core
reason for this policy is to make it easy to verify that pg_config_os.h
is included before any system headers such as <stdio.h>; without that,
we have portability issues on some platforms due to variation in largefile
options across different modules in the backend.  Also, if .h files were
responsible for choosing which of these key headers to include, .h files
that need to be includable in either frontend or backend compiles would be
in trouble.)

plpgsql was blithely ignoring this policy, so whack it upside the head
until it complies.  I also chose to standardize on including plpgsql's
own .h files after all core-system headers that it pulls in.  That
could've been done either way, but this way seems saner.

Discussion: https://postgr.es/m/CAEepm=2zCoeq3QxVwhS5DFeUh=yU6z81pbWMgfOB8OzyiBwxzw@mail.gmail.com
Discussion: https://postgr.es/m/11634.1488932128@sss.pgh.pa.us
2017-03-08 17:21:08 -05:00
Peter Eisentraut
38d103763d Make more use of castNode() 2017-02-21 11:59:09 -05:00
Tom Lane
7afd56c3c6 Use castNode() in a bunch of statement-list-related code.
When I wrote commit ab1f0c822, I really missed the castNode() macro that
Peter E. had proposed shortly before.  This back-fills the uses I would
have put it to.  It's probably not all that significant, but there are
more assertions here than there were before, and conceivably they will
help catch any bugs associated with those representation changes.

I left behind a number of usages like "(Query *) copyObject(query_var)".
Those could have been converted as well, but Peter has proposed another
notational improvement that would handle copyObject cases automatically,
so I let that be for now.
2017-01-26 22:09:34 -05:00
Peter Eisentraut
f21a563d25 Move some things from builtins.h to new header files
This avoids that builtins.h has to include additional header files.
2017-01-20 20:29:53 -05:00
Andres Freund
ea15e18677 Remove obsoleted code relating to targetlist SRF evaluation.
Since 69f4b9c plain expression evaluation (and thus normal projection)
can't return sets of tuples anymore. Thus remove code dealing with
that possibility.

This will require adjustments in external code using
ExecEvalExpr()/ExecProject() - that should neither be hard nor very
common.

Author: Andres Freund and Tom Lane
Discussion: https://postgr.es/m/20160822214023.aaxz5l4igypowyri@alap3.anarazel.de
2017-01-19 14:40:41 -08:00
Tom Lane
ab1f0c8225 Change representation of statement lists, and add statement location info.
This patch makes several changes that improve the consistency of
representation of lists of statements.  It's always been the case
that the output of parse analysis is a list of Query nodes, whatever
the types of the individual statements in the list.  This patch brings
similar consistency to the outputs of raw parsing and planning steps:

* The output of raw parsing is now always a list of RawStmt nodes;
the statement-type-dependent nodes are one level down from that.

* The output of pg_plan_queries() is now always a list of PlannedStmt
nodes, even for utility statements.  In the case of a utility statement,
"planning" just consists of wrapping a CMD_UTILITY PlannedStmt around
the utility node.  This list representation is now used in Portal and
CachedPlan plan lists, replacing the former convention of intermixing
PlannedStmts with bare utility-statement nodes.

Now, every list of statements has a consistent head-node type depending
on how far along it is in processing.  This allows changing many places
that formerly used generic "Node *" pointers to use a more specific
pointer type, thus reducing the number of IsA() tests and casts needed,
as well as improving code clarity.

Also, the post-parse-analysis representation of DECLARE CURSOR is changed
so that it looks more like EXPLAIN, PREPARE, etc.  That is, the contained
SELECT remains a child of the DeclareCursorStmt rather than getting flipped
around to be the other way.  It's now true for both Query and PlannedStmt
that utilityStmt is non-null if and only if commandType is CMD_UTILITY.
That allows simplifying a lot of places that were testing both fields.
(I think some of those were just defensive programming, but in many places,
it was actually necessary to avoid confusing DECLARE CURSOR with SELECT.)

Because PlannedStmt carries a canSetTag field, we're also able to get rid
of some ad-hoc rules about how to reconstruct canSetTag for a bare utility
statement; specifically, the assumption that a utility is canSetTag if and
only if it's the only one in its list.  While I see no near-term need for
relaxing that restriction, it's nice to get rid of the ad-hocery.

The API of ProcessUtility() is changed so that what it's passed is the
wrapper PlannedStmt not just the bare utility statement.  This will affect
all users of ProcessUtility_hook, but the changes are pretty trivial; see
the affected contrib modules for examples of the minimum change needed.
(Most compilers should give pointer-type-mismatch warnings for uncorrected
code.)

There's also a change in the API of ExplainOneQuery_hook, to pass through
cursorOptions instead of expecting hook functions to know what to pick.
This is needed because of the DECLARE CURSOR changes, but really should
have been done in 9.6; it's unlikely that any extant hook functions
know about using CURSOR_OPT_PARALLEL_OK.

Finally, teach gram.y to save statement boundary locations in RawStmt
nodes, and pass those through to Query and PlannedStmt nodes.  This allows
more intelligent handling of cases where a source query string contains
multiple statements.  This patch doesn't actually do anything with the
information, but a follow-on patch will.  (Passing this information through
cleanly is the true motivation for these changes; while I think this is all
good cleanup, it's unlikely we'd have bothered without this end goal.)

catversion bump because addition of location fields to struct Query
affects stored rules.

This patch is by me, but it owes a good deal to Fabien Coelho who did
a lot of preliminary work on the problem, and also reviewed the patch.

Discussion: https://postgr.es/m/alpine.DEB.2.20.1612200926310.29821@lancre
2017-01-14 16:02:35 -05:00
Bruce Momjian
1d25779284 Update copyright via script for 2017 2017-01-03 13:48:53 -05:00
Tom Lane
55caaaeba8 Improve handling of array elements as getdiag_targets and cursor_variables.
There's no good reason why plpgsql's GET DIAGNOSTICS statement can't
support an array element as target variable, since the execution code
already uses the generic exec_assign_value() function to assign to it.
Hence, refactor the grammar to allow that, by making getdiag_target
depend on the assign_var production.

Ideally we'd also let a cursor_variable expand to an element of a
refcursor[] array, but that's substantially harder since those statements
also have to handle bound-cursor-variable cases.  For now, just make sure
the reported error is sensible, ie "cursor variable must be a simple
variable" not "variable must be of type cursor or refcursor".  The latter
was quite confusing from the user's viewpoint, since what he wrote
satisfies the claimed restriction.

Per bug #14463 from Zhou Digoal.  Given the lack of previous complaints,
I see no need for a back-patch.

Discussion: https://postgr.es/m/20161213152548.14897.81245@wrigleys.postgresql.org
2016-12-13 16:33:03 -05:00
Tom Lane
1833f1a1c3 Simplify code by getting rid of SPI_push, SPI_pop, SPI_restore_connection.
The idea behind SPI_push was to allow transitioning back into an
"unconnected" state when a SPI-using procedure calls unrelated code that
might or might not invoke SPI.  That sounds good, but in practice the only
thing it does for us is to catch cases where a called SPI-using function
forgets to call SPI_connect --- which is a highly improbable failure mode,
since it would be exposed immediately by direct testing of said function.
As against that, we've had multiple bugs induced by forgetting to call
SPI_push/SPI_pop around code that might invoke SPI-using functions; these
are much harder to catch and indeed have gone undetected for years in some
cases.  And we've had to band-aid around some problems of this ilk by
introducing conditional push/pop pairs in some places, which really kind
of defeats the purpose altogether; if we can't draw bright lines between
connected and unconnected code, what's the point?

Hence, get rid of SPI_push[_conditional], SPI_pop[_conditional], and the
underlying state variable _SPI_curid.  It turns out SPI_restore_connection
can go away too, which is a nice side benefit since it was never more than
a kluge.  Provide no-op macros for the deleted functions so as to avoid an
API break for external modules.

A side effect of this removal is that SPI_palloc and allied functions no
longer permit being called when unconnected; they'll throw an error
instead.  The apparent usefulness of the previous behavior was a mirage
as well, because it was depended on by only a few places (which I fixed in
preceding commits), and it posed a risk of allocations being unexpectedly
long-lived if someone forgot a SPI_push call.

Discussion: <20808.1478481403@sss.pgh.pa.us>
2016-11-08 17:39:57 -05:00
Tom Lane
9257f07872 Replace uses of SPI_modifytuple that intend to allocate in current context.
Invent a new function heap_modify_tuple_by_cols() that is functionally
equivalent to SPI_modifytuple except that it always allocates its result
by simple palloc.  I chose however to make the API details a bit more
like heap_modify_tuple: pass a tupdesc rather than a Relation, and use
bool convention for the isnull array.

Use this function in place of SPI_modifytuple at all call sites where the
intended behavior is to allocate in current context.  (There actually are
only two call sites left that depend on the old behavior, which makes me
wonder if we should just drop this function rather than keep it.)

This new function is easier to use than heap_modify_tuple() for purposes
of replacing a single column (or, really, any fixed number of columns).
There are a number of places where it would simplify the code to change
over, but I resisted that temptation for the moment ... everywhere except
in plpgsql's exec_assign_value(); changing that might offer some small
performance benefit, so I did it.

This is on the way to removing SPI_push/SPI_pop, but it seems like
good code cleanup in its own right.

Discussion: <9633.1478552022@sss.pgh.pa.us>
2016-11-08 15:36:44 -05:00
Tom Lane
fc8b81a291 Need to do SPI_push/SPI_pop around expression evaluation in plpgsql.
We must do this in case the expression evaluation results in calling
another plpgsql function (or, really, anything using SPI).  I missed
the need for this when I converted exec_cast_value() from doing a
simple InputFunctionCall() to doing ExecEvalExpr() in commit 1345cc67b.
There is a SPI_push_conditional in InputFunctionCall(), so that there
was no bug before that.

Per bug #14414 from Marcos Castedo.  Add a regression test based on his
example, which was that a plpgsql function in a domain check constraint
didn't work when assigning to a domain-type variable within plpgsql.

Report: <20161106010947.1387.66380@wrigleys.postgresql.org>
2016-11-06 12:09:36 -05:00
Tom Lane
a4c35ea1c2 Improve parser's and planner's handling of set-returning functions.
Teach the parser to reject misplaced set-returning functions during parse
analysis using p_expr_kind, in much the same way as we do for aggregates
and window functions (cf commit eaccfded9).  While this isn't complete
(it misses nesting-based restrictions), it's much better than the previous
error reporting for such cases, and it allows elimination of assorted
ad-hoc expression_returns_set() error checks.  We could add nesting checks
later if it seems important to catch all cases at parse time.

There is one case the parser will now throw error for although previous
versions allowed it, which is SRFs in the tlist of an UPDATE.  That never
behaved sensibly (since it's ill-defined which generated row should be
used to perform the update) and it's hard to see why it should not be
treated as an error.  It's a release-note-worthy change though.

Also, add a new Query field hasTargetSRFs reporting whether there are
any SRFs in the targetlist (including GROUP BY/ORDER BY expressions).
The parser can now set that basically for free during parse analysis,
and we can use it in a number of places to avoid expression_returns_set
searches.  (There will be more such checks soon.)  In some places, this
allows decontorting the logic since it's no longer expensive to check for
SRFs in the tlist --- so I made the checks parallel to the handling of
hasAggs/hasWindowFuncs wherever it seemed appropriate.

catversion bump because adding a Query field changes stored rules.

Andres Freund and Tom Lane

Discussion: <24639.1473782855@sss.pgh.pa.us>
2016-09-13 13:54:24 -04:00
Peter Eisentraut
e0013deb59 Make better use of existing enums in plpgsql
plpgsql.h defines a number of enums, but most of the code passes them
around as ints.  Update structs and function prototypes to take enum
types instead.  This clarifies the struct definitions in plpgsql.h in
particular.

Reviewed-by: Pavel Stehule <pavel.stehule@gmail.com>
2016-09-09 12:00:00 -04:00
Tom Lane
ea268cdc9a Add macros to make AllocSetContextCreate() calls simpler and safer.
I found that half a dozen (nearly 5%) of our AllocSetContextCreate calls
had typos in the context-sizing parameters.  While none of these led to
especially significant problems, they did create minor inefficiencies,
and it's now clear that expecting people to copy-and-paste those calls
accurately is not a great idea.  Let's reduce the risk of future errors
by introducing single macros that encapsulate the common use-cases.
Three such macros are enough to cover all but two special-purpose contexts;
those two calls can be left as-is, I think.

While this patch doesn't in itself improve matters for third-party
extensions, it doesn't break anything for them either, and they can
gradually adopt the simplified notation over time.

In passing, change TopMemoryContext to use the default allocation
parameters.  Formerly it could only be extended 8K at a time.  That was
probably reasonable when this code was written; but nowadays we create
many more contexts than we did then, so that it's not unusual to have a
couple hundred K in TopMemoryContext, even without considering various
dubious code that sticks other things there.  There seems no good reason
not to let it use growing blocks like most other contexts.

Back-patch to 9.6, mostly because that's still close enough to HEAD that
it's easy to do so, and keeping the branches in sync can be expected to
avoid some future back-patching pain.  The bugs fixed by these changes
don't seem to be significant enough to justify fixing them further back.

Discussion: <21072.1472321324@sss.pgh.pa.us>
2016-08-27 17:50:38 -04:00
Tom Lane
5697522d84 In plpgsql, don't try to convert int2vector or oidvector to expanded array.
These types are storage-compatible with real arrays, but they don't support
toasting, so of course they can't support expansion either.

Per bug #14289 from Michael Overmeyer.  Back-patch to 9.5 where expanded
arrays were introduced.

Report: <20160818174414.1529.37913@wrigleys.postgresql.org>
2016-08-18 14:49:08 -04:00
Peter Eisentraut
9f31e45a6d Improve formatting of comments in plpgsql.h
This file had some unusual comment layout.  Most of the comments
introducing structs ended up to the right of the screen and following
the start of the struct.  Some comments for struct members ended up
after the member definition.

Fix that by moving comments consistently before what they are
describing.  Also add missing struct tags where missing so that it is
easier to tell what the struct is.
2016-08-18 12:00:00 -04:00
Tom Lane
bfaaacc805 Improve plpgsql's memory management to fix some function-lifespan leaks.
In some cases, exiting out of a plpgsql statement due to an error, then
catching the error in a surrounding exception block, led to leakage of
temporary data the statement was working with, because we kept all such
data in the function-lifespan SPI Proc context.  Iterating such behavior
many times within one function call thus led to noticeable memory bloat.

To fix, create an additional memory context meant to have statement
lifespan.  Since many plpgsql statements, particularly the simpler/more
common ones, don't need this, create it only on demand.  Reset this context
at the end of any statement that uses it, and arrange for exception cleanup
to reset it too, thereby fixing the memory-leak issue.  Allow a stack of
such contexts to exist to handle cases where a compound statement needs
statement-lifespan data that persists across calls of inner statements.

While at it, clean up code and improve comments referring to the existing
short-term memory context, which by plpgsql convention is the per-tuple
context of the eval_econtext ExprContext.  We now uniformly refer to that
as the eval_mcontext, whereas the new statement-lifespan memory contexts
are called stmt_mcontext.

This change adds some context-creation overhead, but on the other hand
it allows removal of some retail pfree's in favor of context resets.
On balance it seems to be about a wash performance-wise.

In principle this is a bug fix, but it seems too invasive for a back-patch,
and the infrequency of complaints weighs against taking the risk in the
back branches.  So we'll fix it only in HEAD, at least for now.

Tom Lane, reviewed by Pavel Stehule

Discussion: <17863.1469142152@sss.pgh.pa.us>
2016-08-17 14:51:10 -04:00
Tom Lane
0bb51aa967 Improve parsetree representation of special functions such as CURRENT_DATE.
We implement a dozen or so parameterless functions that the SQL standard
defines special syntax for.  Up to now, that was done by converting them
into more or less ad-hoc constructs such as "'now'::text::date".  That's
messy for multiple reasons: it exposes what should be implementation
details to users, and performance is worse than it needs to be in several
cases.  To improve matters, invent a new expression node type
SQLValueFunction that can represent any of these parameterless functions.

Bump catversion because this changes stored parsetrees for rules.

Discussion: <30058.1463091294@sss.pgh.pa.us>
2016-08-16 20:33:01 -04:00
Peter Eisentraut
34927b2920 Translation updates
Source-Git-URL: git://git.postgresql.org/git/pgtranslation/messages.git
Source-Git-Hash: cda21c1d7b160b303dc21dfe9d4169f2c8064c60
2016-08-08 11:08:00 -04:00
Tom Lane
baebab3ace Allow IMPORT FOREIGN SCHEMA within pl/pgsql.
Since IMPORT FOREIGN SCHEMA has an INTO clause, pl/pgsql needs to be
aware of that and avoid capturing the INTO as an INTO-variables clause.
This isn't hard, though it's annoying to have to make IMPORT a plpgsql
keyword just for this.  (Fortunately, we have the infrastructure now
to make it an unreserved keyword, so at least this shouldn't break any
existing pl/pgsql code.)

Per report from Merlin Moncure.  Back-patch to 9.5 where IMPORT FOREIGN
SCHEMA was introduced.

Report: <CAHyXU0wpHf2bbtKGL1gtUEFATCY86r=VKxfcACVcTMQ70mCyig@mail.gmail.com>
2016-07-12 18:07:03 -04:00
Tom Lane
1fe1204e87 Add missing check for malloc failure in plpgsql_extra_checks_check_hook().
Per report from Andreas Seltenreich.  Back-patch to affected versions.

Report: <874m8nn0hv.fsf@elite.ansel.ydns.eu>
2016-06-20 15:36:54 -04:00
Peter Eisentraut
47981a4665 Translation updates
Source-Git-URL: git://git.postgresql.org/git/pgtranslation/messages.git
Source-Git-Hash: 0c374f8d25ed31833a10d24252bc928d41438838
2016-06-20 09:48:08 -04:00
Robert Haas
4bc424b968 pgindent run for 9.6 2016-06-09 18:02:36 -04:00
Peter Eisentraut
48aaba4acf Translation updates
Source-Git-URL: git://git.postgresql.org/git/pgtranslation/messages.git
Source-Git-Hash: 17bf3e8564abf600274789fcc90e72532d5e7c05
2016-05-09 10:04:41 -04:00
Tom Lane
23a27b039d Widen query numbers-of-tuples-processed counters to uint64.
This patch widens SPI_processed, EState's es_processed field, PortalData's
portalPos field, FuncCallContext's call_cntr and max_calls fields,
ExecutorRun's count argument, PortalRunFetch's result, and the max number
of rows in a SPITupleTable to uint64, and deals with (I hope) all the
ensuing fallout.  Some of these values were declared uint32 before, and
others "long".

I also removed PortalData's posOverflow field, since that logic seems
pretty useless given that portalPos is now always 64 bits.

The user-visible results are that command tags for SELECT etc will
correctly report tuple counts larger than 4G, as will plpgsql's GET
GET DIAGNOSTICS ... ROW_COUNT command.  Queries processing more tuples
than that are still not exactly the norm, but they're becoming more
common.

Most values associated with FETCH/MOVE distances, such as PortalRun's count
argument and the count argument of most SPI functions that have one, remain
declared as "long".  It's not clear whether it would be worth promoting
those to int64; but it would definitely be a large dollop of additional
API churn on top of this, and it would only help 32-bit platforms which
seem relatively less likely to see any benefit.

Andreas Scherbaum, reviewed by Christian Ullrich, additional hacking by me
2016-03-12 16:05:29 -05:00
Magnus Hagander
6c90996a4c Add prefix to pl/pgsql global variables and functions
Rename pl/pgsql global variables to always have a plpgsql_ prefix,
so they don't conflict with other shared libraries loaded.
2016-03-03 10:45:59 +01:00
Bruce Momjian
ee94300446 Update copyright for 2016
Backpatch certain files through 9.1
2016-01-02 13:33:40 -05:00
Robert Haas
1efc7e5382 Fix problems with ParamListInfo serialization mechanism.
Commit d1b7c1ffe7 introduced a mechanism
for serializing a ParamListInfo structure to be passed to a parallel
worker.  However, this mechanism failed to handle external expanded
values, as pointed out by Noah Misch.  Repair.

Moreover, plpgsql_param_fetch requires adjustment because the
serialization mechanism needs it to skip evaluating unused parameters
just as we would do when it is called from copyParamList, but params
== estate->paramLI in that case.  To fix, make the bms_is_member test
in that function unconditional.

Finally, have setup_param_list set a new ParamListInfo field,
paramMask, to the parameters actually used in the expression, so that
we don't try to fetch those that are not needed when serializing a
parameter list.  This isn't necessary for correctness, but it makes
the performance of the parallel executor code comparable to what we
do for cases involving cursors.

Design suggestions and extensive review by Noah Misch.  Patch by me.
2015-11-02 18:11:29 -05:00
Robert Haas
7aea8e4f2d Determine whether it's safe to attempt a parallel plan for a query.
Commit 924bcf4f16 introduced a framework
for parallel computation in PostgreSQL that makes most but not all
built-in functions safe to execute in parallel mode.  In order to have
parallel query, we'll need to be able to determine whether that query
contains functions (either built-in or user-defined) that cannot be
safely executed in parallel mode.  This requires those functions to be
labeled, so this patch introduces an infrastructure for that.  Some
functions currently labeled as safe may need to be revised depending on
how pending issues related to heavyweight locking under paralllelism
are resolved.

Parallel plans can't be used except for the case where the query will
run to completion.  If portal execution were suspended, the parallel
mode restrictions would need to remain in effect during that time, but
that might make other queries fail.  Therefore, this patch introduces
a framework that enables consideration of parallel plans only when it
is known that the plan will be run to completion.  This probably needs
some refinement; for example, at bind time, we do not know whether a
query run via the extended protocol will be execution to completion or
run with a limited fetch count.  Having the client indicate its
intentions at bind time would constitute a wire protocol break.  Some
contexts in which parallel mode would be safe are not adjusted by this
patch; the default is not to try parallel plans except from call sites
that have been updated to say that such plans are OK.

This commit doesn't introduce any parallel paths or plans; it just
provides a way to determine whether they could potentially be used.
I'm committing it on the theory that the remaining parallel sequential
scan patches will also get committed to this release, hopefully in the
not-too-distant future.

Robert Haas and Amit Kapila.  Reviewed (in earlier versions) by Noah
Misch.
2015-09-16 15:38:47 -04:00
Tom Lane
0426f349ef Rearrange the handling of error context reports.
Remove the code in plpgsql that suppressed the innermost line of CONTEXT
for messages emitted by RAISE commands.  That was never more than a quick
backwards-compatibility hack, and it's pretty silly in cases where the
RAISE is nested in several levels of function.  What's more, it violated
our design theory that verbosity of error reports should be controlled
on the client side not the server side.

To alleviate the resulting noise increase, introduce a feature in libpq
and psql whereby the CONTEXT field of messages can be suppressed, either
always or only for non-error messages.  Printing CONTEXT for errors only
is now their default behavior.

The actual code changes here are pretty small, but the effects on the
regression test outputs are widespread.  I had to edit some of the
alternative expected outputs by hand; hopefully the buildfarm will soon
find anything I fat-fingered.

In passing, fix up (again) the output line counts in psql's various
help displays.  Add some commentary about how to verify them.

Pavel Stehule, reviewed by Petr Jelínek, Jeevan Chalke, and others
2015-09-05 11:58:33 -04:00
Tom Lane
781ed2bfa3 Further tweak wording of error messages about bad CONTINUE/EXIT statements.
Per discussion, a little more verbosity seems called for.
2015-08-25 14:06:13 -04:00
Tom Lane
18391a8f06 Tweak wording of syntax error messages about bad CONTINUE/EXIT statements.
Try to avoid any possible confusion about what these messages mean.
2015-08-23 17:34:47 -04:00
Tom Lane
fcdfce6820 Detect mismatched CONTINUE and EXIT statements at plpgsql compile time.
With a bit of tweaking of the compile namestack data structure, we can
verify at compile time whether a CONTINUE or EXIT is legal.  This is
surely better than leaving it to runtime, both because earlier is better
and because we can issue a proper error pointer.  Also, we can get rid
of the ad-hoc old way of detecting the problem, which only took care of
CONTINUE not EXIT.

Jim Nasby, adjusted a bit by me
2015-08-21 20:17:19 -04:00
Tom Lane
2edb949115 Fix a few bogus statement type names in plpgsql error messages.
plpgsql's error location context messages ("PL/pgSQL function fn-name line
line-no at stmt-type") would misreport a CONTINUE statement as being an
EXIT, and misreport a MOVE statement as being a FETCH.  These are clear
bugs that have been there a long time, so back-patch to all supported
branches.

In addition, in 9.5 and HEAD, change the description of EXECUTE from
"EXECUTE statement" to just plain EXECUTE; there seems no good reason why
this statement type should be described differently from others that have
a well-defined head keyword.  And distinguish GET STACKED DIAGNOSTICS from
plain GET DIAGNOSTICS.  These are a bit more of a judgment call, and also
affect existing regression-test outputs, so I did not back-patch into
stable branches.

Pavel Stehule and Tom Lane
2015-08-18 19:22:37 -04:00
Tom Lane
d3eaab3ef0 Fix performance bug from conflict between two previous improvements.
My expanded-objects patch (commit 1dc5ebc907) included code to make
plpgsql pass expanded-object variables as R/W pointers to certain functions
that are trusted for modifying such variables in-place.  However, that
optimization got broken by commit 6c82d8d1fd, which arranged to share
a single ParamListInfo across most expressions evaluated by a plpgsql
function.  We don't want a R/W pointer to be passed to other functions
just because we decided one function was safe!  Fortunately, the breakage
was in the other direction, of never passing a R/W pointer at all, because
we'd always have pre-initialized the shared array slot with a R/O pointer.
So it was still functionally correct, but we were back to O(N^2)
performance for repeated use of "a := a || x".  To fix, force an unshared
param array to be used when the R/W param optimization is active.

Commit 6c82d8d1fd is in HEAD only, so no need for a back-patch.
2015-08-17 19:39:35 -04:00
Tom Lane
83604cc423 Repair unsafe use of shared typecast-lookup table in plpgsql DO blocks.
DO blocks use private simple_eval_estates to avoid intra-transaction memory
leakage, cf commit c7b849a89.  I had forgotten about that while writing
commit 0fc94a5ba, but it means that expression execution trees created
within a DO block disappear immediately on exiting the DO block, and hence
can't safely be linked into plpgsql's session-wide cast hash table.
To fix, give a DO block a private cast hash table to go with its private
simple_eval_estate.  This is less efficient than one could wish, since
DO blocks can no longer share any cast lookup work with other plpgsql
execution, but it shouldn't be too bad; in any case it's no worse than
what happened in DO blocks before commit 0fc94a5ba.

Per bug #13571 from Feike Steenbergen.  Preliminary analysis by
Oleksandr Shulgin.
2015-08-15 12:00:48 -04:00
Tom Lane
0fc94a5bab Repair mishandling of cached cast-expression trees in plpgsql.
In commit 1345cc67bb, I introduced caching
of expressions representing type-cast operations into plpgsql.  However,
I supposed that I could cache both the expression trees and the evaluation
state trees derived from them for the life of the session.  This doesn't
work, because we execute the expressions in plpgsql's simple_eval_estate,
which has an ecxt_per_query_memory that is only transaction-lifespan.
Therefore we can end up putting pointers into the evaluation state tree
that point to transaction-lifespan memory; in particular this happens if
the cast expression calls a SQL-language function, as reported by Geoff
Winkless.

The minimum-risk fix seems to be to treat the state trees the same way
we do for "simple expression" trees in plpgsql, ie create them in the
simple_eval_estate's ecxt_per_query_memory, which means recreating them
once per transaction.

Since I had to introduce bookkeeping overhead for that anyway, I bought
back some of the added cost by sharing the read-only expression trees
across all functions in the session, instead of using a per-function
table as originally.  The simple-expression bookkeeping takes care of
the recursive-usage risk that I was concerned about avoiding before.

At some point we should take a harder look at how all this works,
and see if we can't reduce the amount of tree reinitialization needed.
But that won't happen for 9.5.
2015-07-17 15:53:09 -04:00
Tom Lane
6c82d8d1fd Further reduce overhead for passing plpgsql variables to the executor.
This builds on commit 21dcda2713 by keeping
a plpgsql function's shared ParamListInfo's entries for simple variables
(PLPGSQL_DTYPE_VARs) valid at all times.  That adds a few cycles to each
assignment to such variables, but saves significantly more cycles each time
they are used; so except in the pathological case of many dead stores, this
should always be a win.  Initial testing says it's good for about a 10%
speedup of simple calculations; more in large functions with many datums.

We can't use this method for row/record references unfortunately, so what
we do for those is reset those ParamListInfo slots after use; which we
can skip doing unless some of them were actually evaluated during the
previous evaluation call.  So this should frequently be a win as well,
while worst case is that it's similar cost to the previous approach.

Also, closer study suggests that the previous method of instantiating a
new ParamListInfo array per evaluation is actually probably optimal for
cursor-opening executor calls.  The reason is that whatever is visible in
the array is going to get copied into the cursor portal via copyParamList.
So if we used the function's main ParamListInfo for those calls, we'd end
up with all of its DTYPE_VAR vars getting copied, which might well include
large pass-by-reference values that the cursor actually has no need for.
To avoid a possible net degradation in cursor cases, go back to creating
and filling a private ParamListInfo in those cases (which therefore will be
exactly the same speed as before 21dcda2713).  We still get some benefit
out of this though, because this approach means that we only have to defend
against copyParamList's try-to-fetch-every-slot behavior in the case of an
unshared ParamListInfo; so plpgsql_param_fetch() can skip testing
expr->paramnos in the common case.

To ensure that the main ParamListInfo's image of a DTYPE_VAR datum is
always valid, all assignments to such variables are now funneled through
assign_simple_var().  But this makes for cleaner and shorter code anyway.
2015-07-05 12:57:17 -04:00
Peter Eisentraut
c5e5d444de Translation updates
Source-Git-URL: git://git.postgresql.org/git/pgtranslation/messages.git
Source-Git-Hash: fb7e72f46cfafa1b5bfe4564d9686d63a1e6383f
2015-06-28 23:56:55 -04:00
Tom Lane
ae58f1430a Fix failure to cover scalar-vs-rowtype cases in exec_stmt_return().
In commit 9e3ad1aac5 I modified plpgsql
to use exec_stmt_return's simple-variables fast path in more cases.
However, I overlooked that there are really two different return
conventions in use here, depending on whether estate->retistuple is true,
and the existing fast-path code had only bothered to handle one of them.
So trying to return a scalar in a function returning composite, or vice
versa, could lead to unexpected error messages (typically "cache lookup
failed for type 0") or to a null-pointer-dereference crash.

In the DTYPE_VAR case, we can just throw error if retistuple is true,
corresponding to what happens in the general-expression code path that was
being used previously.  (Perhaps someday both of these code paths should
attempt a coercion, but today is not that day.)

In the REC and ROW cases, just hand the problem to exec_eval_datum()
when not retistuple.  Also clean up the ROW coding slightly so it looks
more like exec_eval_datum().

The previous commit also caused exec_stmt_return_next() to be used in
more cases, but that code seems to be OK as-is.

Per off-list report from Serge Rielau.  This bug is new in 9.5 so no need
to back-patch.
2015-06-12 13:44:06 -04:00
Tom Lane
1dc5ebc907 Support "expanded" objects, particularly arrays, for better performance.
This patch introduces the ability for complex datatypes to have an
in-memory representation that is different from their on-disk format.
On-disk formats are typically optimized for minimal size, and in any case
they can't contain pointers, so they are often not well-suited for
computation.  Now a datatype can invent an "expanded" in-memory format
that is better suited for its operations, and then pass that around among
the C functions that operate on the datatype.  There are also provisions
(rudimentary as yet) to allow an expanded object to be modified in-place
under suitable conditions, so that operations like assignment to an element
of an array need not involve copying the entire array.

The initial application for this feature is arrays, but it is not hard
to foresee using it for other container types like JSON, XML and hstore.
I have hopes that it will be useful to PostGIS as well.

In this initial implementation, a few heuristics have been hard-wired
into plpgsql to improve performance for arrays that are stored in
plpgsql variables.  We would like to generalize those hacks so that
other datatypes can obtain similar improvements, but figuring out some
appropriate APIs is left as a task for future work.  (The heuristics
themselves are probably not optimal yet, either, as they sometimes
force expansion of arrays that would be better left alone.)

Preliminary performance testing shows impressive speed gains for plpgsql
functions that do element-by-element access or update of large arrays.
There are other cases that get a little slower, as a result of added array
format conversions; but we can hope to improve anything that's annoyingly
bad.  In any case most applications should see a net win.

Tom Lane, reviewed by Andres Freund
2015-05-14 12:08:49 -04:00
Tom Lane
785941cdc3 Tweak __attribute__-wrapping macros for better pgindent results.
This improves on commit bbfd7edae5 by
making two simple changes:

* pg_attribute_noreturn now takes parentheses, ie pg_attribute_noreturn().
Likewise pg_attribute_unused(), pg_attribute_packed().  This reduces
pgindent's tendency to misformat declarations involving them.

* attributes are now always attached to function declarations, not
definitions.  Previously some places were taking creative shortcuts,
which were not merely candidates for bad misformatting by pgindent
but often were outright wrong anyway.  (It does little good to put a
noreturn annotation where callers can't see it.)  In any case, if
we would like to believe that these macros can be used with non-gcc
compilers, we should avoid gratuitous variance in usage patterns.

I also went through and manually improved the formatting of a lot of
declarations, and got rid of excessively repetitive (and now obsolete
anyway) comments informing the reader what pg_attribute_printf is for.
2015-03-26 14:03:25 -04:00
Tom Lane
a4847fc3ef Add an ASSERT statement in plpgsql.
This is meant to make it easier to insert simple debugging cross-checks
in plpgsql functions.

Pavel Stehule, reviewed by Jim Nasby
2015-03-25 19:05:32 -04:00
Alvaro Herrera
13dbc7a824 array_offset() and array_offsets()
These functions return the offset position or positions of a value in an
array.

Author: Pavel Stěhule
Reviewed by: Jim Nasby
2015-03-18 16:01:34 -03:00
Tom Lane
5ff683962e Remove obsolete comment.
Obsoleted by commit 21dcda2713, but I missed
seeing the cross-reference in the comments for exec_eval_integer().

Also improve the cross-reference in the comments for exec_eval_cleanup().
2015-03-14 17:07:11 -04:00
Tom Lane
c6b3c939b7 Make operator precedence follow the SQL standard more closely.
While the SQL standard is pretty vague on the overall topic of operator
precedence (because it never presents a unified BNF for all expressions),
it does seem reasonable to conclude from the spec for <boolean value
expression> that OR has the lowest precedence, then AND, then NOT, then IS
tests, then the six standard comparison operators, then everything else
(since any non-boolean operator in a WHERE clause would need to be an
argument of one of these).

We were only sort of on board with that: most notably, while "<" ">" and
"=" had properly low precedence, "<=" ">=" and "<>" were treated as generic
operators and so had significantly higher precedence.  And "IS" tests were
even higher precedence than those, which is very clearly wrong per spec.

Another problem was that "foo NOT SOMETHING bar" constructs, such as
"x NOT LIKE y", were treated inconsistently because of a bison
implementation artifact: they had the documented precedence with respect
to operators to their right, but behaved like NOT (i.e., very low priority)
with respect to operators to their left.

Fixing the precedence issues is just a small matter of rearranging the
precedence declarations in gram.y, except for the NOT problem, which
requires adding an additional lookahead case in base_yylex() so that we
can attach a different token precedence to NOT LIKE and allied two-word
operators.

The bulk of this patch is not the bug fix per se, but adding logic to
parse_expr.c to allow giving warnings if an expression has changed meaning
because of these precedence changes.  These warnings are off by default
and are enabled by the new GUC operator_precedence_warning.  It's believed
that very few applications will be affected by these changes, but it was
agreed that a warning mechanism is essential to help debug any that are.
2015-03-11 13:22:52 -04:00