postgresql

mirror of https://github.com/postgres/postgres.git synced 2026-02-23 18:04:41 -05:00

Author	SHA1	Message	Date
Andres Freund	b8d7f053c5	Faster expression evaluation and targetlist projection. This replaces the old, recursive tree-walk based evaluation, with non-recursive, opcode dispatch based, expression evaluation. Projection is now implemented as part of expression evaluation. This both leads to significant performance improvements, and makes future just-in-time compilation of expressions easier. The speed gains primarily come from: - non-recursive implementation reduces stack usage / overhead - simple sub-expressions are implemented with a single jump, without function calls - sharing some state between different sub-expressions - reduced amount of indirect/hard to predict memory accesses by laying out operation metadata sequentially; including the avoidance of nearly all of the previously used linked lists - more code has been moved to expression initialization, avoiding constant re-checks at evaluation time Future just-in-time compilation (JIT) has become easier, as demonstrated by released patches intended to be merged in a later release, for primarily two reasons: Firstly, due to a stricter split between expression initialization and evaluation, less code has to be handled by the JIT. Secondly, due to the non-recursive nature of the generated "instructions", less performance-critical code-paths can easily be shared between interpreted and compiled evaluation. The new framework allows for significant future optimizations. E.g.: - basic infrastructure for to later reduce the per executor-startup overhead of expression evaluation, by caching state in prepared statements. That'd be helpful in OLTPish scenarios where initialization overhead is measurable. - optimizing the generated "code". A number of proposals for potential work has already been made. - optimizing the interpreter. Similarly a number of proposals have been made here too. The move of logic into the expression initialization step leads to some backward-incompatible changes: - Function permission checks are now done during expression initialization, whereas previously they were done during execution. In edge cases this can lead to errors being raised that previously wouldn't have been, e.g. a NULL array being coerced to a different array type previously didn't perform checks. - The set of domain constraints to be checked, is now evaluated once during expression initialization, previously it was re-built every time a domain check was evaluated. For normal queries this doesn't change much, but e.g. for plpgsql functions, which caches ExprStates, the old set could stick around longer. The behavior around might still change. Author: Andres Freund, with significant changes by Tom Lane, changes by Heikki Linnakangas Reviewed-By: Tom Lane, Heikki Linnakangas Discussion: https://postgr.es/m/20161206034955.bh33paeralxbtluv@alap3.anarazel.de	2017-03-25 14:52:06 -07:00
Robert Haas	61c2e1a95f	Improve access to parallel query from procedural languages. In SQL, the ability to use parallel query was previous contingent on fcache->readonly_func, which is only set for non-volatile functions; but the volatility of a function has no bearing on whether queries inside it can use parallelism. Remove that condition. SPI_execute and SPI_execute_with_args always run the plan just once, though not necessarily to completion. Given the changes in commit `691b8d5928`, it's sensible to pass CURSOR_OPT_PARALLEL_OK here, so do that. This improves access to parallelism for any caller that uses these functions to execute queries. Such callers include plperl, plpython, pltcl, and plpgsql, though it's not the case that they all use these functions exclusively. In plpgsql, allow parallel query for plain SELECT queries (as opposed to PERFORM, which already worked) and for plain expressions (which probably won't go through the executor at all, because they will likely be simple expressions, but if they do then this helps). Rafia Sabih and Robert Haas, reviewed by Dilip Kumar and Amit Kapila Discussion: http://postgr.es/m/CAOGQiiMfJ+4SQwgG=6CVHWoisiU0+7jtXSuiyXBM3y=A=eJzmg@mail.gmail.com	2017-03-24 14:46:33 -04:00
Robert Haas	f120b614e0	plpgsql: Don't generate parallel plans for RETURN QUERY. Commit `7aea8e4f2d` allowed a parallel plan to be generated when for a RETURN QUERY or RETURN QUERY EXECUTE statement in a PL/pgsql block, but that's a bad idea because plplgsql asks the executor for 50 rows at a time. That means that we'll always be running serially a plan that was intended for parallel execution, which is not a good idea. Fix by not requesting a parallel plan from the outset. Per discussion, back-patch to 9.6. There is a slight risk that, due to optimizer error, somebody could have a case where the parallel plan executed serially is actually faster than the supposedly-best serial plan, but the consensus seems to be that that's not sufficient justification for leaving 9.6 unpatched. Discussion: http://postgr.es/m/CA+TgmoZ_ZuH+auEeeWnmtorPsgc_SmP+XWbDsJ+cWvWBSjNwDQ@mail.gmail.com Discussion: http://postgr.es/m/CA+TgmobXEhvHbJtWDuPZM9bVSLiTj-kShxQJ2uM5GPDze9fRYA@mail.gmail.com	2017-03-24 12:30:39 -04:00
Peter Eisentraut	f97a028d8e	Spelling fixes in code comments From: Josh Soref <jsoref@gmail.com>	2017-03-14 12:58:39 -04:00
Tom Lane	08da52859a	Bring plpgsql into line with header inclusion policy. We have a project policy that every .c file should start by including postgres.h, postgres_fe.h, or c.h as appropriate; and then there is no need for any .h file to explicitly include any of these. (The core reason for this policy is to make it easy to verify that pg_config_os.h is included before any system headers such as <stdio.h>; without that, we have portability issues on some platforms due to variation in largefile options across different modules in the backend. Also, if .h files were responsible for choosing which of these key headers to include, .h files that need to be includable in either frontend or backend compiles would be in trouble.) plpgsql was blithely ignoring this policy, so whack it upside the head until it complies. I also chose to standardize on including plpgsql's own .h files after all core-system headers that it pulls in. That could've been done either way, but this way seems saner. Discussion: https://postgr.es/m/CAEepm=2zCoeq3QxVwhS5DFeUh=yU6z81pbWMgfOB8OzyiBwxzw@mail.gmail.com Discussion: https://postgr.es/m/11634.1488932128@sss.pgh.pa.us	2017-03-08 17:21:08 -05:00
Peter Eisentraut	38d103763d	Make more use of castNode()	2017-02-21 11:59:09 -05:00
Tom Lane	7afd56c3c6	Use castNode() in a bunch of statement-list-related code. When I wrote commit `ab1f0c822`, I really missed the castNode() macro that Peter E. had proposed shortly before. This back-fills the uses I would have put it to. It's probably not all that significant, but there are more assertions here than there were before, and conceivably they will help catch any bugs associated with those representation changes. I left behind a number of usages like "(Query *) copyObject(query_var)". Those could have been converted as well, but Peter has proposed another notational improvement that would handle copyObject cases automatically, so I let that be for now.	2017-01-26 22:09:34 -05:00
Peter Eisentraut	f21a563d25	Move some things from builtins.h to new header files This avoids that builtins.h has to include additional header files.	2017-01-20 20:29:53 -05:00
Andres Freund	ea15e18677	Remove obsoleted code relating to targetlist SRF evaluation. Since `69f4b9c` plain expression evaluation (and thus normal projection) can't return sets of tuples anymore. Thus remove code dealing with that possibility. This will require adjustments in external code using ExecEvalExpr()/ExecProject() - that should neither be hard nor very common. Author: Andres Freund and Tom Lane Discussion: https://postgr.es/m/20160822214023.aaxz5l4igypowyri@alap3.anarazel.de	2017-01-19 14:40:41 -08:00
Tom Lane	ab1f0c8225	Change representation of statement lists, and add statement location info. This patch makes several changes that improve the consistency of representation of lists of statements. It's always been the case that the output of parse analysis is a list of Query nodes, whatever the types of the individual statements in the list. This patch brings similar consistency to the outputs of raw parsing and planning steps: * The output of raw parsing is now always a list of RawStmt nodes; the statement-type-dependent nodes are one level down from that. * The output of pg_plan_queries() is now always a list of PlannedStmt nodes, even for utility statements. In the case of a utility statement, "planning" just consists of wrapping a CMD_UTILITY PlannedStmt around the utility node. This list representation is now used in Portal and CachedPlan plan lists, replacing the former convention of intermixing PlannedStmts with bare utility-statement nodes. Now, every list of statements has a consistent head-node type depending on how far along it is in processing. This allows changing many places that formerly used generic "Node *" pointers to use a more specific pointer type, thus reducing the number of IsA() tests and casts needed, as well as improving code clarity. Also, the post-parse-analysis representation of DECLARE CURSOR is changed so that it looks more like EXPLAIN, PREPARE, etc. That is, the contained SELECT remains a child of the DeclareCursorStmt rather than getting flipped around to be the other way. It's now true for both Query and PlannedStmt that utilityStmt is non-null if and only if commandType is CMD_UTILITY. That allows simplifying a lot of places that were testing both fields. (I think some of those were just defensive programming, but in many places, it was actually necessary to avoid confusing DECLARE CURSOR with SELECT.) Because PlannedStmt carries a canSetTag field, we're also able to get rid of some ad-hoc rules about how to reconstruct canSetTag for a bare utility statement; specifically, the assumption that a utility is canSetTag if and only if it's the only one in its list. While I see no near-term need for relaxing that restriction, it's nice to get rid of the ad-hocery. The API of ProcessUtility() is changed so that what it's passed is the wrapper PlannedStmt not just the bare utility statement. This will affect all users of ProcessUtility_hook, but the changes are pretty trivial; see the affected contrib modules for examples of the minimum change needed. (Most compilers should give pointer-type-mismatch warnings for uncorrected code.) There's also a change in the API of ExplainOneQuery_hook, to pass through cursorOptions instead of expecting hook functions to know what to pick. This is needed because of the DECLARE CURSOR changes, but really should have been done in 9.6; it's unlikely that any extant hook functions know about using CURSOR_OPT_PARALLEL_OK. Finally, teach gram.y to save statement boundary locations in RawStmt nodes, and pass those through to Query and PlannedStmt nodes. This allows more intelligent handling of cases where a source query string contains multiple statements. This patch doesn't actually do anything with the information, but a follow-on patch will. (Passing this information through cleanly is the true motivation for these changes; while I think this is all good cleanup, it's unlikely we'd have bothered without this end goal.) catversion bump because addition of location fields to struct Query affects stored rules. This patch is by me, but it owes a good deal to Fabien Coelho who did a lot of preliminary work on the problem, and also reviewed the patch. Discussion: https://postgr.es/m/alpine.DEB.2.20.1612200926310.29821@lancre	2017-01-14 16:02:35 -05:00
Bruce Momjian	1d25779284	Update copyright via script for 2017	2017-01-03 13:48:53 -05:00
Tom Lane	55caaaeba8	Improve handling of array elements as getdiag_targets and cursor_variables. There's no good reason why plpgsql's GET DIAGNOSTICS statement can't support an array element as target variable, since the execution code already uses the generic exec_assign_value() function to assign to it. Hence, refactor the grammar to allow that, by making getdiag_target depend on the assign_var production. Ideally we'd also let a cursor_variable expand to an element of a refcursor[] array, but that's substantially harder since those statements also have to handle bound-cursor-variable cases. For now, just make sure the reported error is sensible, ie "cursor variable must be a simple variable" not "variable must be of type cursor or refcursor". The latter was quite confusing from the user's viewpoint, since what he wrote satisfies the claimed restriction. Per bug #14463 from Zhou Digoal. Given the lack of previous complaints, I see no need for a back-patch. Discussion: https://postgr.es/m/20161213152548.14897.81245@wrigleys.postgresql.org	2016-12-13 16:33:03 -05:00
Tom Lane	1833f1a1c3	Simplify code by getting rid of SPI_push, SPI_pop, SPI_restore_connection. The idea behind SPI_push was to allow transitioning back into an "unconnected" state when a SPI-using procedure calls unrelated code that might or might not invoke SPI. That sounds good, but in practice the only thing it does for us is to catch cases where a called SPI-using function forgets to call SPI_connect --- which is a highly improbable failure mode, since it would be exposed immediately by direct testing of said function. As against that, we've had multiple bugs induced by forgetting to call SPI_push/SPI_pop around code that might invoke SPI-using functions; these are much harder to catch and indeed have gone undetected for years in some cases. And we've had to band-aid around some problems of this ilk by introducing conditional push/pop pairs in some places, which really kind of defeats the purpose altogether; if we can't draw bright lines between connected and unconnected code, what's the point? Hence, get rid of SPI_push[_conditional], SPI_pop[_conditional], and the underlying state variable _SPI_curid. It turns out SPI_restore_connection can go away too, which is a nice side benefit since it was never more than a kluge. Provide no-op macros for the deleted functions so as to avoid an API break for external modules. A side effect of this removal is that SPI_palloc and allied functions no longer permit being called when unconnected; they'll throw an error instead. The apparent usefulness of the previous behavior was a mirage as well, because it was depended on by only a few places (which I fixed in preceding commits), and it posed a risk of allocations being unexpectedly long-lived if someone forgot a SPI_push call. Discussion: <20808.1478481403@sss.pgh.pa.us>	2016-11-08 17:39:57 -05:00
Tom Lane	9257f07872	Replace uses of SPI_modifytuple that intend to allocate in current context. Invent a new function heap_modify_tuple_by_cols() that is functionally equivalent to SPI_modifytuple except that it always allocates its result by simple palloc. I chose however to make the API details a bit more like heap_modify_tuple: pass a tupdesc rather than a Relation, and use bool convention for the isnull array. Use this function in place of SPI_modifytuple at all call sites where the intended behavior is to allocate in current context. (There actually are only two call sites left that depend on the old behavior, which makes me wonder if we should just drop this function rather than keep it.) This new function is easier to use than heap_modify_tuple() for purposes of replacing a single column (or, really, any fixed number of columns). There are a number of places where it would simplify the code to change over, but I resisted that temptation for the moment ... everywhere except in plpgsql's exec_assign_value(); changing that might offer some small performance benefit, so I did it. This is on the way to removing SPI_push/SPI_pop, but it seems like good code cleanup in its own right. Discussion: <9633.1478552022@sss.pgh.pa.us>	2016-11-08 15:36:44 -05:00
Tom Lane	fc8b81a291	Need to do SPI_push/SPI_pop around expression evaluation in plpgsql. We must do this in case the expression evaluation results in calling another plpgsql function (or, really, anything using SPI). I missed the need for this when I converted exec_cast_value() from doing a simple InputFunctionCall() to doing ExecEvalExpr() in commit `1345cc67b`. There is a SPI_push_conditional in InputFunctionCall(), so that there was no bug before that. Per bug #14414 from Marcos Castedo. Add a regression test based on his example, which was that a plpgsql function in a domain check constraint didn't work when assigning to a domain-type variable within plpgsql. Report: <20161106010947.1387.66380@wrigleys.postgresql.org>	2016-11-06 12:09:36 -05:00
Tom Lane	a4c35ea1c2	Improve parser's and planner's handling of set-returning functions. Teach the parser to reject misplaced set-returning functions during parse analysis using p_expr_kind, in much the same way as we do for aggregates and window functions (cf commit `eaccfded9`). While this isn't complete (it misses nesting-based restrictions), it's much better than the previous error reporting for such cases, and it allows elimination of assorted ad-hoc expression_returns_set() error checks. We could add nesting checks later if it seems important to catch all cases at parse time. There is one case the parser will now throw error for although previous versions allowed it, which is SRFs in the tlist of an UPDATE. That never behaved sensibly (since it's ill-defined which generated row should be used to perform the update) and it's hard to see why it should not be treated as an error. It's a release-note-worthy change though. Also, add a new Query field hasTargetSRFs reporting whether there are any SRFs in the targetlist (including GROUP BY/ORDER BY expressions). The parser can now set that basically for free during parse analysis, and we can use it in a number of places to avoid expression_returns_set searches. (There will be more such checks soon.) In some places, this allows decontorting the logic since it's no longer expensive to check for SRFs in the tlist --- so I made the checks parallel to the handling of hasAggs/hasWindowFuncs wherever it seemed appropriate. catversion bump because adding a Query field changes stored rules. Andres Freund and Tom Lane Discussion: <24639.1473782855@sss.pgh.pa.us>	2016-09-13 13:54:24 -04:00
Peter Eisentraut	e0013deb59	Make better use of existing enums in plpgsql plpgsql.h defines a number of enums, but most of the code passes them around as ints. Update structs and function prototypes to take enum types instead. This clarifies the struct definitions in plpgsql.h in particular. Reviewed-by: Pavel Stehule <pavel.stehule@gmail.com>	2016-09-09 12:00:00 -04:00
Tom Lane	ea268cdc9a	Add macros to make AllocSetContextCreate() calls simpler and safer. I found that half a dozen (nearly 5%) of our AllocSetContextCreate calls had typos in the context-sizing parameters. While none of these led to especially significant problems, they did create minor inefficiencies, and it's now clear that expecting people to copy-and-paste those calls accurately is not a great idea. Let's reduce the risk of future errors by introducing single macros that encapsulate the common use-cases. Three such macros are enough to cover all but two special-purpose contexts; those two calls can be left as-is, I think. While this patch doesn't in itself improve matters for third-party extensions, it doesn't break anything for them either, and they can gradually adopt the simplified notation over time. In passing, change TopMemoryContext to use the default allocation parameters. Formerly it could only be extended 8K at a time. That was probably reasonable when this code was written; but nowadays we create many more contexts than we did then, so that it's not unusual to have a couple hundred K in TopMemoryContext, even without considering various dubious code that sticks other things there. There seems no good reason not to let it use growing blocks like most other contexts. Back-patch to 9.6, mostly because that's still close enough to HEAD that it's easy to do so, and keeping the branches in sync can be expected to avoid some future back-patching pain. The bugs fixed by these changes don't seem to be significant enough to justify fixing them further back. Discussion: <21072.1472321324@sss.pgh.pa.us>	2016-08-27 17:50:38 -04:00
Tom Lane	5697522d84	In plpgsql, don't try to convert int2vector or oidvector to expanded array. These types are storage-compatible with real arrays, but they don't support toasting, so of course they can't support expansion either. Per bug #14289 from Michael Overmeyer. Back-patch to 9.5 where expanded arrays were introduced. Report: <20160818174414.1529.37913@wrigleys.postgresql.org>	2016-08-18 14:49:08 -04:00
Peter Eisentraut	9f31e45a6d	Improve formatting of comments in plpgsql.h This file had some unusual comment layout. Most of the comments introducing structs ended up to the right of the screen and following the start of the struct. Some comments for struct members ended up after the member definition. Fix that by moving comments consistently before what they are describing. Also add missing struct tags where missing so that it is easier to tell what the struct is.	2016-08-18 12:00:00 -04:00
Tom Lane	bfaaacc805	Improve plpgsql's memory management to fix some function-lifespan leaks. In some cases, exiting out of a plpgsql statement due to an error, then catching the error in a surrounding exception block, led to leakage of temporary data the statement was working with, because we kept all such data in the function-lifespan SPI Proc context. Iterating such behavior many times within one function call thus led to noticeable memory bloat. To fix, create an additional memory context meant to have statement lifespan. Since many plpgsql statements, particularly the simpler/more common ones, don't need this, create it only on demand. Reset this context at the end of any statement that uses it, and arrange for exception cleanup to reset it too, thereby fixing the memory-leak issue. Allow a stack of such contexts to exist to handle cases where a compound statement needs statement-lifespan data that persists across calls of inner statements. While at it, clean up code and improve comments referring to the existing short-term memory context, which by plpgsql convention is the per-tuple context of the eval_econtext ExprContext. We now uniformly refer to that as the eval_mcontext, whereas the new statement-lifespan memory contexts are called stmt_mcontext. This change adds some context-creation overhead, but on the other hand it allows removal of some retail pfree's in favor of context resets. On balance it seems to be about a wash performance-wise. In principle this is a bug fix, but it seems too invasive for a back-patch, and the infrequency of complaints weighs against taking the risk in the back branches. So we'll fix it only in HEAD, at least for now. Tom Lane, reviewed by Pavel Stehule Discussion: <17863.1469142152@sss.pgh.pa.us>	2016-08-17 14:51:10 -04:00
Tom Lane	0bb51aa967	Improve parsetree representation of special functions such as CURRENT_DATE. We implement a dozen or so parameterless functions that the SQL standard defines special syntax for. Up to now, that was done by converting them into more or less ad-hoc constructs such as "'now'::text::date". That's messy for multiple reasons: it exposes what should be implementation details to users, and performance is worse than it needs to be in several cases. To improve matters, invent a new expression node type SQLValueFunction that can represent any of these parameterless functions. Bump catversion because this changes stored parsetrees for rules. Discussion: <30058.1463091294@sss.pgh.pa.us>	2016-08-16 20:33:01 -04:00
Peter Eisentraut	34927b2920	Translation updates Source-Git-URL: git://git.postgresql.org/git/pgtranslation/messages.git Source-Git-Hash: cda21c1d7b160b303dc21dfe9d4169f2c8064c60	2016-08-08 11:08:00 -04:00
Tom Lane	baebab3ace	Allow IMPORT FOREIGN SCHEMA within pl/pgsql. Since IMPORT FOREIGN SCHEMA has an INTO clause, pl/pgsql needs to be aware of that and avoid capturing the INTO as an INTO-variables clause. This isn't hard, though it's annoying to have to make IMPORT a plpgsql keyword just for this. (Fortunately, we have the infrastructure now to make it an unreserved keyword, so at least this shouldn't break any existing pl/pgsql code.) Per report from Merlin Moncure. Back-patch to 9.5 where IMPORT FOREIGN SCHEMA was introduced. Report: <CAHyXU0wpHf2bbtKGL1gtUEFATCY86r=VKxfcACVcTMQ70mCyig@mail.gmail.com>	2016-07-12 18:07:03 -04:00
Tom Lane	1fe1204e87	Add missing check for malloc failure in plpgsql_extra_checks_check_hook(). Per report from Andreas Seltenreich. Back-patch to affected versions. Report: <874m8nn0hv.fsf@elite.ansel.ydns.eu>	2016-06-20 15:36:54 -04:00
Peter Eisentraut	47981a4665	Translation updates Source-Git-URL: git://git.postgresql.org/git/pgtranslation/messages.git Source-Git-Hash: 0c374f8d25ed31833a10d24252bc928d41438838	2016-06-20 09:48:08 -04:00
Robert Haas	4bc424b968	pgindent run for 9.6	2016-06-09 18:02:36 -04:00
Peter Eisentraut	48aaba4acf	Translation updates Source-Git-URL: git://git.postgresql.org/git/pgtranslation/messages.git Source-Git-Hash: 17bf3e8564abf600274789fcc90e72532d5e7c05	2016-05-09 10:04:41 -04:00
Tom Lane	23a27b039d	Widen query numbers-of-tuples-processed counters to uint64. This patch widens SPI_processed, EState's es_processed field, PortalData's portalPos field, FuncCallContext's call_cntr and max_calls fields, ExecutorRun's count argument, PortalRunFetch's result, and the max number of rows in a SPITupleTable to uint64, and deals with (I hope) all the ensuing fallout. Some of these values were declared uint32 before, and others "long". I also removed PortalData's posOverflow field, since that logic seems pretty useless given that portalPos is now always 64 bits. The user-visible results are that command tags for SELECT etc will correctly report tuple counts larger than 4G, as will plpgsql's GET GET DIAGNOSTICS ... ROW_COUNT command. Queries processing more tuples than that are still not exactly the norm, but they're becoming more common. Most values associated with FETCH/MOVE distances, such as PortalRun's count argument and the count argument of most SPI functions that have one, remain declared as "long". It's not clear whether it would be worth promoting those to int64; but it would definitely be a large dollop of additional API churn on top of this, and it would only help 32-bit platforms which seem relatively less likely to see any benefit. Andreas Scherbaum, reviewed by Christian Ullrich, additional hacking by me	2016-03-12 16:05:29 -05:00
Magnus Hagander	6c90996a4c	Add prefix to pl/pgsql global variables and functions Rename pl/pgsql global variables to always have a plpgsql_ prefix, so they don't conflict with other shared libraries loaded.	2016-03-03 10:45:59 +01:00
Bruce Momjian	ee94300446	Update copyright for 2016 Backpatch certain files through 9.1	2016-01-02 13:33:40 -05:00
Robert Haas	1efc7e5382	Fix problems with ParamListInfo serialization mechanism. Commit `d1b7c1ffe7` introduced a mechanism for serializing a ParamListInfo structure to be passed to a parallel worker. However, this mechanism failed to handle external expanded values, as pointed out by Noah Misch. Repair. Moreover, plpgsql_param_fetch requires adjustment because the serialization mechanism needs it to skip evaluating unused parameters just as we would do when it is called from copyParamList, but params == estate->paramLI in that case. To fix, make the bms_is_member test in that function unconditional. Finally, have setup_param_list set a new ParamListInfo field, paramMask, to the parameters actually used in the expression, so that we don't try to fetch those that are not needed when serializing a parameter list. This isn't necessary for correctness, but it makes the performance of the parallel executor code comparable to what we do for cases involving cursors. Design suggestions and extensive review by Noah Misch. Patch by me.	2015-11-02 18:11:29 -05:00
Robert Haas	7aea8e4f2d	Determine whether it's safe to attempt a parallel plan for a query. Commit `924bcf4f16` introduced a framework for parallel computation in PostgreSQL that makes most but not all built-in functions safe to execute in parallel mode. In order to have parallel query, we'll need to be able to determine whether that query contains functions (either built-in or user-defined) that cannot be safely executed in parallel mode. This requires those functions to be labeled, so this patch introduces an infrastructure for that. Some functions currently labeled as safe may need to be revised depending on how pending issues related to heavyweight locking under paralllelism are resolved. Parallel plans can't be used except for the case where the query will run to completion. If portal execution were suspended, the parallel mode restrictions would need to remain in effect during that time, but that might make other queries fail. Therefore, this patch introduces a framework that enables consideration of parallel plans only when it is known that the plan will be run to completion. This probably needs some refinement; for example, at bind time, we do not know whether a query run via the extended protocol will be execution to completion or run with a limited fetch count. Having the client indicate its intentions at bind time would constitute a wire protocol break. Some contexts in which parallel mode would be safe are not adjusted by this patch; the default is not to try parallel plans except from call sites that have been updated to say that such plans are OK. This commit doesn't introduce any parallel paths or plans; it just provides a way to determine whether they could potentially be used. I'm committing it on the theory that the remaining parallel sequential scan patches will also get committed to this release, hopefully in the not-too-distant future. Robert Haas and Amit Kapila. Reviewed (in earlier versions) by Noah Misch.	2015-09-16 15:38:47 -04:00
Tom Lane	0426f349ef	Rearrange the handling of error context reports. Remove the code in plpgsql that suppressed the innermost line of CONTEXT for messages emitted by RAISE commands. That was never more than a quick backwards-compatibility hack, and it's pretty silly in cases where the RAISE is nested in several levels of function. What's more, it violated our design theory that verbosity of error reports should be controlled on the client side not the server side. To alleviate the resulting noise increase, introduce a feature in libpq and psql whereby the CONTEXT field of messages can be suppressed, either always or only for non-error messages. Printing CONTEXT for errors only is now their default behavior. The actual code changes here are pretty small, but the effects on the regression test outputs are widespread. I had to edit some of the alternative expected outputs by hand; hopefully the buildfarm will soon find anything I fat-fingered. In passing, fix up (again) the output line counts in psql's various help displays. Add some commentary about how to verify them. Pavel Stehule, reviewed by Petr Jelínek, Jeevan Chalke, and others	2015-09-05 11:58:33 -04:00
Tom Lane	781ed2bfa3	Further tweak wording of error messages about bad CONTINUE/EXIT statements. Per discussion, a little more verbosity seems called for.	2015-08-25 14:06:13 -04:00
Tom Lane	18391a8f06	Tweak wording of syntax error messages about bad CONTINUE/EXIT statements. Try to avoid any possible confusion about what these messages mean.	2015-08-23 17:34:47 -04:00
Tom Lane	fcdfce6820	Detect mismatched CONTINUE and EXIT statements at plpgsql compile time. With a bit of tweaking of the compile namestack data structure, we can verify at compile time whether a CONTINUE or EXIT is legal. This is surely better than leaving it to runtime, both because earlier is better and because we can issue a proper error pointer. Also, we can get rid of the ad-hoc old way of detecting the problem, which only took care of CONTINUE not EXIT. Jim Nasby, adjusted a bit by me	2015-08-21 20:17:19 -04:00
Tom Lane	2edb949115	Fix a few bogus statement type names in plpgsql error messages. plpgsql's error location context messages ("PL/pgSQL function fn-name line line-no at stmt-type") would misreport a CONTINUE statement as being an EXIT, and misreport a MOVE statement as being a FETCH. These are clear bugs that have been there a long time, so back-patch to all supported branches. In addition, in 9.5 and HEAD, change the description of EXECUTE from "EXECUTE statement" to just plain EXECUTE; there seems no good reason why this statement type should be described differently from others that have a well-defined head keyword. And distinguish GET STACKED DIAGNOSTICS from plain GET DIAGNOSTICS. These are a bit more of a judgment call, and also affect existing regression-test outputs, so I did not back-patch into stable branches. Pavel Stehule and Tom Lane	2015-08-18 19:22:37 -04:00
Tom Lane	d3eaab3ef0	Fix performance bug from conflict between two previous improvements. My expanded-objects patch (commit `1dc5ebc907`) included code to make plpgsql pass expanded-object variables as R/W pointers to certain functions that are trusted for modifying such variables in-place. However, that optimization got broken by commit `6c82d8d1fd`, which arranged to share a single ParamListInfo across most expressions evaluated by a plpgsql function. We don't want a R/W pointer to be passed to other functions just because we decided one function was safe! Fortunately, the breakage was in the other direction, of never passing a R/W pointer at all, because we'd always have pre-initialized the shared array slot with a R/O pointer. So it was still functionally correct, but we were back to O(N^2) performance for repeated use of "a := a \|\| x". To fix, force an unshared param array to be used when the R/W param optimization is active. Commit `6c82d8d1fd` is in HEAD only, so no need for a back-patch.	2015-08-17 19:39:35 -04:00
Tom Lane	83604cc423	Repair unsafe use of shared typecast-lookup table in plpgsql DO blocks. DO blocks use private simple_eval_estates to avoid intra-transaction memory leakage, cf commit `c7b849a89`. I had forgotten about that while writing commit `0fc94a5ba`, but it means that expression execution trees created within a DO block disappear immediately on exiting the DO block, and hence can't safely be linked into plpgsql's session-wide cast hash table. To fix, give a DO block a private cast hash table to go with its private simple_eval_estate. This is less efficient than one could wish, since DO blocks can no longer share any cast lookup work with other plpgsql execution, but it shouldn't be too bad; in any case it's no worse than what happened in DO blocks before commit `0fc94a5ba`. Per bug #13571 from Feike Steenbergen. Preliminary analysis by Oleksandr Shulgin.	2015-08-15 12:00:48 -04:00
Tom Lane	0fc94a5bab	Repair mishandling of cached cast-expression trees in plpgsql. In commit `1345cc67bb`, I introduced caching of expressions representing type-cast operations into plpgsql. However, I supposed that I could cache both the expression trees and the evaluation state trees derived from them for the life of the session. This doesn't work, because we execute the expressions in plpgsql's simple_eval_estate, which has an ecxt_per_query_memory that is only transaction-lifespan. Therefore we can end up putting pointers into the evaluation state tree that point to transaction-lifespan memory; in particular this happens if the cast expression calls a SQL-language function, as reported by Geoff Winkless. The minimum-risk fix seems to be to treat the state trees the same way we do for "simple expression" trees in plpgsql, ie create them in the simple_eval_estate's ecxt_per_query_memory, which means recreating them once per transaction. Since I had to introduce bookkeeping overhead for that anyway, I bought back some of the added cost by sharing the read-only expression trees across all functions in the session, instead of using a per-function table as originally. The simple-expression bookkeeping takes care of the recursive-usage risk that I was concerned about avoiding before. At some point we should take a harder look at how all this works, and see if we can't reduce the amount of tree reinitialization needed. But that won't happen for 9.5.	2015-07-17 15:53:09 -04:00
Tom Lane	6c82d8d1fd	Further reduce overhead for passing plpgsql variables to the executor. This builds on commit `21dcda2713` by keeping a plpgsql function's shared ParamListInfo's entries for simple variables (PLPGSQL_DTYPE_VARs) valid at all times. That adds a few cycles to each assignment to such variables, but saves significantly more cycles each time they are used; so except in the pathological case of many dead stores, this should always be a win. Initial testing says it's good for about a 10% speedup of simple calculations; more in large functions with many datums. We can't use this method for row/record references unfortunately, so what we do for those is reset those ParamListInfo slots after use; which we can skip doing unless some of them were actually evaluated during the previous evaluation call. So this should frequently be a win as well, while worst case is that it's similar cost to the previous approach. Also, closer study suggests that the previous method of instantiating a new ParamListInfo array per evaluation is actually probably optimal for cursor-opening executor calls. The reason is that whatever is visible in the array is going to get copied into the cursor portal via copyParamList. So if we used the function's main ParamListInfo for those calls, we'd end up with all of its DTYPE_VAR vars getting copied, which might well include large pass-by-reference values that the cursor actually has no need for. To avoid a possible net degradation in cursor cases, go back to creating and filling a private ParamListInfo in those cases (which therefore will be exactly the same speed as before `21dcda2713`). We still get some benefit out of this though, because this approach means that we only have to defend against copyParamList's try-to-fetch-every-slot behavior in the case of an unshared ParamListInfo; so plpgsql_param_fetch() can skip testing expr->paramnos in the common case. To ensure that the main ParamListInfo's image of a DTYPE_VAR datum is always valid, all assignments to such variables are now funneled through assign_simple_var(). But this makes for cleaner and shorter code anyway.	2015-07-05 12:57:17 -04:00
Peter Eisentraut	c5e5d444de	Translation updates Source-Git-URL: git://git.postgresql.org/git/pgtranslation/messages.git Source-Git-Hash: fb7e72f46cfafa1b5bfe4564d9686d63a1e6383f	2015-06-28 23:56:55 -04:00
Tom Lane	ae58f1430a	Fix failure to cover scalar-vs-rowtype cases in exec_stmt_return(). In commit `9e3ad1aac5` I modified plpgsql to use exec_stmt_return's simple-variables fast path in more cases. However, I overlooked that there are really two different return conventions in use here, depending on whether estate->retistuple is true, and the existing fast-path code had only bothered to handle one of them. So trying to return a scalar in a function returning composite, or vice versa, could lead to unexpected error messages (typically "cache lookup failed for type 0") or to a null-pointer-dereference crash. In the DTYPE_VAR case, we can just throw error if retistuple is true, corresponding to what happens in the general-expression code path that was being used previously. (Perhaps someday both of these code paths should attempt a coercion, but today is not that day.) In the REC and ROW cases, just hand the problem to exec_eval_datum() when not retistuple. Also clean up the ROW coding slightly so it looks more like exec_eval_datum(). The previous commit also caused exec_stmt_return_next() to be used in more cases, but that code seems to be OK as-is. Per off-list report from Serge Rielau. This bug is new in 9.5 so no need to back-patch.	2015-06-12 13:44:06 -04:00
Tom Lane	1dc5ebc907	Support "expanded" objects, particularly arrays, for better performance. This patch introduces the ability for complex datatypes to have an in-memory representation that is different from their on-disk format. On-disk formats are typically optimized for minimal size, and in any case they can't contain pointers, so they are often not well-suited for computation. Now a datatype can invent an "expanded" in-memory format that is better suited for its operations, and then pass that around among the C functions that operate on the datatype. There are also provisions (rudimentary as yet) to allow an expanded object to be modified in-place under suitable conditions, so that operations like assignment to an element of an array need not involve copying the entire array. The initial application for this feature is arrays, but it is not hard to foresee using it for other container types like JSON, XML and hstore. I have hopes that it will be useful to PostGIS as well. In this initial implementation, a few heuristics have been hard-wired into plpgsql to improve performance for arrays that are stored in plpgsql variables. We would like to generalize those hacks so that other datatypes can obtain similar improvements, but figuring out some appropriate APIs is left as a task for future work. (The heuristics themselves are probably not optimal yet, either, as they sometimes force expansion of arrays that would be better left alone.) Preliminary performance testing shows impressive speed gains for plpgsql functions that do element-by-element access or update of large arrays. There are other cases that get a little slower, as a result of added array format conversions; but we can hope to improve anything that's annoyingly bad. In any case most applications should see a net win. Tom Lane, reviewed by Andres Freund	2015-05-14 12:08:49 -04:00
Tom Lane	785941cdc3	Tweak __attribute__-wrapping macros for better pgindent results. This improves on commit `bbfd7edae5` by making two simple changes: * pg_attribute_noreturn now takes parentheses, ie pg_attribute_noreturn(). Likewise pg_attribute_unused(), pg_attribute_packed(). This reduces pgindent's tendency to misformat declarations involving them. * attributes are now always attached to function declarations, not definitions. Previously some places were taking creative shortcuts, which were not merely candidates for bad misformatting by pgindent but often were outright wrong anyway. (It does little good to put a noreturn annotation where callers can't see it.) In any case, if we would like to believe that these macros can be used with non-gcc compilers, we should avoid gratuitous variance in usage patterns. I also went through and manually improved the formatting of a lot of declarations, and got rid of excessively repetitive (and now obsolete anyway) comments informing the reader what pg_attribute_printf is for.	2015-03-26 14:03:25 -04:00
Tom Lane	a4847fc3ef	Add an ASSERT statement in plpgsql. This is meant to make it easier to insert simple debugging cross-checks in plpgsql functions. Pavel Stehule, reviewed by Jim Nasby	2015-03-25 19:05:32 -04:00
Alvaro Herrera	13dbc7a824	array_offset() and array_offsets() These functions return the offset position or positions of a value in an array. Author: Pavel Stěhule Reviewed by: Jim Nasby	2015-03-18 16:01:34 -03:00
Tom Lane	5ff683962e	Remove obsolete comment. Obsoleted by commit `21dcda2713`, but I missed seeing the cross-reference in the comments for exec_eval_integer(). Also improve the cross-reference in the comments for exec_eval_cleanup().	2015-03-14 17:07:11 -04:00
Tom Lane	c6b3c939b7	Make operator precedence follow the SQL standard more closely. While the SQL standard is pretty vague on the overall topic of operator precedence (because it never presents a unified BNF for all expressions), it does seem reasonable to conclude from the spec for <boolean value expression> that OR has the lowest precedence, then AND, then NOT, then IS tests, then the six standard comparison operators, then everything else (since any non-boolean operator in a WHERE clause would need to be an argument of one of these). We were only sort of on board with that: most notably, while "<" ">" and "=" had properly low precedence, "<=" ">=" and "<>" were treated as generic operators and so had significantly higher precedence. And "IS" tests were even higher precedence than those, which is very clearly wrong per spec. Another problem was that "foo NOT SOMETHING bar" constructs, such as "x NOT LIKE y", were treated inconsistently because of a bison implementation artifact: they had the documented precedence with respect to operators to their right, but behaved like NOT (i.e., very low priority) with respect to operators to their left. Fixing the precedence issues is just a small matter of rearranging the precedence declarations in gram.y, except for the NOT problem, which requires adding an additional lookahead case in base_yylex() so that we can attach a different token precedence to NOT LIKE and allied two-word operators. The bulk of this patch is not the bug fix per se, but adding logic to parse_expr.c to allow giving warnings if an expression has changed meaning because of these precedence changes. These warnings are off by default and are enabled by the new GUC operator_precedence_warning. It's believed that very few applications will be affected by these changes, but it was agreed that a warning mechanism is essential to help debug any that are.	2015-03-11 13:22:52 -04:00

1 2 3 4 5 ...

756 commits