Commit graph

96 commits

Author SHA1 Message Date
Tom Lane
95b07bc7f5 Support window functions a la SQL:2008.
Hitoshi Harada, with some kibitzing from Heikki and Tom.
2008-12-28 18:54:01 +00:00
Tom Lane
76e6602417 Improve the recently-added code for inlining set-returning functions so that
it can handle functions returning setof record.  The case was left undone
originally, but it turns out to be simple to fix.
2008-10-09 19:27:40 +00:00
Tom Lane
e5536e77a5 Move exprType(), exprTypmod(), expression_tree_walker(), and related routines
into nodes/nodeFuncs, so as to reduce wanton cross-subsystem #includes inside
the backend.  There's probably more that should be done along this line,
but this is a start anyway.
2008-08-25 22:42:34 +00:00
Tom Lane
bd3daddaf2 Arrange to convert EXISTS subqueries that are equivalent to hashable IN
subqueries into the same thing you'd have gotten from IN (except always with
unknownEqFalse = true, so as to get the proper semantics for an EXISTS).
I believe this fixes the last case within CVS HEAD in which an EXISTS could
give worse performance than an equivalent IN subquery.

The tricky part of this is that if the upper query probes the EXISTS for only
a few rows, the hashing implementation can actually be worse than the default,
and therefore we need to make a cost-based decision about which way to use.
But at the time when the planner generates plans for subqueries, it doesn't
really know how many times the subquery will be executed.  The least invasive
solution seems to be to generate both plans and postpone the choice until
execution.  Therefore, in a query that has been optimized this way, EXPLAIN
will show two subplans for the EXISTS, of which only one will actually get
executed.

There is a lot more that could be done based on this infrastructure: in
particular it's interesting to consider switching to the hash plan if we start
out using the non-hashed plan but find a lot more upper rows going by than we
expected.  I have therefore left some minor inefficiencies in place, such as
initializing both subplans even though we will currently only use one.
2008-08-22 00:16:04 +00:00
Tom Lane
e006a24ad1 Implement SEMI and ANTI joins in the planner and executor. (Semijoins replace
the old JOIN_IN code, but antijoins are new functionality.)  Teach the planner
to convert appropriate EXISTS and NOT EXISTS subqueries into semi and anti
joins respectively.  Also, LEFT JOINs with suitable upper-level IS NULL
filters are recognized as being anti joins.  Unify the InClauseInfo and
OuterJoinInfo infrastructure into "SpecialJoinInfo".  With that change,
it becomes possible to associate a SpecialJoinInfo with every join attempt,
which permits some cleanup of join selectivity estimation.  That needs to be
taken much further than this patch does, but the next step is to change the
API for oprjoin selectivity functions, which seems like material for a
separate patch.  So for the moment the output size estimates for semi and
especially anti joins are quite bogus.
2008-08-14 18:48:00 +00:00
Tom Lane
9511304752 Rearrange the querytree representation of ORDER BY/GROUP BY/DISTINCT items
as per my recent proposal:

1. Fold SortClause and GroupClause into a single node type SortGroupClause.
We were already relying on them to be struct-equivalent, so using two node
tags wasn't accomplishing much except to get in the way of comparing items
with equal().

2. Add an "eqop" field to SortGroupClause to carry the associated equality
operator.  This is cheap for the parser to get at the same time it's looking
up the sort operator, and storing it eliminates the need for repeated
not-so-cheap lookups during planning.  In future this will also let us
represent GROUP/DISTINCT operations on datatypes that have hash opclasses
but no btree opclasses (ie, they have equality but no natural sort order).
The previous representation simply didn't work for that, since its only
indicator of comparison semantics was a sort operator.

3. Add a hasDistinctOn boolean to struct Query to explicitly record whether
the distinctClause came from DISTINCT or DISTINCT ON.  This allows removing
some complicated and not 100% bulletproof code that attempted to figure
that out from the distinctClause alone.

This patch doesn't in itself create any new capability, but it's necessary
infrastructure for future attempts to use hash-based grouping for DISTINCT
and UNION/INTERSECT/EXCEPT.
2008-08-02 21:32:01 +00:00
Tom Lane
6b73d7e567 Fix an oversight I made in a cleanup patch over a year ago:
eval_const_expressions needs to be passed the PlannerInfo ("root") structure,
because in some cases we want it to substitute values for Param nodes.
(So "constant" is not so constant as all that ...)  This mistake partially
disabled optimization of unnamed extended-Query statements in 8.3: in
particular the LIKE-to-indexscan optimization would never be applied if the
LIKE pattern was passed as a parameter, and constraint exclusion depending
on a parameter value didn't work either.
2008-04-01 00:48:33 +00:00
Tom Lane
0d49838df6 Arrange to "inline" SQL functions that appear in a query's FROM clause,
are declared to return set, and consist of just a single SELECT.  We
can replace the FROM-item with a sub-SELECT and then optimize much as
if we were dealing with a view.  Patch from Richard Rowell, cleaned up
by me.
2008-03-18 22:04:14 +00:00
Bruce Momjian
9098ab9e32 Update copyrights in source tree to 2008. 2008-01-01 19:46:01 +00:00
Tom Lane
7c5e5439d2 Get rid of some old and crufty global variables in the planner. When
this code was last gone over, there wasn't really any alternative to
globals because we didn't have the PlannerInfo struct being passed all
through the planner code.  Now that we do, we can restructure things
to avoid non-reentrancy.  I'm fooling with this because otherwise I'd
have had to add another global variable for the planned compact
range table list.
2007-02-19 07:03:34 +00:00
Tom Lane
5a7471c307 Add COST and ROWS options to CREATE/ALTER FUNCTION, plus underlying pg_proc
columns procost and prorows, to allow simple user adjustment of the estimated
cost of a function call, as well as control of the estimated number of rows
returned by a set-returning function.  We might eventually wish to extend this
to allow function-specific estimation routines, but there seems to be
consensus that we should try a simple constant estimate first.  In particular
this provides a relatively simple way to control the order in which different
WHERE clauses are applied in a plan node, which is a Good Thing in view of the
fact that the recent EquivalenceClass planner rewrite made that much less
predictable than before.
2007-01-22 01:35:23 +00:00
Bruce Momjian
29dccf5fe0 Update CVS HEAD for 2007 copyright. Back branches are typically not
back-stamped for this.
2007-01-05 22:20:05 +00:00
Tom Lane
cffd89ca73 Revise the planner's handling of "pseudoconstant" WHERE clauses, that is
clauses containing no variables and no volatile functions.  Such a clause
can be used as a one-time qual in a gating Result plan node, to suppress
plan execution entirely when it is false.  Even when the clause is true,
putting it in a gating node wins by avoiding repeated evaluation of the
clause.  In previous PG releases, query_planner() would do this for
pseudoconstant clauses appearing at the top level of the jointree, but
there was no ability to generate a gating Result deeper in the plan tree.
To fix it, get rid of the special case in query_planner(), and instead
process pseudoconstant clauses through the normal RestrictInfo qual
distribution mechanism.  When a pseudoconstant clause is found attached to
a path node in create_plan(), pull it out and generate a gating Result at
that point.  This requires special-casing pseudoconstants in selectivity
estimation and cost_qual_eval, but on the whole it's pretty clean.
It probably even makes the planner a bit faster than before for the normal
case of no pseudoconstants, since removing pull_constant_clauses saves one
useless traversal of the qual tree.  Per gripe from Phil Frost.
2006-07-01 18:38:33 +00:00
Bruce Momjian
f2f5b05655 Update copyright for 2006. Update scripts. 2006-03-05 15:59:11 +00:00
Tom Lane
3a0a16cb7e Allow row comparisons to be used as indexscan qualifications.
This completes the project to upgrade our handling of row comparisons.
2006-01-25 20:29:24 +00:00
Tom Lane
e3b9852728 Teach planner how to rearrange join order for some classes of OUTER JOIN.
Per my recent proposal.  I ended up basing the implementation on the
existing mechanism for enforcing valid join orders of IN joins --- the
rules for valid outer-join orders are somewhat similar.
2005-12-20 02:30:36 +00:00
Bruce Momjian
1dc3498251 Standard pgindent run for 8.1. 2005-10-15 02:49:52 +00:00
Tom Lane
e2159f3842 Teach the planner to remove SubqueryScan nodes from the plan if they
aren't doing anything useful (ie, neither selection nor projection).
Also, extend to SubqueryScan the hacks already in place to avoid
unnecessary ExecProject calls when the result would just be the same
tuple the subquery already delivered.  This saves some overhead in
UNION and other set operations, as well as avoiding overhead for
unflatten-able subqueries.  Per example from Sokolov Yura.
2005-05-22 22:30:20 +00:00
Tom Lane
0bf2587df4 Improve planner's estimation of the space needed for HashAgg plans:
look at the actual aggregate transition datatypes and the actual overhead
needed by nodeAgg.c, instead of using pessimistic round numbers.
Per a discussion with Michael Tiemann.
2005-01-28 19:34:28 +00:00
PostgreSQL Daemon
2ff501590b Tag appropriate files for rc3
Also performed an initial run through of upgrading our Copyright date to
extend to 2005 ... first run here was very simple ... change everything
where: grep 1996-2004 && the word 'Copyright' ... scanned through the
generated list with 'less' first, and after, to make sure that I only
picked up the right entries ...
2004-12-31 22:04:05 +00:00
Tom Lane
9309d5f2ba In ALTER COLUMN TYPE, strip any implicit coercion operations appearing
at the top level of the column's old default expression before adding
an implicit coercion to the new column type.  This seems to satisfy the
principle of least surprise, as per discussion of bug #1290.
2004-10-22 17:20:05 +00:00
Bruce Momjian
da9a8649d8 Update copyright to 2004. 2004-08-29 04:13:13 +00:00
Tom Lane
7643bed58e When using extended-query protocol, postpone planning of unnamed statements
until Bind is received, so that actual parameter values are visible to the
planner.  Make use of the parameter values for estimation purposes (but
don't fold them into the actual plan).  This buys back most of the
potential loss of plan quality that ensues from using out-of-line
parameters instead of putting literal values right into the query text.

This patch creates a notion of constant-folding expressions 'for
estimation purposes only', in which case we can be more aggressive than
the normal eval_const_expressions() logic can be.  Right now the only
difference in behavior is inserting bound values for Params, but it will
be interesting to look at other possibilities.  One that we've seen
come up repeatedly is reducing now() and related functions to current
values, so that queries like ... WHERE timestampcol > now() - '1 day'
have some chance of being planned effectively.

Oliver Jowett, with some kibitzing from Tom Lane.
2004-06-11 01:09:22 +00:00
Tom Lane
04226b6404 Tweak planner so that index expressions and predicates are matched to
queries without regard to whether coercions are stated explicitly or
implicitly.  Per suggestion from Stephan Szabo.
2004-03-14 23:41:27 +00:00
Tom Lane
5c74ce23db Improve UniquePath logic to detect the case where the input is already
known unique (eg, it is a SELECT DISTINCT ... subquery), and not do a
redundant unique-ification step.
2004-01-05 18:04:39 +00:00
Tom Lane
82b4dd394f Merge restrictlist_selectivity into clauselist_selectivity by
teaching the latter to accept either RestrictInfo nodes or bare
clause expressions; and cache the selectivity result in the RestrictInfo
node when possible.  This extends the caching behavior of approx_selectivity
to many more contexts, and should reduce duplicate selectivity
calculations.
2004-01-04 03:51:52 +00:00
Tom Lane
be6c38b903 Adjust the definition of RestrictInfo's left_relids and right_relids
fields: now they are valid whenever the clause is a binary opclause,
not only when it is a potential join clause (there is a new boolean
field canjoin to signal the latter condition).  This lets us avoid
recomputing the relid sets over and over while examining indexes.
Still more work to do to make this as useful as it could be, because
there are places that could use the info but don't have access to the
RestrictInfo node.
2003-12-30 23:53:15 +00:00
PostgreSQL Daemon
55b113257c make sure the $Id tags are converted to $PostgreSQL as well ... 2003-11-29 22:41:33 +00:00
Bruce Momjian
46785776c4 Another pgindent run with updated typedefs. 2003-08-08 21:42:59 +00:00
Bruce Momjian
f3c3deb7d0 Update copyrights to 2003. 2003-08-04 02:40:20 +00:00
Bruce Momjian
089003fb46 pgindent run. 2003-08-04 00:43:34 +00:00
Bruce Momjian
111d8e522b Back out array mega-patch.
Joe Conway
2003-06-25 21:30:34 +00:00
Bruce Momjian
46bf651480 Array mega-patch.
Joe Conway
2003-06-24 23:14:49 +00:00
Tom Lane
fc8d970cbc Replace functional-index facility with expressional indexes. Any column
of an index can now be a computed expression instead of a simple variable.
Restrictions on expressions are the same as for predicates (only immutable
functions, no sub-selects).  This fixes problems recently introduced with
inlining SQL functions, because the inlining transformation is applied to
both expression trees so the planner can still match them up.  Along the
way, improve efficiency of handling index predicates (both predicates and
index expressions are now cached by the relcache) and fix 7.3 oversight
that didn't record dependencies of predicate expressions.
2003-05-28 16:04:02 +00:00
Tom Lane
2d1f940542 Minor code cleanup: remove no-longer-useful pull_subplans() function,
and convert pull_agg_clause() into count_agg_clause(), which is a more
efficient way of doing what it's really being used for.
2003-02-04 00:50:01 +00:00
Tom Lane
b19adc1aae Fix parse_agg.c to detect ungrouped Vars in sub-SELECTs; remove code
that used to do it in planner.  That was an ancient kluge that was
never satisfactory; errors should be detected at parse time when possible.
But at the time we didn't have the support mechanism (expression_tree_walker
et al) to make it convenient to do in the parser.
2003-01-17 03:25:04 +00:00
Tom Lane
a4d82dd4b4 Adjust API of expression_tree_mutator and query_tree_mutator to
simplify callers.  It turns out the common case is that the caller
does want to recurse into sub-queries, so push support for that into
these subroutines.
2003-01-17 02:01:21 +00:00
Tom Lane
de97072e3c Allow merge and hash joins to occur on arbitrary expressions (anything not
containing a volatile function), rather than only on 'Var = Var' clauses
as before.  This makes it practical to do flatten_join_alias_vars at the
start of planning, which in turn eliminates a bunch of klugery inside the
planner to deal with alias vars.  As a free side effect, we now detect
implied equality of non-Var expressions; for example in
	SELECT ... WHERE a.x = b.y and b.y = 42
we will deduce a.x = 42 and use that as a restriction qual on a.  Also,
we can remove the restriction introduced 12/5/02 to prevent pullup of
subqueries whose targetlists contain sublinks.
Still TODO: make statistical estimation routines in selfuncs.c and costsize.c
smarter about expressions that are more complex than plain Vars.  The need
for this is considerably greater now that we have to be able to estimate
the suitability of merge and hash join techniques on such expressions.
2003-01-15 19:35:48 +00:00
Tom Lane
2d8d66628a Clean up plantree representation of SubPlan-s --- SubLink does not appear
in the planned representation of a subplan at all any more, only SubPlan.
This means subselect.c doesn't scribble on its input anymore, which seems
like a good thing; and there are no longer three different possible
interpretations of a SubLink.  Simplify node naming and improve comments
in primnodes.h.  No change to stored rules, though.
2002-12-14 00:17:59 +00:00
Tom Lane
a0bf885f9e Phase 2 of read-only-plans project: restructure expression-tree nodes
so that all executable expression nodes inherit from a common supertype
Expr.  This is somewhat of an exercise in code purity rather than any
real functional advance, but getting rid of the extra Oper or Func node
formerly used in each operator or function call should provide at least
a little space and speed improvement.
initdb forced by changes in stored-rules representation.
2002-12-12 15:49:42 +00:00
Tom Lane
8e3a87fbd4 Teach planner to expand sufficiently simple SQL-language functions
('SELECT expression') inline, like macros, during the constant-folding
phase of planning.  The actual expansion is not difficult, but checking
that we're not changing the semantics of the call turns out to be more
subtle than one might think; in particular must pay attention to
permissions issues, strictness, and volatility.
2002-12-01 21:05:14 +00:00
Tom Lane
2103b7baa2 Phase 2 of hashed-aggregation project. nodeAgg.c now knows how to do
hashed aggregation, but there's not yet planner support for it.
2002-11-06 22:31:24 +00:00
Tom Lane
6fdc44be71 Tweak querytree-dependency-extraction code so that columns of tables
that are explicitly JOINed are not considered dependencies unless they
are actually used in the query: mere presence in the joinaliasvars
list of a JOIN RTE doesn't count as being used.  The patch touches
a number of files because I needed to generalize the API of
query_tree_walker to support an additional flag bit, but the changes
are otherwise quite small.
2002-09-11 14:48:55 +00:00
Bruce Momjian
d84fe82230 Update copyright to 2002. 2002-06-20 20:29:54 +00:00
Tom Lane
3389a110d4 Get rid of long-since-vestigial Iter node type, in favor of adding a
returns-set boolean field in Func and Oper nodes.  This allows cleaner,
more reliable tests for expressions returning sets in the planner and
parser.  For example, a WHERE clause returning a set is now detected
and complained of in the parser, not only at runtime.
2002-05-12 23:43:04 +00:00
Tom Lane
4bdb4be62e Divide functions into three volatility classes (immutable, stable, and
volatile), rather than the old cachable/noncachable distinction.  This
allows indexscan optimizations in many places where we formerly didn't.
Also, add a pronamespace column to pg_proc (it doesn't do anything yet,
however).
2002-04-05 00:31:36 +00:00
Tom Lane
63cc56de54 Suppress subquery pullup and pushdown when the subquery has any
set-returning functions in its target list.  This ensures that we
won't rewrite the query in a way that places set-returning functions
into quals (WHERE clauses).  Cf. bug reports from Joe Conway.
2001-12-10 22:54:12 +00:00
Bruce Momjian
ea08e6cd55 New pgindent run with fixes suggested by Tom. Patch manually reviewed,
initdb/regression tests pass.
2001-11-05 17:46:40 +00:00
Tom Lane
96ca8ffebc Fix problems with subselects used in GROUP BY expressions, per gripe
from Philip Warner.  Side effect of change is that GROUP BY expressions
will not be re-evaluated at multiple plan levels anymore, whereas this
sometimes happened with old code.
2001-10-30 19:58:58 +00:00
Bruce Momjian
6783b2372e Another pgindent run. Fixes enum indenting, and improves #endif
spacing.  Also adds space for one-line comments.
2001-10-28 06:26:15 +00:00