postgresql/src/include/optimizer/optimizer.h

205 lines
7 KiB
C
Raw Normal View History

/*-------------------------------------------------------------------------
*
* optimizer.h
* External API for the Postgres planner.
*
* This header is meant to define everything that the core planner
* exposes for use by non-planner modules.
*
* Note that there are files outside src/backend/optimizer/ that are
* considered planner modules, because they're too much in bed with
* planner operations to be treated otherwise. FDW planning code is an
* example. For the most part, however, code outside the core planner
* should not need to include any optimizer/ header except this one.
*
* Portions Copyright (c) 1996-2021, PostgreSQL Global Development Group
* Portions Copyright (c) 1994, Regents of the University of California
*
* src/include/optimizer/optimizer.h
*
*-------------------------------------------------------------------------
*/
#ifndef OPTIMIZER_H
#define OPTIMIZER_H
#include "nodes/parsenodes.h"
/* Test if an expression node represents a SRF call. Beware multiple eval! */
#define IS_SRF_CALL(node) \
((IsA(node, FuncExpr) && ((FuncExpr *) (node))->funcretset) || \
(IsA(node, OpExpr) && ((OpExpr *) (node))->opretset))
/*
* We don't want to include nodes/pathnodes.h here, because non-planner
* code should generally treat PlannerInfo as an opaque typedef.
* But we'd like such code to use that typedef name, so define the
* typedef either here or in pathnodes.h, whichever is read first.
*/
#ifndef HAVE_PLANNERINFO_TYPEDEF
typedef struct PlannerInfo PlannerInfo;
#define HAVE_PLANNERINFO_TYPEDEF 1
#endif
/* Likewise for IndexOptInfo and SpecialJoinInfo. */
#ifndef HAVE_INDEXOPTINFO_TYPEDEF
typedef struct IndexOptInfo IndexOptInfo;
#define HAVE_INDEXOPTINFO_TYPEDEF 1
#endif
#ifndef HAVE_SPECIALJOININFO_TYPEDEF
typedef struct SpecialJoinInfo SpecialJoinInfo;
#define HAVE_SPECIALJOININFO_TYPEDEF 1
#endif
/* It also seems best not to include plannodes.h, params.h, or htup.h here */
struct PlannedStmt;
struct ParamListInfoData;
struct HeapTupleData;
/* in path/clausesel.c: */
extern Selectivity clause_selectivity(PlannerInfo *root,
Node *clause,
int varRelid,
JoinType jointype,
SpecialJoinInfo *sjinfo);
Improve estimation of OR clauses using extended statistics. Formerly we only applied extended statistics to an OR clause as part of the clauselist_selectivity() code path for an OR clause appearing in an implicitly-ANDed list of clauses. This meant that it could only use extended statistics if all sub-clauses of the OR clause were covered by a single extended statistics object. Instead, teach clause_selectivity() how to apply extended statistics to an OR clause by handling its ORed list of sub-clauses in a similar manner to an implicitly-ANDed list of sub-clauses, but with different combination rules. This allows one or more extended statistics objects to be used to estimate all or part of the list of sub-clauses. Any remaining sub-clauses are then treated as if they are independent. Additionally, to avoid double-application of extended statistics, this introduces "extended" versions of clause_selectivity() and clauselist_selectivity(), which include an option to ignore extended statistics. This replaces the old clauselist_selectivity_simple() function which failed to completely ignore extended statistics when called from the extended statistics code. A known limitation of the current infrastructure is that an AND clause under an OR clause is not treated as compatible with extended statistics (because we don't build RestrictInfos for such sub-AND clauses). Thus, for example, "(a=1 AND b=1) OR (a=2 AND b=2)" will currently be treated as two independent AND clauses (each of which may be estimated using extended statistics), but extended statistics will not currently be used to account for any possible overlap between those clauses. Improving that is left as a task for the future. Original patch by Tomas Vondra, with additional improvements by me. Discussion: https://postgr.es/m/20200113230008.g67iyk4cs3xbnjju@development
2020-12-03 05:03:49 -05:00
extern Selectivity clause_selectivity_ext(PlannerInfo *root,
Node *clause,
int varRelid,
JoinType jointype,
SpecialJoinInfo *sjinfo,
bool use_extended_stats);
extern Selectivity clauselist_selectivity(PlannerInfo *root,
List *clauses,
int varRelid,
JoinType jointype,
SpecialJoinInfo *sjinfo);
Improve estimation of OR clauses using extended statistics. Formerly we only applied extended statistics to an OR clause as part of the clauselist_selectivity() code path for an OR clause appearing in an implicitly-ANDed list of clauses. This meant that it could only use extended statistics if all sub-clauses of the OR clause were covered by a single extended statistics object. Instead, teach clause_selectivity() how to apply extended statistics to an OR clause by handling its ORed list of sub-clauses in a similar manner to an implicitly-ANDed list of sub-clauses, but with different combination rules. This allows one or more extended statistics objects to be used to estimate all or part of the list of sub-clauses. Any remaining sub-clauses are then treated as if they are independent. Additionally, to avoid double-application of extended statistics, this introduces "extended" versions of clause_selectivity() and clauselist_selectivity(), which include an option to ignore extended statistics. This replaces the old clauselist_selectivity_simple() function which failed to completely ignore extended statistics when called from the extended statistics code. A known limitation of the current infrastructure is that an AND clause under an OR clause is not treated as compatible with extended statistics (because we don't build RestrictInfos for such sub-AND clauses). Thus, for example, "(a=1 AND b=1) OR (a=2 AND b=2)" will currently be treated as two independent AND clauses (each of which may be estimated using extended statistics), but extended statistics will not currently be used to account for any possible overlap between those clauses. Improving that is left as a task for the future. Original patch by Tomas Vondra, with additional improvements by me. Discussion: https://postgr.es/m/20200113230008.g67iyk4cs3xbnjju@development
2020-12-03 05:03:49 -05:00
extern Selectivity clauselist_selectivity_ext(PlannerInfo *root,
List *clauses,
int varRelid,
JoinType jointype,
SpecialJoinInfo *sjinfo,
bool use_extended_stats);
/* in path/costsize.c: */
/* widely used cost parameters */
extern PGDLLIMPORT double seq_page_cost;
extern PGDLLIMPORT double random_page_cost;
extern PGDLLIMPORT double cpu_tuple_cost;
extern PGDLLIMPORT double cpu_index_tuple_cost;
extern PGDLLIMPORT double cpu_operator_cost;
extern PGDLLIMPORT double parallel_tuple_cost;
extern PGDLLIMPORT double parallel_setup_cost;
extern PGDLLIMPORT int effective_cache_size;
extern double clamp_row_est(double nrows);
/* in path/indxpath.c: */
Fix pull_varnos' miscomputation of relids set for a PlaceHolderVar. Previously, pull_varnos() took the relids of a PlaceHolderVar as being equal to the relids in its contents, but that fails to account for the possibility that we have to postpone evaluation of the PHV due to outer joins. This could result in a malformed plan. The known cases end up triggering the "failed to assign all NestLoopParams to plan nodes" sanity check in createplan.c, but other symptoms may be possible. The right value to use is the join level we actually intend to evaluate the PHV at. We can get that from the ph_eval_at field of the associated PlaceHolderInfo. However, there are some places that call pull_varnos() before the PlaceHolderInfos have been created; in that case, fall back to the conservative assumption that the PHV will be evaluated at its syntactic level. (In principle this might result in missing some legal optimization, but I'm not aware of any cases where it's an issue in practice.) Things are also a bit ticklish for calls occurring during deconstruct_jointree(), but AFAICS the ph_eval_at fields should have reached their final values by the time we need them. The main problem in making this work is that pull_varnos() has no way to get at the PlaceHolderInfos. We can fix that easily, if a bit tediously, in HEAD by passing it the planner "root" pointer. In the back branches that'd cause an unacceptable API/ABI break for extensions, so leave the existing entry points alone and add new ones with the additional parameter. (If an old entry point is called and encounters a PHV, it'll fall back to using the syntactic level, again possibly missing some valid optimization.) Back-patch to v12. The computation is surely also wrong before that, but it appears that we cannot reach a bad plan thanks to join order restrictions imposed on the subquery that the PlaceHolderVar came from. The error only became reachable when commit 4be058fe9 allowed trivial subqueries to be collapsed out completely, eliminating their join order restrictions. Per report from Stephan Springl. Discussion: https://postgr.es/m/171041.1610849523@sss.pgh.pa.us
2021-01-21 15:37:23 -05:00
extern bool is_pseudo_constant_for_index(PlannerInfo *root, Node *expr,
IndexOptInfo *index);
/* in plan/planner.c: */
/* possible values for force_parallel_mode */
typedef enum
{
FORCE_PARALLEL_OFF,
FORCE_PARALLEL_ON,
FORCE_PARALLEL_REGRESS
} ForceParallelMode;
/* GUC parameters */
extern int force_parallel_mode;
extern bool parallel_leader_participation;
extern struct PlannedStmt *planner(Query *parse, const char *query_string,
int cursorOptions,
struct ParamListInfoData *boundParams);
extern Expr *expression_planner(Expr *expr);
extern Expr *expression_planner_with_deps(Expr *expr,
List **relationOids,
List **invalItems);
extern bool plan_cluster_use_sort(Oid tableOid, Oid indexOid);
extern int plan_create_index_workers(Oid tableOid, Oid indexOid);
/* in plan/setrefs.c: */
extern void extract_query_dependencies(Node *query,
List **relationOids,
List **invalItems,
bool *hasRowSecurity);
/* in prep/prepqual.c: */
extern Node *negate_clause(Node *node);
extern Expr *canonicalize_qual(Expr *qual, bool is_check);
/* in util/clauses.c: */
extern bool contain_mutable_functions(Node *clause);
extern bool contain_volatile_functions(Node *clause);
extern bool contain_volatile_functions_not_nextval(Node *clause);
extern Node *eval_const_expressions(PlannerInfo *root, Node *node);
Speedup ScalarArrayOpExpr evaluation ScalarArrayOpExprs with "useOr=true" and a set of Consts on the righthand side have traditionally been evaluated by using a linear search over the array. When these arrays contain large numbers of elements then this linear search could become a significant part of execution time. Here we add a new method of evaluating ScalarArrayOpExpr expressions to allow them to be evaluated by first building a hash table containing each element, then on subsequent evaluations, we just probe that hash table to determine if there is a match. The planner is in charge of determining when this optimization is possible and it enables it by setting hashfuncid in the ScalarArrayOpExpr. The executor will only perform the hash table evaluation when the hashfuncid is set. This means that not all cases are optimized. For example CHECK constraints containing an IN clause won't go through the planner, so won't get the hashfuncid set. We could maybe do something about that at some later date. The reason we're not doing it now is from fear that we may slow down cases where the expression is evaluated only once. Those cases can be common, for example, a single row INSERT to a table with a CHECK constraint containing an IN clause. In the planner, we enable this when there are suitable hash functions for the ScalarArrayOpExpr's operator and only when there is at least MIN_ARRAY_SIZE_FOR_HASHED_SAOP elements in the array. The threshold is currently set to 9. Author: James Coleman, David Rowley Reviewed-by: David Rowley, Tomas Vondra, Heikki Linnakangas Discussion: https://postgr.es/m/CAAaqYe8x62+=wn0zvNKCj55tPpg-JBHzhZFFc6ANovdqFw7-dA@mail.gmail.com
2021-04-08 07:51:22 -04:00
extern void convert_saop_to_hashed_saop(Node *node);
extern Node *estimate_expression_value(PlannerInfo *root, Node *node);
extern Expr *evaluate_expr(Expr *expr, Oid result_type, int32 result_typmod,
Oid result_collation);
extern List *expand_function_arguments(List *args, Oid result_type,
struct HeapTupleData *func_tuple);
/* in util/predtest.c: */
extern bool predicate_implied_by(List *predicate_list, List *clause_list,
bool weak);
extern bool predicate_refuted_by(List *predicate_list, List *clause_list,
bool weak);
/* in util/tlist.c: */
extern int count_nonjunk_tlist_entries(List *tlist);
extern TargetEntry *get_sortgroupref_tle(Index sortref,
List *targetList);
extern TargetEntry *get_sortgroupclause_tle(SortGroupClause *sgClause,
List *targetList);
extern Node *get_sortgroupclause_expr(SortGroupClause *sgClause,
List *targetList);
extern List *get_sortgrouplist_exprs(List *sgClauses,
List *targetList);
extern SortGroupClause *get_sortgroupref_clause(Index sortref,
List *clauses);
extern SortGroupClause *get_sortgroupref_clause_noerr(Index sortref,
List *clauses);
/* in util/var.c: */
/* Bits that can be OR'd into the flags argument of pull_var_clause() */
#define PVC_INCLUDE_AGGREGATES 0x0001 /* include Aggrefs in output list */
#define PVC_RECURSE_AGGREGATES 0x0002 /* recurse into Aggref arguments */
#define PVC_INCLUDE_WINDOWFUNCS 0x0004 /* include WindowFuncs in output list */
#define PVC_RECURSE_WINDOWFUNCS 0x0008 /* recurse into WindowFunc arguments */
#define PVC_INCLUDE_PLACEHOLDERS 0x0010 /* include PlaceHolderVars in
* output list */
#define PVC_RECURSE_PLACEHOLDERS 0x0020 /* recurse into PlaceHolderVar
* arguments */
Fix pull_varnos' miscomputation of relids set for a PlaceHolderVar. Previously, pull_varnos() took the relids of a PlaceHolderVar as being equal to the relids in its contents, but that fails to account for the possibility that we have to postpone evaluation of the PHV due to outer joins. This could result in a malformed plan. The known cases end up triggering the "failed to assign all NestLoopParams to plan nodes" sanity check in createplan.c, but other symptoms may be possible. The right value to use is the join level we actually intend to evaluate the PHV at. We can get that from the ph_eval_at field of the associated PlaceHolderInfo. However, there are some places that call pull_varnos() before the PlaceHolderInfos have been created; in that case, fall back to the conservative assumption that the PHV will be evaluated at its syntactic level. (In principle this might result in missing some legal optimization, but I'm not aware of any cases where it's an issue in practice.) Things are also a bit ticklish for calls occurring during deconstruct_jointree(), but AFAICS the ph_eval_at fields should have reached their final values by the time we need them. The main problem in making this work is that pull_varnos() has no way to get at the PlaceHolderInfos. We can fix that easily, if a bit tediously, in HEAD by passing it the planner "root" pointer. In the back branches that'd cause an unacceptable API/ABI break for extensions, so leave the existing entry points alone and add new ones with the additional parameter. (If an old entry point is called and encounters a PHV, it'll fall back to using the syntactic level, again possibly missing some valid optimization.) Back-patch to v12. The computation is surely also wrong before that, but it appears that we cannot reach a bad plan thanks to join order restrictions imposed on the subquery that the PlaceHolderVar came from. The error only became reachable when commit 4be058fe9 allowed trivial subqueries to be collapsed out completely, eliminating their join order restrictions. Per report from Stephan Springl. Discussion: https://postgr.es/m/171041.1610849523@sss.pgh.pa.us
2021-01-21 15:37:23 -05:00
extern Bitmapset *pull_varnos(PlannerInfo *root, Node *node);
extern Bitmapset *pull_varnos_of_level(PlannerInfo *root, Node *node, int levelsup);
extern void pull_varattnos(Node *node, Index varno, Bitmapset **varattnos);
extern List *pull_vars_of_level(Node *node, int levelsup);
extern bool contain_var_clause(Node *node);
extern bool contain_vars_of_level(Node *node, int levelsup);
extern int locate_var_of_level(Node *node, int levelsup);
extern List *pull_var_clause(Node *node, int flags);
extern Node *flatten_join_alias_vars(Query *query, Node *node);
#endif /* OPTIMIZER_H */