Mirror of https://github.com/postgres/postgres.git (synced 2026-02-16 00:57:52 -05:00)
Eager aggregation is a query optimization technique that partially pushes aggregation past a join, and finalizes it once all the relations are joined. Eager aggregation may reduce the number of input rows to the join and thus could result in a better overall plan.

In the current planner architecture, the separation between the scan/join planning phase and the post-scan/join phase means that aggregation steps are not visible when constructing the join tree, limiting the planner's ability to exploit aggregation-aware optimizations. To implement eager aggregation, we collect information about aggregate functions in the targetlist and HAVING clause, along with grouping expressions from the GROUP BY clause, and store it in the PlannerInfo node. During the scan/join planning phase, this information is used to evaluate each base or join relation to determine whether eager aggregation can be applied. If applicable, we create a separate RelOptInfo, referred to as a grouped relation, to represent the partially-aggregated version of the relation and generate grouped paths for it.

Grouped relation paths can be generated in two ways. The first method involves adding sorted and hashed partial aggregation paths on top of the non-grouped paths. To limit planning time, we only consider the cheapest or suitably-sorted non-grouped paths in this step. Alternatively, grouped paths can be generated by joining a grouped relation with a non-grouped relation. Joining two grouped relations is currently not supported.

To further limit planning time, we currently adopt a strategy where partial aggregation is pushed only to the lowest feasible level in the join tree where it provides a significant reduction in row count. This strategy also helps ensure that all grouped paths for the same grouped relation produce the same set of rows, which is important to support a fundamental assumption of the planner.

For the partial aggregation that is pushed down to a non-aggregated relation, we need to consider all expressions from this relation that are involved in upper join clauses and include them in the grouping keys, using compatible operators. This is essential to ensure that an aggregated row from the partial aggregation matches the other side of the join if and only if each row in the partial group does. This ensures that all rows within the same partial group share the same "destiny", which is crucial for maintaining correctness.

One restriction is that we cannot push partial aggregation down to a relation that is in the nullable side of an outer join, because the NULL-extended rows produced by the outer join would not be available when we perform the partial aggregation, while with a non-eager-aggregation plan these rows are available for the top-level aggregation. Pushing partial aggregation in this case may result in the rows being grouped differently than expected, or produce incorrect values from the aggregate functions.

If we have generated a grouped relation for the topmost join relation, we finalize its paths at the end. The final paths will compete in the usual way with paths built from regular planning.

The patch was originally proposed by Antonin Houska in 2017. This commit reworks various important aspects and rewrites most of the current code. However, the original patch and reviews were very useful.

Author: Richard Guo <guofenglinux@gmail.com>
Author: Antonin Houska <ah@cybertec.at> (in an older version)
Reviewed-by: Robert Haas <robertmhaas@gmail.com>
Reviewed-by: Jian He <jian.universality@gmail.com>
Reviewed-by: Tender Wang <tndrwang@gmail.com>
Reviewed-by: Matheus Alcantara <matheusssilv97@gmail.com>
Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Reviewed-by: David Rowley <dgrowleyml@gmail.com>
Reviewed-by: Tomas Vondra <tomas@vondra.me> (in an older version)
Reviewed-by: Andy Fan <zhihuifan1213@163.com> (in an older version)
Reviewed-by: Ashutosh Bapat <ashutosh.bapat.oss@gmail.com> (in an older version)
Discussion: https://postgr.es/m/CAMbWs48jzLrPt1J_00ZcPZXWUQKawQOFE8ROc-ADiYqsqrpBNw@mail.gmail.com
135 lines
5.3 KiB
C
/*-------------------------------------------------------------------------
 *
 * planmain.h
 *	  prototypes for various files in optimizer/plan
 *
 *
 * Portions Copyright (c) 1996-2025, PostgreSQL Global Development Group
 * Portions Copyright (c) 1994, Regents of the University of California
 *
 * src/include/optimizer/planmain.h
 *
 *-------------------------------------------------------------------------
 */
#ifndef PLANMAIN_H
#define PLANMAIN_H

#include "nodes/pathnodes.h"
#include "nodes/plannodes.h"

/* GUC parameters */
#define DEFAULT_CURSOR_TUPLE_FRACTION 0.1
extern PGDLLIMPORT double cursor_tuple_fraction;
extern PGDLLIMPORT bool enable_self_join_elimination;

/* query_planner callback to compute query_pathkeys */
typedef void (*query_pathkeys_callback) (PlannerInfo *root, void *extra);

/*
 * prototypes for plan/planmain.c
 */
extern RelOptInfo *query_planner(PlannerInfo *root,
								 query_pathkeys_callback qp_callback, void *qp_extra);

/*
 * prototypes for plan/planagg.c
 */
extern void preprocess_minmax_aggregates(PlannerInfo *root);

/*
 * prototypes for plan/createplan.c
 */
extern Plan *create_plan(PlannerInfo *root, Path *best_path);
extern ForeignScan *make_foreignscan(List *qptlist, List *qpqual,
									 Index scanrelid, List *fdw_exprs, List *fdw_private,
									 List *fdw_scan_tlist, List *fdw_recheck_quals,
									 Plan *outer_plan);
extern Plan *change_plan_targetlist(Plan *subplan, List *tlist,
									bool tlist_parallel_safe);
extern Plan *materialize_finished_plan(Plan *subplan);
extern bool is_projection_capable_path(Path *path);
extern bool is_projection_capable_plan(Plan *plan);

/* External use of these functions is deprecated: */
extern Sort *make_sort_from_sortclauses(List *sortcls, Plan *lefttree);
extern Agg *make_agg(List *tlist, List *qual,
					 AggStrategy aggstrategy, AggSplit aggsplit,
					 int numGroupCols, AttrNumber *grpColIdx, Oid *grpOperators, Oid *grpCollations,
					 List *groupingSets, List *chain, double dNumGroups,
					 Size transitionSpace, Plan *lefttree);
extern Limit *make_limit(Plan *lefttree, Node *limitOffset, Node *limitCount,
						 LimitOption limitOption, int uniqNumCols,
						 AttrNumber *uniqColIdx, Oid *uniqOperators,
						 Oid *uniqCollations);

/*
 * prototypes for plan/initsplan.c
 */
extern PGDLLIMPORT int from_collapse_limit;
extern PGDLLIMPORT int join_collapse_limit;

extern void add_base_rels_to_query(PlannerInfo *root, Node *jtnode);
extern void add_other_rels_to_query(PlannerInfo *root);
extern void build_base_rel_tlists(PlannerInfo *root, List *final_tlist);
extern void add_vars_to_targetlist(PlannerInfo *root, List *vars,
								   Relids where_needed);
extern void add_vars_to_attr_needed(PlannerInfo *root, List *vars,
									Relids where_needed);
extern void remove_useless_groupby_columns(PlannerInfo *root);
extern void setup_eager_aggregation(PlannerInfo *root);
extern void find_lateral_references(PlannerInfo *root);
extern void rebuild_lateral_attr_needed(PlannerInfo *root);
extern void create_lateral_join_info(PlannerInfo *root);
extern List *deconstruct_jointree(PlannerInfo *root);
extern bool restriction_is_always_true(PlannerInfo *root,
									   RestrictInfo *restrictinfo);
extern bool restriction_is_always_false(PlannerInfo *root,
										RestrictInfo *restrictinfo);
extern void distribute_restrictinfo_to_rels(PlannerInfo *root,
											RestrictInfo *restrictinfo);
extern RestrictInfo *process_implied_equality(PlannerInfo *root,
											  Oid opno,
											  Oid collation,
											  Expr *item1,
											  Expr *item2,
											  Relids qualscope,
											  Index security_level,
											  bool both_const);
extern RestrictInfo *build_implied_join_equality(PlannerInfo *root,
												 Oid opno,
												 Oid collation,
												 Expr *item1,
												 Expr *item2,
												 Relids qualscope,
												 Index security_level);
extern void rebuild_joinclause_attr_needed(PlannerInfo *root);
extern void match_foreign_keys_to_quals(PlannerInfo *root);

/*
 * prototypes for plan/analyzejoins.c
 */
extern List *remove_useless_joins(PlannerInfo *root, List *joinlist);
extern void reduce_unique_semijoins(PlannerInfo *root);
extern bool query_supports_distinctness(Query *query);
extern bool query_is_distinct_for(Query *query, List *colnos, List *opids);
extern bool innerrel_is_unique(PlannerInfo *root,
							   Relids joinrelids, Relids outerrelids, RelOptInfo *innerrel,
							   JoinType jointype, List *restrictlist, bool force_cache);
extern bool innerrel_is_unique_ext(PlannerInfo *root, Relids joinrelids,
								   Relids outerrelids, RelOptInfo *innerrel,
								   JoinType jointype, List *restrictlist,
								   bool force_cache, List **extra_clauses);
extern List *remove_useless_self_joins(PlannerInfo *root, List *joinlist);

/*
 * prototypes for plan/setrefs.c
 */
extern Plan *set_plan_references(PlannerInfo *root, Plan *plan);
extern bool trivial_subqueryscan(SubqueryScan *plan);
extern Param *find_minmax_agg_replacement_param(PlannerInfo *root,
												Aggref *aggref);
extern void record_plan_function_dependency(PlannerInfo *root, Oid funcid);
extern void record_plan_type_dependency(PlannerInfo *root, Oid typid);
extern bool extract_query_dependencies_walker(Node *node, PlannerInfo *context);

#endif							/* PLANMAIN_H */