postgresql

mirror of https://github.com/postgres/postgres.git synced 2026-02-28 12:20:43 -05:00

Author	SHA1	Message	Date
Robert Haas	45be99f8cd	Support parallel joins, and make related improvements. The core innovation of this patch is the introduction of the concept of a partial path; that is, a path which if executed in parallel will generate a subset of the output rows in each process. Gathering a partial path produces an ordinary (complete) path. This allows us to generate paths for parallel joins by joining a partial path for one side (which at the baserel level is currently always a Partial Seq Scan) to an ordinary path on the other side. This is subject to various restrictions at present, especially that this strategy seems unlikely to be sensible for merge joins, so only nested loops and hash joins paths are generated. This also allows an Append node to be pushed below a Gather node in the case of a partitioned table. Testing revealed that early versions of this patch made poor decisions in some cases, which turned out to be caused by the fact that the original cost model for Parallel Seq Scan wasn't very good. So this patch tries to make some modest improvements in that area. There is much more to be done in the area of generating good parallel plans in all cases, but this seems like a useful step forward. Patch by me, reviewed by Dilip Kumar and Amit Kapila.	2016-01-20 14:40:26 -05:00
Bruce Momjian	ee94300446	Update copyright for 2016 Backpatch certain files through 9.1	2016-01-02 13:33:40 -05:00
Tom Lane	4fcf48450d	Get rid of the planner's LateralJoinInfo data structure. I originally modeled this data structure on SpecialJoinInfo, but after commit `acfcd45cac` that looks like a pretty poor decision. All we really need is relid sets identifying laterally-referenced rels; and most of the time, what we want to know about includes indirect lateral references, a case the LateralJoinInfo data was unsuited to compute with any efficiency. The previous commit redefined RelOptInfo.lateral_relids as the transitive closure of lateral references, so that it easily supports checking indirect references. For the places where we really do want just direct references, add a new RelOptInfo field direct_lateral_relids, which is easily set up as a copy of lateral_relids before we perform the transitive closure calculation. Then we can just drop lateral_info_list and LateralJoinInfo and the supporting code. This makes the planner's handling of lateral references noticeably more efficient, and shorter too. Such a change can't be back-patched into stable branches for fear of breaking extensions that might be looking at the planner's data structures; but it seems not too late to push it into 9.5, so I've done so.	2015-12-11 15:52:38 -05:00
Tom Lane	acfcd45cac	Still more fixes for planner's handling of LATERAL references. More fuzz testing by Andreas Seltenreich exposed that the planner did not cope well with chains of lateral references. If relation X references Y laterally, and Y references Z laterally, then we will have to scan X on the inside of a nestloop with Z, so for all intents and purposes X is laterally dependent on Z too. The planner did not understand this and would generate intermediate joins that could not be used. While that was usually harmless except for wasting some planning cycles, under the right circumstances it would lead to "failed to build any N-way joins" or "could not devise a query plan" planner failures. To fix that, convert the existing per-relation lateral_relids and lateral_referencers relid sets into their transitive closures; that is, they now show all relations on which a rel is directly or indirectly laterally dependent. This not only fixes the chained-reference problem but allows some of the relevant tests to be made substantially simpler and faster, since they can be reduced to simple bitmap manipulations instead of searches of the LateralJoinInfo list. Also, when a PlaceHolderVar that is due to be evaluated at a join contains lateral references, we should treat those references as indirect lateral dependencies of each of the join's base relations. This prevents us from trying to join any individual base relations to the lateral reference source before the join is formed, which again cannot work. Andreas' testing also exposed another oversight in the "dangerous PlaceHolderVar" test added in commit `85e5e222b1`. Simply rejecting unsafe join paths in joinpath.c is insufficient, because in some cases we will end up rejecting all possible paths for a particular join, again leading to "could not devise a query plan" failures. The restriction has to be known also to join_is_legal and its cohort functions, so that they will not select a join for which that will happen. I chose to move the supporting logic into joinrels.c where the latter functions are. Back-patch to 9.3 where LATERAL support was introduced.	2015-12-11 14:22:20 -05:00
Tom Lane	7e19db0c09	Fix another oversight in checking if a join with LATERAL refs is legal. It was possible for the planner to decide to join a LATERAL subquery to the outer side of an outer join before the outer join itself is completed. Normally that's fine because of the associativity rules, but it doesn't work if the subquery contains a lateral reference to the inner side of the outer join. In such a situation the outer join must be done first. join_is_legal() missed this consideration and would allow the join to be attempted, but the actual path-building code correctly decided that no valid join path could be made, sometimes leading to planner errors such as "failed to build any N-way joins". Per report from Andreas Seltenreich. Back-patch to 9.3 where LATERAL support was added.	2015-12-07 17:42:11 -05:00
Tom Lane	cfe30a72fa	Undo mistaken tightening in join_is_legal(). One of the changes I made in commit `8703059c6b` turns out not to have been such a good idea: we still need the exception in join_is_legal() that allows a join if both inputs already overlap the RHS of the special join we're checking. Otherwise we can miss valid plans, and might indeed fail to find a plan at all, as in recent report from Andreas Seltenreich. That code was added way back in commit `c17117649b`, but I failed to include a regression test case then; my bad. Put it back with a better explanation, and a test this time. The logic does end up a bit different than before though: I now believe it's appropriate to make this check first, thereby allowing such a case whether or not we'd consider the previous SJ(s) to commute with this one. (Presumably, we already decided they did; but it was confusing to have this consideration in the middle of the code that was handling the other case.) Back-patch to all active branches, like the previous patch.	2015-08-12 21:19:03 -04:00
Tom Lane	8703059c6b	Further fixes for degenerate outer join clauses. Further testing revealed that commit `f69b4b9495` was still a few bricks shy of a load: minor tweaking of the previous test cases resulted in the same wrong-outer-join-order problem coming back. After study I concluded that my previous changes in make_outerjoininfo() were just accidentally masking the problem, and should be reverted in favor of forcing syntactic join order whenever an upper outer join's predicate doesn't mention a lower outer join's LHS. This still allows the chained-outer-joins style that is the normally optimizable case. I also tightened things up some more in join_is_legal(). It seems to me on review that what's really happening in the exception case where we ignore a mismatched special join is that we're allowing the proposed join to associate into the RHS of the outer join we're comparing it to. As such, we should always insist that the proposed join be a left join, which eliminates a bunch of rather dubious argumentation. The case where we weren't enforcing that was the one that was already known buggy anyway (it had a violatable Assert before the aforesaid commit) so it hardly deserves a lot of deference. Back-patch to all active branches, like the previous patch. The added regression test case failed in all branches back to 9.1, and I think it's only an unrelated change in costing calculations that kept 9.0 from choosing a broken plan.	2015-08-06 15:35:46 -04:00
Tom Lane	6af9ee4c8c	Make real sure we don't reassociate joins into or out of SEMI/ANTI joins. Per the discussion in optimizer/README, it's unsafe to reassociate anything into or out of the RHS of a SEMI or ANTI join. An example from Piotr Stefaniak showed that join_is_legal() wasn't sufficiently enforcing this rule, so lock it down a little harder. I couldn't find a reasonably simple example of the optimizer trying to do this, so no new regression test. (Piotr's example involved the random search in GEQO accidentally trying an invalid case and triggering a sanity check way downstream in clause selectivity estimation, which did not seem like a sequence of events that would be useful to memorialize in a regression test as-is.) Back-patch to all active branches.	2015-08-05 14:39:29 -04:00
Tom Lane	f69b4b9495	Fix some planner issues with degenerate outer join clauses. An outer join clause that didn't actually reference the RHS (perhaps only after constant-folding) could confuse the join order enforcement logic, leading to wrong query results. Also, nested occurrences of such things could trigger an Assertion that on reflection seems incorrect. Per fuzz testing by Andreas Seltenreich. The practical use of such cases seems thin enough that it's not too surprising we've not heard field reports about it. This has been broken for a long time, so back-patch to all active branches.	2015-08-01 20:57:41 -04:00
Tom Lane	a6492ff897	Fix an oversight in checking whether a join with LATERAL refs is legal. In many cases, we can implement a semijoin as a plain innerjoin by first passing the righthand-side relation through a unique-ification step. However, one of the cases where this does NOT work is where the RHS has a LATERAL reference to the LHS; that makes the RHS dependent on the LHS so that unique-ification is meaningless. joinpath.c understood this, and so would not generate any join paths of this kind ... but join_is_legal neglected to check for the case, so it would think that we could do it. The upshot would be a "could not devise a query plan for the given query" failure once we had failed to generate any join paths at all for the bogus join pair. Back-patch to 9.3 where LATERAL was added.	2015-07-31 19:26:33 -04:00
Tom Lane	b55722692b	Improve planner's cost estimation in the presence of semijoins. If we have a semijoin, say SELECT * FROM x WHERE x1 IN (SELECT y1 FROM y) and we're estimating the cost of a parameterized indexscan on x, the number of repetitions of the indexscan should not be taken as the size of y; it'll really only be the number of distinct values of y1, because the only valid plan with y on the outside of a nestloop would require y to be unique-ified before joining it to x. Most of the time this doesn't make that much difference, but sometimes it can lead to drastically underestimating the cost of the indexscan and hence choosing a bad plan, as pointed out by David Kubečka. Fixing this is a bit difficult because parameterized indexscans are costed out quite early in the planning process, before we have the information that would be needed to call estimate_num_groups() and thereby estimate the number of distinct values of the join column(s). However we can move the code that extracts a semijoin RHS's unique-ification columns, so that it's done in initsplan.c rather than on-the-fly in create_unique_path(). That shouldn't make any difference speed-wise and it's really a bit cleaner too. The other bit of information we need is the size of the semijoin RHS, which is easy if it's a single relation (we make those estimates before considering indexscan costs) but problematic if it's a join relation. The solution adopted here is just to use the product of the sizes of the join component rels. That will generally be an overestimate, but since estimate_num_groups() only uses this input as a clamp, an overestimate shouldn't hurt us too badly. In any case we don't allow this new logic to produce a value larger than we would have chosen before, so that at worst an overestimate leaves us no wiser than we were before.	2015-03-11 21:21:00 -04:00
Bruce Momjian	4baaf863ec	Update copyright for 2015 Backpatch certain files through 9.0	2015-01-06 11:43:47 -05:00
Bruce Momjian	0a78320057	pgindent run for 9.4 This includes removing tabs after periods in C comments, which was applied to back branches, so this change should not effect backpatching.	2014-05-06 12:12:18 -04:00
Bruce Momjian	7e04792a1c	Update copyright for 2014 Update all files in head, and files COPYRIGHT and legal.sgml in all back branches.	2014-01-07 16:05:30 -05:00
Tom Lane	9e7e29c75a	Fix planner problems with LATERAL references in PlaceHolderVars. The planner largely failed to consider the possibility that a PlaceHolderVar's expression might contain a lateral reference to a Var coming from somewhere outside the PHV's syntactic scope. We had a previous report of a problem in this area, which I tried to fix in a quick-hack way in commit `4da6439bd8`, but Antonin Houska pointed out that there were still some problems, and investigation turned up other issues. This patch largely reverts that commit in favor of a more thoroughly thought-through solution. The new theory is that a PHV's ph_eval_at level cannot be higher than its original syntactic level. If it contains lateral references, those don't change the ph_eval_at level, but rather they create a lateral-reference requirement for the ph_eval_at join relation. The code in joinpath.c needs to handle that. Another issue is that createplan.c wasn't handling nested PlaceHolderVars properly. In passing, push knowledge of lateral-reference checks for join clauses into join_clause_is_movable_to. This is mainly so that FDWs don't need to deal with it. This patch doesn't fix the original join-qual-placement problem reported by Jeremy Evans (and indeed, one of the new regression test cases shows the wrong answer because of that). But the PlaceHolderVar problems need to be fixed before that issue can be addressed, so committing this separately seems reasonable.	2013-08-17 20:22:37 -04:00
Bruce Momjian	bd61a623ac	Update copyrights for 2013 Fully update git head, and update back branches in ./COPYRIGHT and legal.sgml files.	2013-01-01 17:15:01 -05:00
Tom Lane	9ff79b9d4e	Fix up planner infrastructure to support LATERAL properly. This patch takes care of a number of problems having to do with failure to choose valid join orders and incorrect handling of lateral references pulled up from subqueries. Notable changes: * Add a LateralJoinInfo data structure similar to SpecialJoinInfo, to represent join ordering constraints created by lateral references. (I first considered extending the SpecialJoinInfo structure, but the semantics are different enough that a separate data structure seems better.) Extend join_is_legal() and related functions to prevent trying to form unworkable joins, and to ensure that we will consider joins that satisfy lateral references even if the joins would be clauseless. * Fill in the infrastructure needed for the last few types of relation scan paths to support parameterization. We'd have wanted this eventually anyway, but it is necessary now because a relation that gets pulled up out of a UNION ALL subquery may acquire a reltargetlist containing lateral references, meaning that its paths have to be parameterized whether or not we have any code that can push join quals down into the scan. * Compute data about lateral references early in query_planner(), and save in RelOptInfo nodes, to avoid repetitive calculations later. * Assorted corner-case bug fixes. There's probably still some bugs left, but this is a lot closer to being real than it was before.	2012-08-26 22:50:23 -04:00
Tom Lane	eb919e8fde	Resurrect the "last ditch" code path in join_search_one_level(). This essentially reverts commit `e54b10a62d`, in which I'd decided that the "last ditch" join logic was useless. The folly of that is now exposed by a report from Pavel Stehule: although the function should always find at least one join in a self-contained join problem, it can still fail to do so in a sub-problem created by artificial from_collapse_limit or join_collapse_limit constraints. Adjust the comments to describe this, and simplify the code a bit to match the new coding of the earlier loop in the function. I'm not terribly happy about this: I still subscribe to the opinion stated in the previous commit message that the "last ditch" code can obscure logic bugs elsewhere. But the alternative seems to be to complicate the earlier tests for does-this-relation-have-a-join-clause to the point where they can tell whether the join clauses link outside the current join sub-problem. And that looks messy, slow, and possibly a source of bugs in itself. In any case, now is not the time to be inserting experimental code into 9.2, so let's just go back to the time-tested solution.	2012-08-15 00:08:13 -04:00
Bruce Momjian	927d61eeff	Run pgindent on 9.2 source tree in preparation for first 9.3 commit-fest.	2012-06-10 15:20:04 -04:00
Tom Lane	1f03630011	Adjust join_search_one_level's handling of clauseless joins. For an initial relation that lacks any join clauses (that is, it has to be cartesian-product-joined to the rest of the query), we considered only cartesian joins with initial rels appearing later in the initial-relations list. This creates an undesirable dependency on FROM-list order. We would never fail to find a plan, but perhaps we might not find the best available plan. Noted while discussing the logic with Amit Kapila. Improve the comments a bit in this area, too. Arguably this is a bug fix, but given the lack of complaints from the field I'll refrain from back-patching.	2012-04-20 20:10:46 -04:00
Tom Lane	5b7b5518d0	Revise parameterized-path mechanism to fix assorted issues. This patch adjusts the treatment of parameterized paths so that all paths with the same parameterization (same set of required outer rels) for the same relation will have the same rowcount estimate. We cache the rowcount estimates to ensure that property, and hopefully save a few cycles too. Doing this makes it practical for add_path_precheck to operate without a rowcount estimate: it need only assume that paths with different parameterizations never dominate each other, which is close enough to true anyway for coarse filtering, because normally a more-parameterized path should yield fewer rows thanks to having more join clauses to apply. In add_path, we do the full nine yards of comparing rowcount estimates along with everything else, so that we can discard parameterized paths that don't actually have an advantage. This fixes some issues I'd found with add_path rejecting parameterized paths on the grounds that they were more expensive than not-parameterized ones, even though they yielded many fewer rows and hence would be cheaper once subsequent joining was considered. To make the same-rowcounts assumption valid, we have to require that any parameterized path enforce all join clauses that could be obtained from the particular set of outer rels, even if not all of them are useful for indexing. This is required at both base scans and joins. It's a good thing anyway since the net impact is that join quals are checked at the lowest practical level in the join tree. Hence, discard the original rather ad-hoc mechanism for choosing parameterization joinquals, and build a better one that has a more principled rule for when clauses can be moved. The original rule was actually buggy anyway for lack of knowledge about which relations are part of an outer join's outer side; getting this right requires adding an outer_relids field to RestrictInfo.	2012-04-19 15:53:47 -04:00
Tom Lane	e54b10a62d	Remove the "last ditch" code path in join_search_one_level(). So far as I can tell, it is no longer possible for this heuristic to do anything useful, because the new weaker definition of have_relevant_joinclause means that any relation with a joinclause must be considered joinable to at least one other relation. It would still be possible for the code block to be entered, for example if there are join order restrictions that prevent any join of the current level from being formed; but in that case it's just a waste of cycles to attempt to form cartesian joins, since the restrictions will still apply. Furthermore, IMO the existence of this code path can mask bugs elsewhere; we would have noticed the problem with cartesian joins a lot sooner if this code hadn't compensated for it in the simplest case. Accordingly, let's remove it and see what happens. I'm committing this separately from the prerequisite changes in have_relevant_joinclause, just to make the question easier to revisit if there is some fault in my logic.	2012-04-13 16:07:18 -04:00
Tom Lane	e3ffd05b02	Weaken the planner's tests for relevant joinclauses. We should be willing to cross-join two small relations if that allows us to use an inner indexscan on a large relation (that is, the potential indexqual for the large table requires both smaller relations). This worked in simple cases but fell apart as soon as there was a join clause to a fourth relation, because the existence of any two-relation join clause caused the planner to not consider clauseless joins between other base relations. The added regression test shows an example case adapted from a recent complaint from Benoit Delbosc. Adjust have_relevant_joinclause, have_relevant_eclass_joinclause, and has_relevant_eclass_joinclause to consider that a join clause mentioning three or more relations is sufficient grounds for joining any subset of those relations, even if we have to do so via a cartesian join. Since such clauses are relatively uncommon, this shouldn't affect planning speed on typical queries; in fact it should help a bit, because the latter two functions in particular get significantly simpler. Although this is arguably a bug fix, I'm not going to risk back-patching it, since it might have currently-unforeseen consequences.	2012-04-13 16:07:17 -04:00
Tom Lane	e2fa76d80b	Use parameterized paths to generate inner indexscans more flexibly. This patch fixes the planner so that it can generate nestloop-with- inner-indexscan plans even with one or more levels of joining between the indexscan and the nestloop join that is supplying the parameter. The executor was fixed to handle such cases some time ago, but the planner was not ready. This should improve our plans in many situations where join ordering restrictions formerly forced complete table scans. There is probably a fair amount of tuning work yet to be done, because of various heuristics that have been added to limit the number of parameterized paths considered. However, we are not going to find out what needs to be adjusted until the code gets some real-world use, so it's time to get it in there where it can be tested easily. Note API change for index AM amcostestimate functions. I'm not aware of any non-core index AMs, but if there are any, they will need minor adjustments.	2012-01-27 19:26:38 -05:00
Bruce Momjian	e126958c2e	Update copyright notices for year 2012.	2012-01-01 18:01:58 -05:00
Bruce Momjian	6560407c7d	Pgindent run before 9.1 beta2.	2011-06-09 14:32:50 -04:00
Tom Lane	eca75a12a2	Ensure mark_dummy_rel doesn't create dangling pointers in RelOptInfos. When we are doing GEQO join planning, the current memory context is a short-lived context that will be reset at the end of geqo_eval(). However, the RelOptInfos for base relations are set up before that and then re-used across many GEQO cycles. Hence, any code that modifies a baserel during join planning has to be careful not to put pointers to the short-lived context into the baserel struct. mark_dummy_rel got this wrong, leading to easy-to-reproduce-once-you-know-how crashes in 8.4, as reported off-list by Leo Carson of SDSC. Some improvements made in 9.0 make it difficult to demonstrate the crash in 9.0 or HEAD; but there's no doubt that there's still a risk factor here, so patch all branches that have the function. (Note: 8.3 has a similar function, but it's only applied to joinrels and thus is not a hazard.)	2011-04-13 18:56:40 -04:00
Bruce Momjian	bf50caf105	pgindent run before PG 9.1 beta 1.	2011-04-10 11:42:00 -04:00
Bruce Momjian	5d950e3b0c	Stamp copyrights for year 2011.	2011-01-01 13:18:15 -05:00
Tom Lane	f4e4b32743	Support RIGHT and FULL OUTER JOIN in hash joins. This is advantageous first because it allows us to hash the smaller table regardless of the outer-join type, and second because hash join can be more flexible than merge join in dealing with arbitrary join quals in a FULL join. For merge join all the join quals have to be mergejoinable, but hash join will work so long as there's at least one hashjoinable qual --- the others can be any condition. (This is true essentially because we don't keep per-inner-tuple match flags in merge join, while hash join can do so.) To do this, we need a has-it-been-matched flag for each tuple in the hashtable, not just one for the current outer tuple. The key idea that makes this practical is that we can store the match flag in the tuple's infomask, since there are lots of bits there that are of no interest for a MinimalTuple. So we aren't increasing the size of the hashtable at all for the feature. To write this without turning the hash code into even more of a pile of spaghetti than it already was, I rewrote ExecHashJoin in a state-machine style, similar to ExecMergeJoin. Other than that decision, it was pretty straightforward.	2010-12-30 20:26:08 -05:00
Magnus Hagander	9f2e211386	Remove cvs keywords from all files.	2010-09-20 22:08:53 +02:00
Tom Lane	4e97631e6a	Fix join-removal logic for pseudoconstant and outerjoin-delayed quals. In these cases a qual can get marked with the removable rel in its required_relids, but this is just to schedule its evaluation correctly, not because it really depends on the rel. We were assuming that, in effect, we could throw away all quals so marked, which is nonsense. Tighten up the logic to be a little more paranoid about which quals belong to the outer join being considered for removal, and arrange for all quals that don't belong to be updated so they will still get evaluated correctly. Also fix another problem that happened to be exposed by this test case, which was that make_join_rel() was failing to notice some cases where a constant-false qual could be used to prove a join relation empty. If it's a pushed-down constant false, then the relation is empty even if it's an outer join, because the qual applies after the outer join expansion. Per report from Nathan Grange. Back-patch into 9.0.	2010-09-14 23:15:29 +00:00
Bruce Momjian	65e806cba1	pgindent run for 9.0	2010-02-26 02:01:40 +00:00
Bruce Momjian	0239800893	Update copyright for the year 2010.	2010-01-02 16:58:17 +00:00
Tom Lane	1a95f12702	Eliminate a lot of list-management overhead within join_search_one_level by adding a requirement that build_join_rel add new join RelOptInfos to the appropriate list immediately at creation. Per report from Robert Haas, the list_concat_unique_ptr() calls that this change eliminates were taking the lion's share of the runtime in larger join problems. This doesn't do anything to fix the fundamental combinatorial explosion in large join problems, but it should push out the threshold of pain a bit further. Note: because this changes the order in which joinrel lists are built, it might result in changes in selected plans in cases where different alternatives have exactly the same costs. There is one example in the regression tests.	2009-11-28 00:46:19 +00:00
Tom Lane	1ca695db38	Fix another thinko in join_is_legal's handling of semijoins: we have to test for the case that the semijoin was implemented within either input by unique-ifying its RHS before we test to see if it appears to match the current join situation. The previous coding would select semijoin logic in situations where we'd already unique-ified the RHS and joined it to some unrelated relation(s), and then came to join it to the semijoin's LHS. That still gave the right answer as far as the semijoin itself was concerned, but would lead to incorrectly examining only an arbitrary one of the matchable rows from the unrelated relation(s). The cause of this thinko was incorrect unification of the pre-8.4 logic for IN joins and OUTER joins --- the comparable case for outer joins can be handled after making the match test, but that's because there is nothing like the unique-ification escape hatch for outer joins. Per bug #4934 from Benjamin Reed.	2009-07-23 17:42:06 +00:00
Tom Lane	a43b190e3c	Fix a thinko in join_is_legal: when we decide we can implement a semijoin by unique-ifying the RHS and then inner-joining to some other relation, that is not grounds for violating the RHS of some other outer join. Noticed while regression-testing new GEQO code, which will blindly follow any path that join_is_legal says is legal, and then complain later if that leads to a dead end. I'm not certain that this can result in any visible failure in 8.4: the mistake may always be masked by the fact that subsequent attempts to join the rest of the RHS of the other join will fail. But I'm not certain it can't, either, and it's definitely not operating as intended. So back-patch. The added regression test depends on the new no-failures-allowed logic that I'm about to commit in GEQO, so no point back-patching that.	2009-07-19 20:32:48 +00:00
Bruce Momjian	d747140279	8.4 pgindent run, with new combined Linux/FreeBSD/MinGW typedef list provided by Andrew.	2009-06-11 14:49:15 +00:00
Tom Lane	75c85bd199	Tighten up join ordering rules to account for recent more-careful analysis of the associativity of antijoins. Also improve optimizer/README discussion of outer join ordering rules.	2009-02-27 22:41:38 +00:00
Tom Lane	233b8a99ad	Improve comments about semijoin implementation strategy, per a question from Robert Haas.	2009-02-19 20:32:45 +00:00
Bruce Momjian	511db38ace	Update copyright for 2009.	2009-01-01 17:24:05 +00:00
Tom Lane	213256cfa9	My recent fix for semijoin planning didn't actually work for a semijoin with a RHS that can't be unique-ified --- join_is_legal has to check that before deciding to build a join, else we'll have an unimplementable joinrel. Per report from Greg Stark.	2008-11-28 19:29:07 +00:00
Tom Lane	8309d006cb	Switch the planner over to treating qualifications of a JOIN_SEMI join as though it is an inner rather than outer join type. This essentially means that we don't bother to separate "pushed down" qual conditions from actual join quals at a semijoin plan node; which is okay because the restrictions of SQL syntax make it impossible to have a pushed-down qual that references the inner side of a semijoin. This allows noticeably better optimization of IN/EXISTS cases than we had before, since the equivalence-class machinery can now use those quals. Also fix a couple of other mistakes that had essentially disabled the ability to unique-ify the inner relation and then join it to just a subset of the left-hand relations. An example case using the regression database is select * from tenk1 a, tenk1 b where (a.unique1,b.unique2) in (select unique1,unique2 from tenk1 c); which is planned reasonably well by 8.3 and earlier but had been forcing a cartesian join of a/b in CVS HEAD.	2008-11-22 22:47:06 +00:00
Tom Lane	719012e013	Add some defenses against constant-FALSE outer join conditions. Since eval_const_expressions will generally throw away anything that's ANDed with constant FALSE, what we're left with given an example like select * from tenk1 a where (unique1,0) in (select unique2,1 from tenk1 b); is a cartesian product computation, which is really not acceptable. This is a regression in CVS HEAD compared to previous releases, which were able to notice the impossible join condition in this case --- though not in some related cases that are also improved by this patch, such as select * from tenk1 a left join tenk1 b on (a.unique1=b.unique2 and 0=1); Fix by skipping evaluation of the appropriate side of the outer join in cases where it's demonstrably unnecessary.	2008-08-17 19:40:11 +00:00
Tom Lane	e006a24ad1	Implement SEMI and ANTI joins in the planner and executor. (Semijoins replace the old JOIN_IN code, but antijoins are new functionality.) Teach the planner to convert appropriate EXISTS and NOT EXISTS subqueries into semi and anti joins respectively. Also, LEFT JOINs with suitable upper-level IS NULL filters are recognized as being anti joins. Unify the InClauseInfo and OuterJoinInfo infrastructure into "SpecialJoinInfo". With that change, it becomes possible to associate a SpecialJoinInfo with every join attempt, which permits some cleanup of join selectivity estimation. That needs to be taken much further than this patch does, but the next step is to change the API for oprjoin selectivity functions, which seems like material for a separate patch. So for the moment the output size estimates for semi and especially anti joins are quite bogus.	2008-08-14 18:48:00 +00:00
Tom Lane	fd791e7b5a	When a relation has been proven empty by constraint exclusion, propagate that knowledge up through any joins it participates in. We were doing that already in some special cases but not in the general case. Also, defend against zero row estimates for the input relations in cost_mergejoin --- this fix may have eliminated the only scenario in which that can happen, but be safe. Per report from Alex Solovey.	2008-03-24 21:53:04 +00:00
Tom Lane	59fc64acee	Fix a conceptual error in my patch of 2007-10-26 that avoided considering clauseless joins of relations that have unexploited join clauses. Rather than looking at every other base relation in the query, the correct thing is to examine the other relations in the "initial_rels" list of the current make_rel_from_joinlist() invocation, because those are what we actually have the ability to join against. This might be a subset of the whole query in cases where join_collapse_limit or from_collapse_limit or full joins have prevented merging the whole query into a single join problem. This is a bit untidy because we have to pass those rels down through a new PlannerInfo field, but it's necessary. Per bug #3865 from Oleg Kharin.	2008-01-11 04:02:18 +00:00
Bruce Momjian	9098ab9e32	Update copyrights in source tree to 2008.	2008-01-01 19:46:01 +00:00
Bruce Momjian	fdf5a5efb7	pgindent run for 8.3.	2007-11-15 21:14:46 +00:00
Tom Lane	cd2a2ce904	Change have_join_order_restriction() so that we do not force a clauseless join if either of the input relations can legally be joined to any other rels using join clauses. This avoids uselessly (and expensively) considering a lot of really stupid join paths when there is a join restriction with a large footprint, that is, lots of relations inside its LHS or RHS. My patch of 15-Feb-2007 had been causing the code to consider joining every combination of rels inside such a group, which is exponentially bad :-(. With this behavior, clauseless bushy joins will be done if necessary, but they'll be put off as long as possible. Per report from Jakub Ouhrabka. Backpatch to 8.2. We might someday want to backpatch to 8.1 as well, but 8.1 does not have the problem for OUTER JOIN nests, only for IN-clauses, so it's not clear anyone's very likely to hit it in practice; and the current patch doesn't apply cleanly to 8.1.	2007-10-26 18:10:50 +00:00

1 2 3

137 commits