Commit graph

798 commits

Author SHA1 Message Date
Tom Lane
5d46ee4a4a Do not allow Unique nodes to be scanned backwards. The code claimed that it
would work, but in fact it didn't return the same rows when moving backwards
as when moving forwards.  This would have no visible effect in a DISTINCT
query (at least assuming the column datatypes use a strong definition of
equality), but it gave entirely wrong answers for DISTINCT ON queries.
2008-08-05 21:28:48 +00:00
Tom Lane
23811025a0 Fix mis-calculation of extParam/allParam sets for plan nodes, as seen in
bug #4290.  The fundamental bug is that masking extParam by outer_params,
as finalize_plan had been doing, caused us to lose the information that
an initPlan depended on the output of a sibling initPlan.  On reflection
the best thing to do seemed to be not to try to adjust outer_params for
this case but get rid of it entirely.  The only thing it was really doing
for us was to filter out param IDs associated with SubPlan nodes, and that
can be done (with greater accuracy) while processing individual SubPlan
nodes in finalize_primnode.  This approach was vindicated by the discovery
that the masking method was hiding a second bug: SS_finalize_plan failed to
remove extParam bits for initPlan output params that were referenced in the
main plan tree (it only got rid of those referenced by other initPlans).
It's not clear that this caused any real problems, given the limited use
of extParam by the executor, but it's certainly not what was intended.

I originally thought that there was also a problem with needing to include
indirect dependencies on external params in initPlans' param sets, but it
turns out that the executor handles this correctly so long as the depended-on
initPlan is earlier in the initPlans list than the one using its output.
That seems a bit of a fragile assumption, but it is true at the moment,
so I just documented it in some code comments rather than making what would
be rather invasive changes to remove the assumption.

Back-patch to 8.1.  Previous versions don't have the case of initPlans
referring to other initPlans' outputs, so while the existing logic is still
questionable for them, there are not any known bugs to be fixed.  So I'll
refrain from changing them for now.
2008-07-10 01:17:52 +00:00
Tom Lane
d8cb5391c2 Repair problems occurring when multiple RI updates have to be done to the same
row within one query: we were firing check triggers before all the updates
were done, leading to bogus failures.  Fix by making the triggers queued by
an RI update go at the end of the outer query's trigger event list, thereby
effectively making the processing "breadth-first".  This was indeed how it
worked pre-8.0, so the bug does not occur in the 7.x branches.
Per report from Pavel Stehule.
2007-08-15 19:16:04 +00:00
Neil Conway
1de589bfcb Fix a gradual memory leak in ExecReScanAgg(). Because the aggregation
hash table is allocated in a child context of the agg node's memory
context, MemoryContextReset() will reset but *not* delete the child
context. Since ExecReScanAgg() proceeds to build a new hash table
from scratch (in a new sub-context), this results in leaking the
header for the previous memory context. Therefore, use
MemoryContextResetAndDeleteChildren() instead.

Credit: My colleague Sailesh Krishnamurthy at Truviso for isolating
the cause of the leak.
2007-08-08 18:07:02 +00:00
Tom Lane
38bbcb3888 Fix performance problems in multi-batch hash joins by ensuring that we select
a well-randomized batch number even when given a poorly-randomized hash value.
This is a bit inefficient but seems the only practical solution given the
constraint that we can't change the hash functions in released branches.
Per report from Joseph Shraibman.

Applied to 8.1 and 8.2 only --- HEAD is getting a cleaner fix, and 8.0 and
before use different coding that seems less vulnerable.
2007-06-01 15:58:09 +00:00
Tom Lane
ca27e5ec4c Fix dynahash.c to suppress hash bucket splits while a hash_seq_search() scan
is in progress on the same hashtable.  This seems the least invasive way to
fix the recently-recognized problem that a split could cause the scan to
visit entries twice or (with much lower probability) miss them entirely.
The only field-reported problem caused by this is the "failed to re-find
shared lock object" PANIC in COMMIT PREPARED reported by Michel Dorochevsky,
which was caused by multiply visited entries.  However, it seems certain
that mdsync() is vulnerable to missing required fsync's due to missed
entries, and I am fearful that RelationCacheInitializePhase2() might be at
risk as well.  Because of that and the generalized hazard presented by this
bug, back-patch all the supported branches.

Along the way, fix pg_prepared_statement() and pg_cursor() to not assume
that the hashtables they are examining will stay static between calls.
This is risky regardless of the newly noted dynahash problem, because
hash_seq_search() has never promised to cope with deletion of table entries
other than the just-returned one.  There may be no bug here because the only
supported way to call these functions is via ExecMakeTableFunctionResult()
which will cycle them to completion before doing anything very interesting,
but it seems best to get rid of the assumption.  This affects 8.2 and HEAD
only, since those functions weren't there earlier.
2007-04-26 23:25:09 +00:00
Tom Lane
40d7f86074 Fix check_sql_fn_retval to allow the case where a SQL function declared to
return void ends with a SELECT, if that SELECT has a single result that is
also of type void.  Without this, it's hard to write a void function that
calls another void function.  Per gripe from Peter.

Back-patch as far as 8.0.
2007-04-02 18:49:41 +00:00
Tom Lane
6c15622595 SPI_cursor_open failed to enforce that only read-only queries could be
executed in read_only mode.  This could lead to various relatively-subtle
failures, such as an allegedly stable function returning non-stable results.
Bug goes all the way back to the introduction of read-only mode in 8.0.
Per report from Gaetano Mendola.
2007-03-17 03:15:55 +00:00
Tom Lane
c60125a9be Remove typmod checking from the recent security-related patches. It turns
out that ExecEvalVar and friends don't necessarily have access to a tuple
descriptor with correct typmod: it definitely can contain -1, and possibly
might contain other values that are different from the Var's value.
Arguably this should be cleaned up someday, but it's not a simple change,
and in any case typmod discrepancies don't pose a security hazard.
Per reports from numerous people :-(

I'm not entirely sure whether the failure can occur in 8.0 --- the simple
test cases reported so far don't trigger it there.  But back-patch the
change all the way anyway.
2007-02-06 17:35:34 +00:00
Tom Lane
1f1f5efa82 Repair failure to check that a table is still compatible with a previously
made query plan.  Use of ALTER COLUMN TYPE creates a hazard for cached
query plans: they could contain Vars that claim a column has a different
type than it now has.  Fix this by checking during plan startup that Vars
at relation scan level match the current relation tuple descriptor.  Since
at that point we already have at least AccessShareLock, we can be sure the
column type will not change underneath us later in the query.  However,
since a backend's locks do not conflict against itself, there is still a
hole for an attacker to exploit: he could try to execute ALTER COLUMN TYPE
while a query is in progress in the current backend.  Seal that hole by
rejecting ALTER TABLE whenever the target relation is already open in
the current backend.

This is a significant security hole: not only can one trivially crash the
backend, but with appropriate misuse of pass-by-reference datatypes it is
possible to read out arbitrary locations in the server process's memory,
which could allow retrieving database content the user should not be able
to see.  Our thanks to Jeff Trout for the initial report.

Security: CVE-2007-0556
2007-02-02 00:07:44 +00:00
Tom Lane
088ef257fe Repair insufficiently careful type checking for SQL-language functions:
we should check that the function code returns the claimed result datatype
every time we parse the function for execution.  Formerly, for simple
scalar result types we assumed the creation-time check was sufficient, but
this fails if the function selects from a table that's been redefined since
then, and even more obviously fails if check_function_bodies had been OFF.

This is a significant security hole: not only can one trivially crash the
backend, but with appropriate misuse of pass-by-reference datatypes it is
possible to read out arbitrary locations in the server process's memory,
which could allow retrieving database content the user should not be able
to see.  Our thanks to Jeff Trout for the initial report.

Security: CVE-2007-0555
2007-02-02 00:03:30 +00:00
Tom Lane
212df03ac9 Relax an Assert() that has been found to be too strict in some situations
involving unions of types having typmods.  Variants of the failure are known
to occur in 8.1 and up; not sure if it's possible in 8.0 and 7.4, but since
the code exists that far back, I'll just patch 'em all.  Per report from
Brian Hurt.
2007-01-24 01:25:56 +00:00
Tom Lane
f1d8828e3c Repair bug #2839: the various ExecReScan functions need to reset
ps_TupFromTlist in plan nodes that make use of it.  This was being done
correctly in join nodes and Result nodes but not in any relation-scan nodes.
Bug would lead to bogus results if a set-returning function appeared in the
targetlist of a subquery that could be rescanned after partial execution,
for example a subquery within EXISTS().  Bug has been around forever :-(
... surprising it wasn't reported before.
2006-12-26 19:27:04 +00:00
Tom Lane
b3234f2912 Repair bug #2694 concerning an ARRAY[] construct whose inputs are empty
sub-arrays.  Per discussion, if all inputs are empty arrays then result
must be an empty array too, whereas a mix of empty and nonempty arrays
should (and already did) draw an error.  In the back branches, the
construct was strict: any NULL input immediately yielded a NULL output;
so I left that behavior alone.  HEAD was simply ignoring NULL sub-arrays,
which doesn't seem very sensible.  For lack of a better idea it now
treats NULL sub-arrays the same as empty ones.
2006-11-06 18:21:38 +00:00
Tom Lane
fb27f43123 Fix mishandling of after-trigger state when a SQL function returns multiple
rows --- if the surrounding query queued any trigger events between the rows,
the events would be fired at the wrong time, leading to bizarre behavior.
Per report from Merlin Moncure.

This is a simple patch that should solve the problem fully in the back
branches, but in HEAD we also need to consider the possibility of queries
with RETURNING clauses.  Will look into a fix for that separately.
2006-10-12 17:02:28 +00:00
Bruce Momjian
96cc1341dd Fix SELECT INTO and CREATE TABLE AS to create tables in the default
tablespace, not the base directory.

Kris Jurka
2006-04-26 23:01:58 +00:00
Tom Lane
8bf221b09b Fix bug introduced into mergejoin logic by performance improvement patch of
2005-05-13.  When we find that a new inner tuple can't possibly match any
outer tuple (because it contains a NULL), we can't immediately skip the
tuple when we are in NEXTINNER state.  Doing so can lead to emitting
multiple copies of the tuple in FillInner mode, because we may rescan the
tuple after returning to a previous marked tuple.  Instead, proceed to
NEXTOUTER state the same as we used to do.  After we've found that there's
no need to return to the marked position, we can go to SKIPINNER_ADVANCE
state instead of SKIP_TEST when the inner tuple is unmatchable; this
preserves the performance improvement.  Per bug report from Bruce.
I also made a couple of cosmetic code rearrangements and added a regression
test for the problem.
2006-03-17 19:38:21 +00:00
Tom Lane
98ab0c96dd Add a CHECK_FOR_INTERRUPTS() to the loop in ExecMakeTableFunctionResult.
Otherwise you can't cancel queries like select ... from generate_series(1,1000000).
2006-03-10 01:51:34 +00:00
Tom Lane
a933c4e59e Repair "Halloween problem" in EvalPlanQual: a tuple that's been inserted by
our own command (or more generally, xmin = our xact and cmin >= current
command ID) should not be seen as good.  Else we may try to update rows
we already updated.  This error was inserted last August while fixing the
even bigger problem that the old coding wouldn't see *any* tuples inserted
by our own transaction as good.  Per report from Euler Taveira de Oliveira.
2006-01-12 21:49:07 +00:00
Tom Lane
599e225c11 Fix problem with whole-row Vars referencing sub-select outputs, per
example from Jim Dew.  Add some simple regression tests, since this is
an area we seem to break regularly :-(
2005-12-14 16:28:49 +00:00
Tom Lane
2913b95308 Rearrange code in ExecInitBitmapHeapScan so that we don't initialize the
child plan nodes until we have acquired lock on the relation to scan.
The relative order of initialization of plan nodes isn't real important in
other cases, but it's critical here because one is supposed to lock a
relation before its indexes, not vice versa.  The original coding was at
least vulnerable to deadlock against DROP INDEX, and perhaps worse things.
2005-12-02 01:30:26 +00:00
Tom Lane
c4279b45fd Tweak hash join code to use an additional heuristic for deciding whether
it's worth probing the outer relation for emptiness before building the
hash table.  To wit, if we're rescanning a join previously performed,
remember whether we found it nonempty the previous time, and don't bother
with the probe if it was nonempty.  This buys back the performance lost
in examples like Mario Weilguni's.
2005-11-28 23:46:25 +00:00
Tom Lane
7e3cdc631e Recent changes to allow hash join to exit early given empty input from
one child or the other had a problem: they did not leave the node in a
state that ExecReScanHashJoin would understand.  In particular it would
tend to fail to reset the child plans when needed.  Per report from
Mario Weilguni.
2005-11-28 17:14:47 +00:00
Tom Lane
fd3ac58b68 Get rid of ExecAssignResultTypeFromOuterPlan() and make all plan node types
generate their output tuple descriptors from their target lists (ie, using
ExecAssignResultTypeFromTL()).  We long ago fixed things so that all node
types have minimally valid tlists, so there's no longer any good reason to
have two different ways of doing it.  This change is needed to fix bug
reported by Hayden James: the fix of 2005-11-03 to emit the correct column
names after optimizing away a SubqueryScan node didn't work if the new
top-level plan node used ExecAssignResultTypeFromOuterPlan to generate its
tupdesc, since the next plan node down won't have the correct column labels.
2005-11-23 20:28:05 +00:00
Bruce Momjian
bef7764835 Re-run pgindent, fixing a problem where comment lines after a blank
comment line where output as too long, and update typedefs for /lib
directory.  Also fix case where identifiers were used as variable names
in the backend, but as typedefs in ecpg (favor the backend for
indenting).

Backpatch to 8.1.X.
2005-11-22 18:23:31 +00:00
Tom Lane
c8de36352f Modify tuptoaster's API so that it does not try to modify the passed
tuple in-place, but instead passes back an all-new tuple structure if
any changes are needed.  This is a much cleaner and more robust solution
for the bug discovered by Alexey Beschiokov; accordingly, revert the
quick hack I installed yesterday.
With this change, HeapTupleData.t_datamcxt is no longer needed; will
remove it in a separate commit in HEAD only.
2005-11-20 18:38:42 +00:00
Tom Lane
efd2ae8f19 Stopgap solution for problem reported by Alexey Beschiokov: after
doing heap_insert or heap_update, wipe out any extracted fields in
the TupleTableSlot containing the tuple, because they might not be valid
anymore if tuptoaster.c changed the tuple.  Safe because slot must be
in the materialized state, but mighty ugly --- find a better answer!
2005-11-19 20:58:42 +00:00
Tom Lane
431178ae67 Prevent ExecInsert() and ExecUpdate() from scribbling on the result tuple
slot of the topmost plan node when a trigger returns a modified tuple.
These appear to be the only places where a plan node's caller did not
treat the result slot as read-only, which is an assumption that nodeUnique
makes as of 8.1.  Fixes trigger-vs-DISTINCT bug reported by Frank van Vugt.
2005-11-14 17:43:13 +00:00
Alvaro Herrera
902377c465 Rename the members of CommandDest enum so they don't collide with other uses of
those names.  (Debug and None were pretty bad names anyway.)  I hope I catched
all uses of the names in comments too.
2005-11-03 17:11:40 +00:00
Tom Lane
d9cb48786e Better solution to the problem of labeling whole-row Datums that are
generated from subquery outputs: use the type info stored in the Var
itself.  To avoid making ExecEvalVar and slot_getattr more complex
and slower, I split out the whole-row case into a separate ExecEval routine.
2005-10-19 22:30:30 +00:00
Tom Lane
07908c9c37 Ensure that the Datum generated from a whole-row Var contains valid
type ID information even when it's a record type.  This is needed to
handle whole-row Vars referencing subquery outputs.  Per example from
Richard Huxton.
2005-10-19 18:18:33 +00:00
Tom Lane
23836fb1fb A few trivial code cleanups motivated by reading warnings generated
by a recent HP C compiler.  Mostly, get rid of useless local variables
that are assigned to but never used.
2005-10-18 01:06:24 +00:00
Bruce Momjian
1dc3498251 Standard pgindent run for 8.1. 2005-10-15 02:49:52 +00:00
Tom Lane
cb8b6618ce Revise pgstats stuff to fix the problems with not counting accesses
generated by bitmap index scans.  Along the way, simplify and speed up
the code for counting sequential and index scans; it was both confusing
and inefficient to be taking care of that in the per-tuple loops, IMHO.
initdb forced because of internal changes in pg_stat view definitions.
2005-10-06 02:29:23 +00:00
Tom Lane
1b61ee3c69 _SPI_execute_plan failed to return result tuple table to caller in
the ProcessUtility case, resulting in an intratransaction memory leak
if a utility command actually did return any tuples, as reported by
Dmitry Karasik.  Fix this and also make the behavior more consistent
for cases involving nested SPI operations and multiple query trees,
by ensuring that we store the state locally until it is ready to be
returned to the caller.
2005-10-01 18:43:19 +00:00
Tom Lane
e990b9ce23 The original patch to avoid building a hash join's hashtable when the
outer relation is empty did not work, per test case from Patrick Welche.
It tried to use nodeHashjoin.c's high-level mechanisms for fetching an
outer-relation tuple, but that code expected the hash table to be filled
already.  As patched, the code failed in corner cases such as having no
outer-relation tuples for the first hash batch.  Revert and rewrite.
2005-09-25 19:37:35 +00:00
Tom Lane
d7bb412e9c Remove some dead code. 2005-09-22 15:09:51 +00:00
Tom Lane
46a0eee300 Tweak nodeBitmapAnd to stop evaluating sub-plan scans if it finds it's
got an empty bitmap after any step; the remaining subplans can no longer
affect the result.  Per a suggestion from Ilia Kantor.
2005-08-28 22:47:20 +00:00
Tom Lane
f26b91761b Arrange for indexes and toast tables to inherit their ownership from
the parent table, even if the command that creates them is executed by
someone else (such as a superuser or a member of the owning role).
Per gripe from Michael Fuhr.
2005-08-26 03:08:15 +00:00
Tom Lane
f57e3f4cf3 Repair problems with VACUUM destroying t_ctid chains too soon, and with
insufficient paranoia in code that follows t_ctid links.  (We must do both
because even with VACUUM doing it properly, the intermediate state with
a dangling t_ctid link is visible concurrently during lazy VACUUM, and
could be seen afterwards if either type of VACUUM crashes partway through.)
Also try to improve documentation about what's going on.  Patch is a bit
bulky because passing the XMAX information around required changing the
APIs of some low-level heapam.c routines, but it's not conceptually very
complicated.  Per trouble report from Teodor and subsequent analysis.
This needs to be back-patched, but I'll do that after 8.1 beta is out.
2005-08-20 00:40:32 +00:00
Tom Lane
77b4bd3b43 Update some obsolete comments --- code is using t_self now, not t_ctid. 2005-08-18 21:34:20 +00:00
Tom Lane
2a4fad1a0e Add NOWAIT option to SELECT FOR UPDATE/SHARE.
Original patch by Hans-Juergen Schoenig, revisions by Karel Zak
and Tom Lane.
2005-08-01 20:31:16 +00:00
Tom Lane
7762619e95 Replace pg_shadow and pg_group by new role-capable catalogs pg_authid
and pg_auth_members.  There are still many loose ends to finish in this
patch (no documentation, no regression tests, no pg_dump support for
instance).  But I'm going to commit it now anyway so that Alvaro can
make some progress on shared dependencies.  The catalog changes should
be pretty much done.
2005-06-28 05:09:14 +00:00
Tom Lane
943b396245 Add Oracle-compatible GREATEST and LEAST functions. Pavel Stehule 2005-06-26 22:05:42 +00:00
Tom Lane
b95ae32b41 Avoid WAL-logging individual tuple insertions during CREATE TABLE AS
(a/k/a SELECT INTO).  Instead, flush and fsync the whole relation before
committing.  We do still need the WAL log when PITR is active, however.
Simon Riggs and Tom Lane.
2005-06-20 18:37:02 +00:00
Neil Conway
c119c5bd49 Change the implementation of hash join to attempt to avoid unnecessary
work if either of the join relations are empty. The logic is:

(1) if the inner relation's startup cost is less than the outer
    relation's startup cost and this is not an outer join, read
    a single tuple from the inner relation via ExecHash()
      - if NULL, we're done

(2) read a single tuple from the outer relation
      - if NULL, we're done

(3) build the hash table on the inner relation
      - if hash table is empty and this is not an outer join,
        we're done

(4) otherwise, do hash join as usual

The implementation uses the new MultiExecProcNode API, per a
suggestion from Tom: invoking ExecHash() now produces the first
tuple from the Hash node's child node, whereas MultiExecHash()
builds the hash table.

I had to put in a bit of a kludge to get the row count returned
for EXPLAIN ANALYZE to be correct: since ExecHash() is invoked to
return a tuple, and then MultiExecHash() is invoked, we would
return one too many tuples to EXPLAIN ANALYZE. I hacked around
this by just manually detecting this situation and subtracting 1
from the EXPLAIN ANALYZE row count.
2005-06-15 07:27:44 +00:00
Tom Lane
56b01dc9ff Make SPI set SPI_processed for CREATE TABLE AS / SELECT INTO commands;
this in turn causes CREATE TABLE AS in plpgsql to set ROW_COUNT.
This is how it behaved before 7.4; I had unintentionally changed the
behavior in a bit of sloppy micro-optimization.
2005-06-09 21:25:22 +00:00
Tom Lane
e92a88272e Modify hash_search() API to prevent future occurrences of the error
spotted by Qingqing Zhou.  The HASH_ENTER action now automatically
fails with elog(ERROR) on out-of-memory --- which incidentally lets
us eliminate duplicate error checks in quite a bunch of places.  If
you really need the old return-NULL-on-out-of-memory behavior, you
can ask for HASH_ENTER_NULL.  But there is now an Assert in that path
checking that you aren't hoping to get that behavior in a palloc-based
hash table.
Along the way, remove the old HASH_FIND_SAVE/HASH_REMOVE_SAVED actions,
which were not being used anywhere anymore, and were surely too ugly
and unsafe to want to see revived again.
2005-05-29 04:23:07 +00:00
Tom Lane
e2159f3842 Teach the planner to remove SubqueryScan nodes from the plan if they
aren't doing anything useful (ie, neither selection nor projection).
Also, extend to SubqueryScan the hacks already in place to avoid
unnecessary ExecProject calls when the result would just be the same
tuple the subquery already delivered.  This saves some overhead in
UNION and other set operations, as well as avoiding overhead for
unflatten-able subqueries.  Per example from Sokolov Yura.
2005-05-22 22:30:20 +00:00
Tom Lane
2ef172a2a4 Fix latent bug in ExecSeqRestrPos: it leaves the plan node's result slot
in an inconsistent state.  (This is only latent because in reality
ExecSeqRestrPos is dead code at the moment ... but someday maybe it won't
be.)  Add some comments about what the API for plan node mark/restore
actually is, because it's not immediately obvious.
2005-05-15 21:19:55 +00:00