keeping private state in each backend that has inserted and deleted the same
tuple during its current top-level transaction. This is sufficient since
there is no need to be able to determine the cmin/cmax from any other
transaction. This gets us back down to 23-byte headers, removing a penalty
paid in 8.0 to support subtransactions. Patch by Heikki Linnakangas, with
minor revisions by moi, following a design hashed out awhile back on the
pghackers list.
> Currently, an index split writes all the data on the split page to
> WAL. That's a lot of WAL traffic. The tuples that are copied to the
> right page need to be WAL logged, but the tuples that stay on the
> original page don't.
Heikki Linnakangas
already collected in the current transaction; this allows plpgsql functions to
watch for stats updates even though they are confined to a single transaction.
Use this instead of the previous kluge involving pg_stat_file() to wait for
the stats collector to update in the stats regression test. Internally,
decouple storage of stats snapshots from transaction boundaries; they'll
now stick around until someone calls pgstat_clear_snapshot --- which xact.c
still does at transaction end, to maintain the previous behavior. This makes
the logic a lot cleaner, at the price of a couple dozen cycles per transaction
exit.
that aren't turned into true joins). Since this is the last missing bit of
infrastructure, go ahead and fill out the hash integer_ops and float_ops
opfamilies with cross-type operators. The operator family project is now
DONE ... er, except for documentation ...
describe the maximum size of index tuples (which is typically AM-dependent
anyway); and consequently remove the bogus deduction for "special space"
that was built into it.
Adjust TOAST_TUPLE_THRESHOLD and TOAST_MAX_CHUNK_SIZE to avoid wasting two
bytes per toast chunk, and to ensure that the calculation correctly tracks any
future changes in page header size. The computation had been inaccurate in a
way that didn't cause any harm except space wastage, but future changes could
have broken it more drastically.
Fix the calculation of BTMaxItemSize, which was formerly computed as 1 byte
more than it could safely be. This didn't cause any harm in practice because
it's only compared against maxalign'd lengths, but future changes in the size
of page headers or btree special space could have exposed the problem.
initdb forced because of change in TOAST_MAX_CHUNK_SIZE, which alters the
storage of toast tables.
threshold for tuple length. On 4-byte-MAXALIGN machines, the toast code
creates tuples that have t_len exactly TOAST_TUPLE_THRESHOLD ... but this
number is not itself maxaligned, so if heap_insert maxaligns t_len before
comparing to TOAST_TUPLE_THRESHOLD, it'll uselessly recurse back to
tuptoaster.c, wasting cycles. (It turns out that this does not happen on
8-byte-MAXALIGN machines, because for them the outer MAXALIGN in the
TOAST_MAX_CHUNK_SIZE macro reduces TOAST_MAX_CHUNK_SIZE so that toast tuples
will be less than TOAST_TUPLE_THRESHOLD in size. That MAXALIGN is really
incorrect, but we can't remove it now, see below.) There isn't any particular
value in maxaligning before comparing to the thresholds, so just don't do
that, which saves a small number of cycles in itself.
These numbers should be rejiggered to minimize wasted space on toast-relation
pages, but we can't do that in the back branches because changing
TOAST_MAX_CHUNK_SIZE would force an initdb (by changing the contents of toast
tables). We can move the toast decision thresholds a bit, though, which is
what this patch effectively does.
Thanks to Pavan Deolasee for discovering the unintended recursion.
Back-patch into 8.2, but not further, pending more testing. (HEAD is about
to get a further patch modifying the thresholds, so it won't help much
for testing this form of the patch.)
observe the xmloption.
Reorganize the representation of the XML option in the parse tree and the
API to make it easier to manage and understand.
Add regression tests for parsing back XML expressions.
made query plan. Use of ALTER COLUMN TYPE creates a hazard for cached
query plans: they could contain Vars that claim a column has a different
type than it now has. Fix this by checking during plan startup that Vars
at relation scan level match the current relation tuple descriptor. Since
at that point we already have at least AccessShareLock, we can be sure the
column type will not change underneath us later in the query. However,
since a backend's locks do not conflict against itself, there is still a
hole for an attacker to exploit: he could try to execute ALTER COLUMN TYPE
while a query is in progress in the current backend. Seal that hole by
rejecting ALTER TABLE whenever the target relation is already open in
the current backend.
This is a significant security hole: not only can one trivially crash the
backend, but with appropriate misuse of pass-by-reference datatypes it is
possible to read out arbitrary locations in the server process's memory,
which could allow retrieving database content the user should not be able
to see. Our thanks to Jeff Trout for the initial report.
Security: CVE-2007-0556
Standard English uses "may", "can", and "might" in different ways:
may - permission, "You may borrow my rake."
can - ability, "I can lift that log."
might - possibility, "It might rain today."
Unfortunately, in conversational English, their use is often mixed, as
in, "You may use this variable to do X", when in fact, "can" is a better
choice. Similarly, "It may crash" is better stated, "It might crash".
nonportable "hh" sprintf(3) length modifier. Instead, do the parsing
and output by hand. The code to do this isn't ideal, but this is
an interim measure anyway: the uuid type should probably use the
in-memory struct layout specified by RFC 4122. For now, this patch
should hopefully rectify the buildfarm failures for the uuid test.
Along the way, re-add pg_cast entries for uuid <-> varchar, which
I mistakenly removed earlier, and bump the catversion.
In this case extractQuery should returns -1 as nentries. This changes
prototype of extractQuery method to use int32* instead of uint32* for
nentries argument.
Based on that gincostestimate may see two corner cases: nothing will be found
or seqscan should be used.
Per proposal at http://archives.postgresql.org/pgsql-hackers/2007-01/msg01581.php
PS tsearch_core patch should be sightly modified to support changes, but I'm
waiting a verdict about reviewing of tsearch_core patch.
Hashing for aggregation purposes still needs work, so it's not time to
mark any cross-type operators as hashable for general use, but these cases
work if the operators are so marked by hand in the system catalogs.
match because they contain a null join key (and the join operator is
known strict). Improves performance significantly when the inner
relation contains a lot of nulls, as per bug #2930.
that defined in RFC 4122. This patch includes the basic implementation,
plus regression tests. Documentation and perhaps some additional
functionality will come later. Catversion bumped.
Patch from Gevik Babakhani; review from Peter, Tom, and myself.
input in the stats collector. Our select() emulation is apparently buggy
for UDP sockets :-(. This should resolve problems with stats collection
(and hence autovacuum) failing under more than minimal load. Diagnosis
and patch by Magnus Hagander.
Patch probably needs to be back-ported to 8.1 and 8.0, but first let's
see if it makes the buildfarm happy...
suffix, to distinguish them from doubles. Make some function declarations
and definitions use the "const" qualifier for arguments consistently.
Ignore warning 4102 ("unreferenced label"), because such warnings
are always emitted by bison-generated code. Patch from Magnus Hagander.
- Add new SQL command SET XML OPTION (also available via regular GUC) to
control the DOCUMENT vs. CONTENT option in implicit parsing and
serialization operations.
- Subtle corrections in the handling of the standalone property in
xmlroot().
- Allow xmlroot() to work on content fragments.
- Subtle corrections in the handling of the version property in
xmlconcat().
- Code refactoring for producing XML declarations.
FAMILY; and add FAMILY option to CREATE OPERATOR CLASS to allow adding a
class to a pre-existing family. Per previous discussion. Man, what a
tedious lot of cutting and pasting ...
which I had removed in the first cut of the EquivalenceClass rewrite to
simplify that patch a little. But it's still important --- in a four-way
join problem mergejoinscansel() was eating about 40% of the planning time
according to gprof. Also, improve the EquivalenceClass code to re-use
join RestrictInfos rather than generating fresh ones for each join
considered. This saves some memory space but more importantly improves
the effectiveness of caching planning info in RestrictInfos.
columns procost and prorows, to allow simple user adjustment of the estimated
cost of a function call, as well as control of the estimated number of rows
returned by a set-returning function. We might eventually wish to extend this
to allow function-specific estimation routines, but there seems to be
consensus that we should try a simple constant estimate first. In particular
this provides a relatively simple way to control the order in which different
WHERE clauses are applied in a plan node, which is a Good Thing in view of the
fact that the recent EquivalenceClass planner rewrite made that much less
predictable than before.
provide just a boolean 'amcanorder', instead of fields that specify the
sort operator strategy numbers. We have decided to require ordering-capable
AMs to use btree-compatible strategy numbers, so the old fields are
overkill (and indeed misleading about what's allowed).
representation of equivalence classes of variables. This is an extensive
rewrite, but it brings a number of benefits:
* planner no longer fails in the presence of "incomplete" operator families
that don't offer operators for every possible combination of datatypes.
* avoid generating and then discarding redundant equality clauses.
* remove bogus assumption that derived equalities always use operators
named "=".
* mergejoins can work with a variety of sort orders (e.g., descending) now,
instead of tying each mergejoinable operator to exactly one sort order.
* better recognition of redundant sort columns.
* can make use of equalities appearing underneath an outer join.
currentMarkData from IndexScanDesc to the opaque structs for the
AMs that need this information (currently gist and hash).
Patch from Heikki Linnakangas, fixes by Neil Conway.
the generated files, to help Visual C++ to run these tests. The tests still
pass in VPATH and normal builds.
Patch from Magnus Hagander, editorialized by me.
The implementation is somewhat ugly logic-wise, but I don't see an
easy way to make it more concise.
When writing this, I noticed that my previous implementation of
width_bucket() doesn't handle NaN correctly:
postgres=# select width_bucket('NaN', 1, 5, 5);
width_bucket
--------------
6
(1 row)
AFAICS SQL:2003 does not define a NaN value, so it doesn't address how
width_bucket() should behave here. The patch changes width_bucket() so
that ereport(ERROR) is raised if NaN is specified for the operand or the
lower or upper bounds to width_bucket(). For float8, NaN is disallowed
for any of the floating-point inputs, and +/- infinity is disallowed
for the histogram bounds (but allowed for the operand).
Update docs and regression tests, bump the catversion.
accessing it, like DROP DATABASE. This allows the regression tests to pass
with autovacuum enabled, which open the gates for finally enabling autovacuum
by default.