in the zic database or zone names found in the date token table. This
preserves the old ability to do AT TIME ZONE 'PST' along with the new
ability to do AT TIME ZONE 'PST8PDT'. Per gripe from Bricklen Anderson.
Also, fix some inconsistencies in usage of TZ_STRLEN_MAX --- the old
code had the potential for one-byte buffer overruns, though given
alignment considerations it's unlikely there was any real risk.
for procedural languages. This replaces the hard-wired table I had
originally proposed as a stopgap solution. For the moment, the initial
contents only include languages shipped with the core distribution.
on a page, as suggested by ITAGAKI Takahiro. Also, change a few places
that were using some other estimates of max-items-per-page to consistently
use MaxOffsetNumber. This is conservatively large --- we could have used
the new MaxHeapTuplesPerPage macro, or a similar one for index tuples ---
but those places are simply declaring a fixed-size buffer and assuming it
will work, rather than actively testing for overrun. It seems safer to
size these buffers in a way that can't overflow even if the page is
corrupt.
so that the latter estimates the number of groups that grouping will
produce. This is needed because it is primarily query_planner that
makes the decision between fast-start and fast-finish plans, and in the
original coding it was unable to make more than a crude rule-of-thumb
choice when the query involved grouping. This revision helps us make
saner choices for queries like SELECT ... GROUP BY ... LIMIT, as in a
recent example from Mark Kirkwood. Also move the responsibility for
canonicalizing sort_pathkeys and group_pathkeys into query_planner;
this information has to be available anyway to support the first change,
and doing it this way lets us get rid of compare_noncanonical_pathkeys
entirely.
the parent table, even if the command that creates them is executed by
someone else (such as a superuser or a member of the owning role).
Per gripe from Michael Fuhr.
use these instead of its previous hack of changing pg_class.reltriggers.
Documentation is lacking, will add that later.
Patch by Satoshi Nagayasu, review and some extra work by Tom Lane.
to 'Size' (that is, size_t), and install overflow detection checks in it.
This allows us to remove the former arbitrary restrictions on NBuffers
etc. It won't make any difference in a 32-bit machine, but in a 64-bit
machine you could theoretically have terabytes of shared buffers.
(How efficiently we could manage 'em remains to be seen.) Similarly,
num_temp_buffers, work_mem, and maintenance_work_mem can be set above
2Gb on a 64-bit machine. Original patch from Koichi Suzuki, additional
work by moi.
insufficient paranoia in code that follows t_ctid links. (We must do both
because even with VACUUM doing it properly, the intermediate state with
a dangling t_ctid link is visible concurrently during lazy VACUUM, and
could be seen afterwards if either type of VACUUM crashes partway through.)
Also try to improve documentation about what's going on. Patch is a bit
bulky because passing the XMAX information around required changing the
APIs of some low-level heapam.c routines, but it's not conceptually very
complicated. Per trouble report from Teodor and subsequent analysis.
This needs to be back-patched, but I'll do that after 8.1 beta is out.
or OFFSET clauses by using estimate_expression_value(). The main advantage
of this is that if the expression is a Param and we have a value for the
Param, we'll use that value rather than defaulting. Also, fix some
thinkos in the logic for combining LIMIT/OFFSET with an externally
supplied tuple fraction (this covers cases like EXISTS(...LIMIT...)).
And make sure the results of all this are shown by EXPLAIN. Per a
gripe from Merlin Moncure.
(the stats system has always collected this info, but the views were
filtering it out). Modify autovacuum so that over-threshold activity
in a toast table can trigger a VACUUM of the parent table, even if the
parent didn't appear to need vacuuming itself. Per discussion a month
or so back about "short, wide tables".
SearchCatCacheList and ReleaseCatCacheList. Previously, we incremented
and decremented the refcounts of list member tuples along with the list
itself, but that's unnecessary, and very expensive when the list is big.
It's cheaper to change only the list refcount. When we are considering
deleting a cache entry, we have to check not only its own refcount but
its parent list's ... but it's easy to arrange the code so that this
check is not made in any commonly-used paths, so the cost is really nil.
The bigger gain though is to refrain from DLMoveToFront'ing each individual
member tuple each time the list is referenced. To keep some semblance
of fair space management, lists are just marked as used or not since the
last cache cleanout search, and we do a MoveToFront pass only when about
to run a cleanout. In combination, these changes reduce the costs of
SearchCatCacheList and ReleaseCatCacheList from about 4.5% of pgbench
runtime to under 1%, according to my gprof results.
remember the output parameter set for himself. It's a bit of a kluge
but fixing array_in to work in bootstrap mode looks worse.
I removed the separate pg_file_length() function, as it no longer has any
real notational advantage --- you can write (pg_stat_file(...)).length.
should surely be timestamptz not timestamp; fix some but not all of the
holes in check_and_make_absolute(); other minor cleanup. Also put in
the missed catversion bump.
computation. On modern machines this is as fast if not faster, and we
don't have to clog the CPU's L2 cache with a tens-of-KB pointer array.
If we ever decide to adopt a more dynamic allocation method for shared
buffers, we'll probably have to revert this patch, but in the meantime
we might as well save a few bytes and nanoseconds. Per Qingqing Zhou.
whenever we generate a new OID. This prevents occasional duplicate-OID
errors that can otherwise occur once the OID counter has wrapped around.
Duplicate relfilenode values are also checked for when creating new
physical files. Per my recent proposal.
delay and limit, both as global GUCs and as table-specific entries in
pg_autovacuum. stats_reset_on_server_start is now OFF by default,
but a reset is forced if we did WAL replay. XID-wrap vacuums do not
ANALYZE, but do FREEZE if it's a template database. Alvaro Herrera
exit, instead of trying to take shortcuts. Introduce some additional
shutdown callback routines to eliminate kluges like having ProcKill
be responsible for shutting down the buffer manager. Ensure that the
order of operations during shutdown is predictable and what you would
expect given the module layering.
This was not especially critical before, but it is now that we track
ownership dependencies --- the dependency for the rowtype *must* shift
to the new owner. Spotted by Bernd Helmle.
Also fix a problem introduced by recent change to allow non-superusers
to do ALTER OWNER in some cases: if the table had a toast table, ALTER
OWNER failed *even for superusers*, because the test being applied would
conclude that the new would-be owner had no create rights on pg_toast.
A side-effect of the fix is to disallow changing the ownership of indexes
or toast tables separately from their parent table, which seems a good
idea on the whole.
of special case for Windows port. Put a PG_TRY around most of createdb()
to ensure that we remove copied subdirectories on failure, even if the
failure happens while creating the pg_database row. (I think this explains
Oliver Siegmar's recent report.) Having done that, there's no need for
the fragile assumption that copydir() mustn't ereport(ERROR), so simplify
its API. Eliminate the old code that used system("cp ...") to copy
subdirectories, in favor of using copydir() on all platforms. This not
only should allow much better error reporting, but allows us to fsync
the created files before trusting that the copy has succeeded.
track shared relations in a separate hashtable, so that operations done
from different databases are counted correctly. Add proper support for
anti-XID-wraparound vacuuming, even in databases that are never connected
to and so have no stats entries. Miscellaneous other bug fixes.
Alvaro Herrera, some additional fixes by Tom Lane.
planning logic for bitmap indexscans. Partial indexes create corner
cases in which a scan might be done with no explicit index qual conditions,
and the code wasn't handling those cases nicely. Also be a little
tenser about eliminating redundant clauses in the generated plan.
Per report from Dmitry Karasik.
doesn't automatically inherit the privileges of roles it is a member of;
for such a role, membership in another role can be exploited only by doing
explicit SET ROLE. The default inherit setting is TRUE, so by default
the behavior doesn't change, but creating a user with NOINHERIT gives closer
adherence to our current reading of SQL99. Documentation still lacking,
and I think the information schema needs another look.
existing ones for object privileges. Update the information_schema for
roles --- pg_has_role() makes this a whole lot easier, removing the need
for most of the explicit joins with pg_user. The views should be a tad
faster now, too. Stephen Frost and Tom Lane.
near daylight savings time boudaries. This handles it properly, e.g.
test=> select '2005-04-03 04:00:00'::timestamp at time zone
'America/Los_Angeles';
timezone
------------------------
2005-04-03 07:00:00-04
(1 row)
24 hours. This is very helpful for daylight savings time:
select '2005-05-03 00:00:00 EST'::timestamp with time zone + '24 hours';
?column?
----------------------
2005-05-04 01:00:00-04
select '2005-05-03 00:00:00 EST'::timestamp with time zone + '1 day';
?column?
----------------------
2005-05-04 01:00:00-04
Michael Glaesemann
checked that the pointer is actually word-aligned. Casting a non-aligned
pointer to int32* is technically illegal per the C spec, and some recent
versions of gcc actually generate bad code for the memset() when given
such a pointer. Per report from Andrew Morrow.
requiring superuserness always, allow an owner to reassign ownership
to any role he is a member of, if that role would have the right to
create a similar object. These three requirements essentially state
that the would-be alterer has enough privilege to DROP the existing
object and then re-CREATE it as the new role; so we might as well
let him do it in one step. The ALTER TABLESPACE case is a bit
squirrely, but the whole concept of non-superuser tablespace owners
is pretty dubious anyway. Stephen Frost, code review by Tom Lane.
optional arguments as text input functions, ie, typioparam OID and
atttypmod. Make all the datatypes that use typmod enforce it the same
way in typreceive as they do in typinput. This fixes a problem with
failure to enforce length restrictions during COPY FROM BINARY.
The specification of this function is as follows.
regexp_replace(source text, pattern text, replacement text, [flags
text])
returns text
Replace string that matches to regular expression in source text to
replacement text.
- pattern is regular expression pattern.
- replacement is replace string that can use '\1'-'\9', and '\&'.
'\1'-'\9': back reference to the n'th subexpression.
'\&' : entire matched string.
- flags can use the following values:
g: global (replace all)
i: ignore case
When the flags is not specified, case sensitive, replace the first
instance only.
Atsushi Ogawa
XLOG_DBASE_DROP_OLD WAL records -- these records are no longer created in
current sources. Adjust numbering of XLOG_DBASE_CREATE and XLOG_DBASE_DROP
and bump the catversion. Patch from Gavin Sherry, adjusted by Neil Conway.
have adequate mechanisms for tracking the contents of databases and
tablespaces). This solves the longstanding problem that you can drop a
user who still owns objects and/or has access permissions.
Alvaro Herrera, with some kibitzing from Tom Lane.
chdir into PGDATA and subsequently use relative paths instead of absolute
paths to access all files under PGDATA. This seems to give a small
performance improvement, and it should make the system more robust
against naive DBAs doing things like moving a database directory that
has a live postmaster in it. Per recent discussion.
propagated inside an outer join. In particular, given
LEFT JOIN ON (A = B) WHERE A = constant, we cannot conclude that
B = constant at the top level (B might be null instead), but we
can nonetheless put a restriction B = constant into the quals for
B's relation, since no inner-side rows not meeting that condition
can contribute to the final result. Similarly, given
FULL JOIN USING (J) WHERE J = constant, we can't directly conclude
that either input J variable = constant, but it's OK to push such
quals into each input rel. Per recent gripe from Kim Bisgaard.
Along the way, remove 'valid_everywhere' flag from RestrictInfo,
as on closer analysis it was not being used for anything, and was
defined backwards anyway.
basic regression tests for GiST to the standard regression tests.
I took the opportunity to add an rtree-equivalent gist opclass for
circles; the contrib version only covered boxes and polygons, but
indexing circles is very handy for distance searches.
the difference between checkpoints forced due to WAL segment consumption
and checkpoints forced for other reasons (such as CREATE DATABASE). Avoid
generating 'checkpoints are occurring too frequently' messages when the
checkpoint wasn't caused by WAL segment consumption. Per gripe from
Chris K-L.
current time: provide a GetCurrentTimestamp() function that returns
current time in the form of a TimestampTz, instead of separate time_t
and microseconds fields. This is what all the callers really want
anyway, and it eliminates low-level dependencies on AbsoluteTime,
which is a deprecated datatype that will have to disappear eventually.
role memberships; make superuser/createrole distinction do something
useful; fix some locking and CommandCounterIncrement issues; prevent
creation of loops in the membership graph.
syntactic conflicts, both privilege and role GRANT/REVOKE commands have
to use the same production for scanning the list of tokens that might
eventually turn out to be privileges or role names. So, change the
existing GRANT/REVOKE code to expect a list of strings not pre-reduced
AclMode values. Fix a couple other minor issues while at it, such as
InitializeAcl function name conflicting with a Windows system function.
and pg_auth_members. There are still many loose ends to finish in this
patch (no documentation, no regression tests, no pg_dump support for
instance). But I'm going to commit it now anyway so that Alvaro can
make some progress on shared dependencies. The catalog changes should
be pretty much done.
- full concurrency for insert/update/select/vacuum:
- select and vacuum never locks more than one page simultaneously
- select (gettuple) hasn't any lock across it's calls
- insert never locks more than two page simultaneously:
- during search of leaf to insert it locks only one page
simultaneously
- while walk upward to the root it locked only parent (may be
non-direct parent) and child. One of them X-lock, another may
be S- or X-lock
- 'vacuum full' locks index
- improve gistgetmulti
- simplify XLOG records
Fix bug in index_beginscan_internal: LockRelation may clean
rd_aminfo structure, so move GET_REL_PROCEDURE after LockRelation
with a table that has a small predicted size. Avoids wasting several
hundred K on the timezone hash table, which is likely to have only one
or a few entries, but the entries use up 10Kb apiece ...
with main, avoid using a SQL-defined SQLSTATE for what is most definitely
not a SQL-compatible error condition, fix documentation omissions,
adhere to message style guidelines, don't use two GUC_REPORT variables
when one is sufficient. Nothing done about pg_dump issues.
literally.
Add GUC variables:
"escape_string_warning" - warn about backslashes in non-E strings
"escape_string_syntax" - supports E'' syntax?
"standard_compliant_strings" - treats backslashes literally in ''
Update code to use E'' when escapes are used.
to the existing X-direction tests. An rtree class now includes 4 actual
2-D tests, 4 1-D X-direction tests, and 4 1-D Y-direction tests.
This involved adding four new Y-direction test operators for each of
box and polygon; I followed the PostGIS project's lead as to the names
of these operators.
NON BACKWARDS COMPATIBLE CHANGE: the poly_overleft (&<) and poly_overright
(&>) operators now have semantics comparable to box_overleft and box_overright.
This is necessary to make r-tree indexes work correctly on polygons.
Also, I changed circle_left and circle_right to agree with box_left and
box_right --- formerly they allowed the boundaries to touch. This isn't
actually essential given the lack of any r-tree opclass for circles, but
it seems best to sync all the definitions while we are at it.
polygon operators (<<, &<, >>, &>). Per ideas originally put forward
by andrew@supernews and later rediscovered by moi. This patch just
fixes the existing opclasses, and does not add any new behavior as I
proposed earlier; that can be sorted out later. In principle this
could be back-patched, since it changes only search behavior and not
system catalog entries nor rtree index contents. I'm not currently
planning to do that, though, since I think it could use more testing.
in the database. The old behavior (reindex system catalogs only) is now
available as REINDEX SYSTEM. I did not add the complementary REINDEX USER
case since there did not seem to be consensus for this, but it would be
trivial to add later. Per recent discussions.
unlike template0 and template1 does not have any special status in
terms of backend functionality. However, all external utilities such
as createuser and createdb now connect to "postgres" instead of
template1, and the documentation is changed to encourage people to use
"postgres" instead of template1 as a play area. This should fix some
longstanding gotchas involving unexpected propagation of database
objects by createdb (when you used template1 without understanding
the implications), as well as ameliorating the problem that CREATE
DATABASE is unhappy if anyone else is connected to template1.
Patch by Dave Page, minor editing by Tom Lane. All per recent
pghackers discussions.
(a/k/a SELECT INTO). Instead, flush and fsync the whole relation before
committing. We do still need the WAL log when PITR is active, however.
Simon Riggs and Tom Lane.
2. improve vacuum for gist
- use FSM
- full vacuum:
- reforms parent tuple if it's needed
( tuples was deleted on child page or parent tuple remains invalid
after crash recovery )
- truncate index file if possible
3. fixes bugs and mistakes