postgresql

mirror of https://github.com/postgres/postgres.git synced 2026-02-16 09:08:23 -05:00

Author	SHA1	Message	Date
Robert Haas	0ba3f3bc65	Comment improvements for abbreviated keys. Peter Geoghegan and Robert Haas	2015-12-22 13:57:18 -05:00
Tom Lane	ec7eef6b11	Avoid caching expression state trees for domain constraints across queries. In commit `8abb3cda0d` I attempted to cache the expression state trees constructed for domain CHECK constraints for the life of the backend (assuming the domain's constraints don't get redefined). However, this turns out not to work very well, because execQual.c will run those state trees with ecxt_per_query_memory pointing to a query-lifespan context, and in some situations we'll end up with pointers into that context getting stored into the state trees. This happens in particular with SQL-language functions, as reported by Emre Hasegeli, but there are many other cases. To fix, keep only the expression plan trees for domain CHECK constraints in the typcache's data structure, and revert to performing ExecInitExpr (at least) once per query to set up expression state trees in the query's context. Eventually it'd be nice to undo this, but that will require some careful thought about memory management for expression state trees, and it seems far too late for any such redesign in 9.5. This way is still much more efficient than what happened before `8abb3cda0`.	2015-11-29 18:18:42 -05:00
Tom Lane	074c5cfbfb	Fix handling of inherited check constraints in ALTER COLUMN TYPE (again). The previous way of reconstructing check constraints was to do a separate "ALTER TABLE ONLY tab ADD CONSTRAINT" for each table in an inheritance hierarchy. However, that way has no hope of reconstructing the check constraints' own inheritance properties correctly, as pointed out in bug #13779 from Jan Dirk Zijlstra. What we should do instead is to do a regular "ALTER TABLE", allowing recursion, at the topmost table that has a particular constraint, and then suppress the work queue entries for inherited instances of the constraint. Annoyingly, we'd tried to fix this behavior before, in commit `5ed6546cf`, but we failed to notice that it wasn't reconstructing the pg_constraint field values correctly. As long as I'm touching pg_get_constraintdef_worker anyway, tweak it to always schema-qualify the target table name; this seems like useful backup to the protections installed by commit `5f173040`. In HEAD/9.5, get rid of get_constraint_relation_oids, which is now unused. (I could alternatively have modified it to also return conislocal, but that seemed like a pretty single-purpose API, so let's not pretend it has some other use.) It's unused in the back branches as well, but I left it in place just in case some third-party code has decided to use it. In HEAD/9.5, also rename pg_get_constraintdef_string to pg_get_constraintdef_command, as the previous name did nothing to explain what that entry point did differently from others (and its comment was equally useless). Again, that change doesn't seem like material for back-patching. I did a bit of re-pgindenting in tablecmds.c in HEAD/9.5, as well. Otherwise, back-patch to all supported branches.	2015-11-20 14:55:47 -05:00
Robert Haas	80558c1f5a	Generate parallel sequential scan plans in simple cases. Add a new flag, consider_parallel, to each RelOptInfo, indicating whether a plan for that relation could conceivably be run inside of a parallel worker. Right now, we're pretty conservative: for example, it might be possible to defer applying a parallel-restricted qual in a worker, and later do it in the leader, but right now we just don't try to parallelize access to that relation. That's probably the right decision in most cases, anyway. Using the new flag, generate parallel sequential scan plans for plain baserels, meaning that we now have parallel sequential scan in PostgreSQL. The logic here is pretty unsophisticated right now: the costing model probably isn't right in detail, and we can't push joins beneath Gather nodes, so the number of plans that can actually benefit from this is pretty limited right now. Lots more work is needed. Nevertheless, it seems time to enable this functionality so that all this code can actually be tested easily by users and developers. Note that, if you wish to test this functionality, it will be necessary to set max_parallel_degree to a value greater than the default of 0. Once a few more loose ends have been tidied up here, we might want to consider changing the default value of this GUC, but I'm leaving it alone for now. Along the way, fix a bug in cost_gather: the previous coding thought that a Gather node's transfer overhead should be costed on the basis of the relation size rather than the number of tuples that actually need to be passed off to the leader. Patch by me, reviewed in earlier versions by Amit Kapila.	2015-11-11 09:02:52 -05:00
Tom Lane	c5e86ea932	Add "xid <> xid" and "xid <> int4" operators. The corresponding "=" operators have been there a long time, and not having their negators is a bit of a nuisance. Michael Paquier	2015-11-07 16:40:15 -05:00
Robert Haas	a76ef15d9f	Add sort support routine for the UUID data type. This introduces a simple encoding scheme to produce abbreviated keys: pack as many bytes of each UUID as will fit into a Datum. On little-endian machines, a byteswap is also performed; the abbreviated comparator can therefore just consist of a simple 3-way unsigned integer comparison. The purpose of this change is to speed up sorting data on a column of type UUID. Peter Geoghegan	2015-11-06 12:14:35 -05:00
Tom Lane	d894941663	Allow postgres_fdw to ship extension funcs/operators for remote execution. The user can whitelist specified extension(s) in the foreign server's options, whereupon we will treat immutable functions and operators of those extensions as candidates to be sent for remote execution. Whitelisting an extension in this way basically promises that the extension exists on the remote server and behaves compatibly with the local instance. We have no way to prove that formally, so we have to rely on the user to get it right. But this seems like something that people can usually get right in practice. We might in future allow functions and operators to be whitelisted individually, but extension granularity is a very convenient special case, so it got done first. The patch as-committed lacks any regression tests, which is unfortunate, but introducing dependencies on other extensions for testing purposes would break "make installcheck" scenarios, which is worse. I have some ideas about klugy ways around that, but it seems like material for a separate patch. For the moment, leave the problem open. Paul Ramsey, hacked up a bit more by me	2015-11-03 18:42:18 -05:00
Tom Lane	d435542583	Fix incorrect translation of minus-infinity datetimes for json/jsonb. Commit `bda76c1c8c` caused both plus and minus infinity to be rendered as "infinity", which is not only wrong but inconsistent with the pre-9.4 behavior of to_json(). Fix that by duplicating the coding in date_out/timestamp_out/timestamptz_out more closely. Per bug #13687 from Stepan Perlov. Back-patch to 9.4, like the previous commit. In passing, also re-pgindent json.c, since it had gotten a bit messed up by recent patches (and I was already annoyed by indentation-related problems in back-patching this fix ...)	2015-10-20 11:07:04 -07:00
Robert Haas	fd5eaad715	Correct pg_indent to pgindent in various comments. David Christensen	2015-10-08 12:27:54 -04:00
Peter Eisentraut	6390c8c654	Group cluster_name and update_process_title settings together	2015-10-04 12:29:36 -04:00
Robert Haas	d1b7c1ffe7	Parallel executor support. This code provides infrastructure for a parallel leader to start up parallel workers to execute subtrees of the plan tree being executed in the master. User-supplied parameters from ParamListInfo are passed down, but PARAM_EXEC parameters are not. Various other constructs, such as initplans, subplans, and CTEs, are also not currently shared. Nevertheless, there's enough here to support a basic implementation of parallel query, and we can lift some of the current restrictions as needed. Amit Kapila and Robert Haas	2015-09-28 21:55:57 -04:00
Tom Lane	39df0f150c	Allow planner to use expression-index stats for function calls in WHERE. Previously, a function call appearing at the top level of WHERE had a hard-wired selectivity estimate of 0.3333333, a kludge conveniently dated in the source code itself to July 1992. The expectation at the time was that somebody would soon implement estimator support functions analogous to those for operators; but no such code has appeared, nor does it seem likely to in the near future. We do have an alternative solution though, at least for immutable functions on single relations: creating an expression index on the function call will allow ANALYZE to gather stats about the function's selectivity. But the code in clause_selectivity() failed to make use of such data even if it exists. Refactor so that that will happen. I chose to make it try this technique for any clause type for which clause_selectivity() doesn't have a special case, not just functions. To avoid adding unnecessary overhead in the common case where we don't learn anything new, make selfuncs.c provide an API that hooks directly to examine_variable() and then var_eq_const(), rather than the previous coding which laboriously constructed an OpExpr only so that it could be expensively deconstructed again. I preserved the behavior that the default estimate for a function call is 0.3333333. (For any other expression node type, it's 0.5, as before.) I had originally thought to make the default be 0.5 across the board, but changing a default estimate that's survived for twenty-three years seems like something not to do without a lot more testing than I care to put into it right now. Per a complaint from Jehan-Guillaume de Rorthais. Back-patch into 9.5, but not further, at least for the moment.	2015-09-24 18:35:46 -04:00
Noah Misch	7f11724bd6	Remove the SECURITY_ROW_LEVEL_DISABLED security context bit. This commit's parent made superfluous the bit's sole usage. Referential integrity checks have long run as the subject table's owner, and that now implies RLS bypass. Safe use of the bit was tricky, requiring strict control over the SQL expressions evaluating therein. Back-patch to 9.5, where the bit was introduced. Based on a patch by Stephen Frost.	2015-09-20 20:47:17 -04:00
Noah Misch	537bd178c7	Remove the row_security=force GUC value. Every query of a single ENABLE ROW SECURITY table has two meanings, with the row_security GUC selecting between them. With row_security=force available, every function author would have been advised to either set the GUC locally or test both meanings. Non-compliance would have threatened reliability and, for SECURITY DEFINER functions, security. Authors already face an obligation to account for search_path, and we should not mimic that example. With this change, only BYPASSRLS roles need exercise the aforementioned care. Back-patch to 9.5, where the row_security GUC was introduced. Since this narrows the domain of pg_db_role_setting.setconfig and pg_proc.proconfig, one might bump catversion. A row_security=force setting in one of those columns will elicit a clear message, so don't.	2015-09-20 20:45:41 -04:00
Robert Haas	7aea8e4f2d	Determine whether it's safe to attempt a parallel plan for a query. Commit `924bcf4f16` introduced a framework for parallel computation in PostgreSQL that makes most but not all built-in functions safe to execute in parallel mode. In order to have parallel query, we'll need to be able to determine whether that query contains functions (either built-in or user-defined) that cannot be safely executed in parallel mode. This requires those functions to be labeled, so this patch introduces an infrastructure for that. Some functions currently labeled as safe may need to be revised depending on how pending issues related to heavyweight locking under paralllelism are resolved. Parallel plans can't be used except for the case where the query will run to completion. If portal execution were suspended, the parallel mode restrictions would need to remain in effect during that time, but that might make other queries fail. Therefore, this patch introduces a framework that enables consideration of parallel plans only when it is known that the plan will be run to completion. This probably needs some refinement; for example, at bind time, we do not know whether a query run via the extended protocol will be execution to completion or run with a limited fetch count. Having the client indicate its intentions at bind time would constitute a wire protocol break. Some contexts in which parallel mode would be safe are not adjusted by this patch; the default is not to try parallel plans except from call sites that have been updated to say that such plans are OK. This commit doesn't introduce any parallel paths or plans; it just provides a way to determine whether they could potentially be used. I'm committing it on the theory that the remaining parallel sequential scan patches will also get committed to this release, hopefully in the not-too-distant future. Robert Haas and Amit Kapila. Reviewed (in earlier versions) by Noah Misch.	2015-09-16 15:38:47 -04:00
Alvaro Herrera	1aba62ec63	Allow per-tablespace effective_io_concurrency Per discussion, nowadays it is possible to have tablespaces that have wildly different I/O characteristics from others. Setting different effective_io_concurrency parameters for those has been measured to improve performance. Author: Julien Rouhaud Reviewed by: Andres Freund	2015-09-08 12:51:42 -03:00
Tom Lane	c5454f99c4	Fix subtransaction cleanup after an outer-subtransaction portal fails. Formerly, we treated only portals created in the current subtransaction as having failed during subtransaction abort. However, if the error occurred while running a portal created in an outer subtransaction (ie, a cursor declared before the last savepoint), that has to be considered broken too. To allow reliable detection of which ones those are, add a bookkeeping field to struct Portal that tracks the innermost subtransaction in which each portal has actually been executed. (Without this, we'd end up failing portals containing functions that had called the subtransaction, thereby breaking plpgsql exception blocks completely.) In addition, when we fail an outer-subtransaction Portal, transfer its resources into the subtransaction's resource owner, so that they're released early in cleanup of the subxact. This fixes a problem reported by Jim Nasby in which a function executed in an outer-subtransaction cursor could cause an Assert failure or crash by referencing a relation created within the inner subtransaction. The proximate cause of the Assert failure is that AtEOSubXact_RelationCache assumed it could blow away a relcache entry without first checking that the entry had zero refcount. That was a bad idea on its own terms, so add such a check there, and to the similar coding in AtEOXact_RelationCache. This provides an independent safety measure in case there are still ways to provoke the situation despite the Portal-level changes. This has been broken since subtransactions were invented, so back-patch to all supported branches. Tom Lane and Michael Paquier	2015-09-04 13:37:14 -04:00
Tom Lane	7b5ef8f2d0	Limit the verbosity of memory context statistics dumps. We had a report from Stefan Kaltenbrunner of a case in which postmaster log files overran available disk space because multiple backends spewed enormous context stats dumps upon hitting an out-of-memory condition. Given the lack of similar reports, this isn't a common problem, but it still seems worth doing something about. However, we don't want to just blindly truncate the output, because that might prevent diagnosis of OOM problems. What seems like a workable compromise is to limit the dump to 100 child contexts per parent, and summarize the space used within any additional child contexts. That should help because practical cases where the dump gets long will typically be huge numbers of siblings under the same parent context; while the additional debugging value from seeing details about individual siblings beyond 100 will not be large, we hope. Anyway it doesn't take much code or memory space to do this, so let's try it like this and see how things go. Since the summarization mechanism requires passing totals back up anyway, I took the opportunity to add a "grand total" line to the end of the printout.	2015-08-25 13:09:48 -04:00
Andres Freund	de6fd1c898	Rely on inline functions even if that causes warnings in older compilers. So far we have worked around the fact that some very old compilers do not support 'inline' functions by only using inline functions conditionally (or not at all). Since such compilers are very rare by now, we have decided to rely on inline functions from 9.6 onwards. To avoid breaking these old compilers inline is defined away when not supported. That'll cause "function x defined but not used" type of warnings, but since nobody develops on such compilers anymore that's ok. This change in policy will allow us to more easily employ inline functions. I chose to remove code previously conditional on PG_USE_INLINE as it seemed confusing to have code dependent on a define that's always defined. Blacklisting of compilers, like in `c53f73879f`, now has to be done differently. A platform template can define PG_FORCE_DISABLE_INLINE to force inline to be defined empty. Discussion: 20150701161447.GB30708@awork2.anarazel.de	2015-08-05 18:19:52 +02:00
Tom Lane	8ea3e7a75c	Fix bogus "out of memory" reports in tuplestore.c. The tuplesort/tuplestore memory management logic assumed that the chunk allocation overhead for its memtuples array could not increase when increasing the array size. This is and always was true for tuplesort, but we (I, I think) blindly copied that logic into tuplestore.c without noticing that the assumption failed to hold for the much smaller array elements used by tuplestore. Given rather small work_mem, this could result in an improper complaint about "unexpected out-of-memory situation", as reported by Brent DeSpain in bug #13530. The easiest way to fix this is just to increase tuplestore's initial array size so that the assumption holds. Rather than relying on magic constants, though, let's export a #define from aset.c that represents the safe allocation threshold, and make tuplestore's calculation depend on that. Do the same in tuplesort.c to keep the logic looking parallel, even though tuplesort.c isn't actually at risk at present. This will keep us from breaking it if we ever muck with the allocation parameters in aset.c. Back-patch to all supported versions. The error message doesn't occur pre-9.3, not so much because the problem can't happen as because the pre-9.3 tuplestore code neglected to check for it. (The chance of trouble is a great deal larger as of 9.3, though, due to changes in the array-size-increasing strategy.) However, allowing LACKMEM() to become true unexpectedly could still result in less-than-desirable behavior, so let's patch it all the way back.	2015-08-04 18:18:46 -04:00
Joe Conway	7b4bfc87d5	Plug RLS related information leak in pg_stats view. The pg_stats view is supposed to be restricted to only show rows about tables the user can read. However, it sometimes can leak information which could not otherwise be seen when row level security is enabled. Fix that by not showing pg_stats rows to users that would be subject to RLS on the table the row is related to. This is done by creating/using the newly introduced SQL visible function, row_security_active(). Along the way, clean up three call sites of check_enable_rls(). The second argument of that function should only be specified as other than InvalidOid when we are checking as a different user than the current one, as in when querying through a view. These sites were passing GetUserId() instead of InvalidOid, which can cause the function to return incorrect results if the current user has the BYPASSRLS privilege and row_security has been set to OFF. Additionally fix a bug causing RI Trigger error messages to unintentionally leak information when RLS is enabled, and other minor cleanup and improvements. Also add WITH (security_barrier) to the definition of pg_stats. Bumped CATVERSION due to new SQL functions and pg_stats view definition. Back-patch to 9.5 where RLS was introduced. Reported by Yaroslav. Patch by Joe Conway and Dean Rasheed with review and input by Michael Paquier and Stephen Frost.	2015-07-28 13:21:22 -07:00
Tom Lane	dd7a8f66ed	Redesign tablesample method API, and do extensive code review. The original implementation of TABLESAMPLE modeled the tablesample method API on index access methods, which wasn't a good choice because, without specialized DDL commands, there's no way to build an extension that can implement a TSM. (Raw inserts into system catalogs are not an acceptable thing to do, because we can't undo them during DROP EXTENSION, nor will pg_upgrade behave sanely.) Instead adopt an API more like procedural language handlers or foreign data wrappers, wherein the only SQL-level support object needed is a single handler function identified by having a special return type. This lets us get rid of the supporting catalog altogether, so that no custom DDL support is needed for the feature. Adjust the API so that it can support non-constant tablesample arguments (the original coding assumed we could evaluate the argument expressions at ExecInitSampleScan time, which is undesirable even if it weren't outright unsafe), and discourage sampling methods from looking at invisible tuples. Make sure that the BERNOULLI and SYSTEM methods are genuinely repeatable within and across queries, as required by the SQL standard, and deal more honestly with methods that can't support that requirement. Make a full code-review pass over the tablesample additions, and fix assorted bugs, omissions, infelicities, and cosmetic issues (such as failure to put the added code stanzas in a consistent ordering). Improve EXPLAIN's output of tablesample plans, too. Back-patch to 9.5 so that we don't have to support the original API in production.	2015-07-25 14:39:00 -04:00
Andrew Dunstan	e02d44b8a7	Support JSON negative array subscripts everywhere Previously, there was an inconsistency across json/jsonb operators that operate on datums containing JSON arrays -- only some operators supported negative array count-from-the-end subscripting. Specifically, only a new-to-9.5 jsonb deletion operator had support (the new "jsonb - integer" operator). This inconsistency seemed likely to be counter-intuitive to users. To fix, allow all places where the user can supply an integer subscript to accept a negative subscript value, including path-orientated operators and functions, as well as other extraction operators. This will need to be called out as an incompatibility in the 9.5 release notes, since it's possible that users are relying on certain established extraction operators changed here yielding NULL in the event of a negative subscript. For the json type, this requires adding a way of cheaply getting the total JSON array element count ahead of time when parsing arrays with a negative subscript involved, necessitating an ad-hoc lex and parse. This is followed by a "conversion" from a negative subscript to its equivalent positive-wise value using the count. From there on, it's as if a positive-wise value was originally provided. Note that there is still a minor inconsistency here across jsonb deletion operators. Unlike the aforementioned new "-" deletion operator that accepts an integer on its right hand side, the new "#-" path orientated deletion variant does not throw an error when it appears like an array subscript (input that could be recognized by as an integer literal) is being used on an object, which is wrong-headed. The reason for not being stricter is that it could be the case that an object pair happens to have a key value that looks like an integer; in general, these two possibilities are impossible to differentiate with rhs path text[] argument elements. However, we still don't allow the "#-" path-orientated deletion operator to perform array-style subscripting. Rather, we just return the original left operand value in the event of a negative subscript (which seems analogous to how the established "jsonb/json #> text[]" path-orientated operator may yield NULL in the event of an invalid subscript). In passing, make SetArrayPath() stricter about not accepting cases where there is trailing non-numeric garbage bytes rather than a clean NUL byte. This means, for example, that strings like "10e10" are now not accepted as an array subscript of 10 by some new-to-9.5 path-orientated jsonb operators (e.g. the new #- operator). Finally, remove dead code for jsonb subscript deletion; arguably, this should have been done in commit `b81c7b409`. Peter Geoghegan and Andrew Dunstan	2015-07-17 21:13:47 -04:00
Fujii Masao	6ba365aa46	Fix obsolete comment regarding NOTICE message level. By default NOTICE message is not sent to server log because the default value of log_min_messages is WARNING since 8.4. Pavel Stehule	2015-07-09 22:52:36 +09:00
Noah Misch	be8b06c364	Revoke support for strxfrm() that write past the specified array length. This formalizes a decision implicit in commit `4ea51cdfe8` and adds clean detection of affected systems. Vendor updates are available for each such known bug. Back-patch to 9.5, where the aforementioned commit first appeared.	2015-07-08 20:44:21 -04:00
Tom Lane	10fb48d66d	Add an optional missing_ok argument to SQL function current_setting(). This allows convenient checking for existence of a GUC from SQL, which is particularly useful when dealing with custom variables. David Christensen, reviewed by Jeevan Chalke	2015-07-02 16:41:07 -04:00
Tom Lane	62d16c7fc5	Improve design and implementation of pg_file_settings view. As first committed, this view reported on the file contents as they were at the last SIGHUP event. That's not as useful as reporting on the current contents, and what's more, it didn't work right on Windows unless the current session had serviced at least one SIGHUP. Therefore, arrange to re-read the files when pg_show_all_settings() is called. This requires only minor refactoring so that we can pass changeVal = false to set_config_option() so that it won't actually apply any changes locally. In addition, add error reporting so that errors that would prevent the configuration files from being loaded, or would prevent individual settings from being applied, are visible directly in the view. This makes the view usable for pre-testing whether edits made in the config files will have the desired effect, before one actually issues a SIGHUP. I also added an "applied" column so that it's easy to identify entries that are superseded by later entries; this was the main use-case for the original design, but it seemed unnecessarily hard to use for that. Also fix a 9.4.1 regression that allowed multiple entries for a PGC_POSTMASTER variable to cause bogus complaints in the postmaster log. (The issue here was that commit `bf007a27ac` unintentionally reverted `3e3f65973a`, which suppressed any duplicate entries within ParseConfigFp. However, since the original coding of the pg_file_settings view depended on such suppression not happening, we couldn't have fixed this issue now without first doing something with pg_file_settings. Now we suppress duplicates by marking them "ignored" within ProcessConfigFileInternal, which doesn't hide them in the view.) Lesser changes include: Drive the view directly off the ConfigVariable list, instead of making a basically-equivalent second copy of the data. There's no longer any need to hang onto the data permanently, anyway. Convert show_all_file_settings() to do its work in one call and return a tuplestore; this avoids risks associated with assuming that the GUC state will hold still over the course of query execution. (I think there were probably latent bugs here, though you might need something like a cursor on the view to expose them.) Arrange to run SIGHUP processing in a short-lived memory context, to forestall process-lifespan memory leaks. (There is one known leak in this code, in ProcessConfigDirectory; it seems minor enough to not be worth back-patching a specific fix for.) Remove mistaken assignment to ConfigFileLineno that caused line counting after an include_dir directive to be completely wrong. Add missed failure check in AlterSystemSetConfigFile(). We don't really expect ParseConfigFp() to fail, but that's not an excuse for not checking.	2015-06-28 18:06:14 -04:00
Heikki Linnakangas	cb2acb1081	Add missing_ok option to the SQL functions for reading files. This makes it possible to use the functions without getting errors, if there is a chance that the file might be removed or renamed concurrently. pg_rewind needs to do just that, although this could be useful for other purposes too. (The changes to pg_rewind to use these functions will come in a separate commit.) The read_binary_file() function isn't very well-suited for extensions.c's purposes anymore, if it ever was. So bite the bullet and make a copy of it in extension.c, tailored for that use case. This seems better than the accidental code reuse, even if it's a some more lines of code. Michael Paquier, with plenty of kibitzing by me.	2015-06-28 21:35:46 +03:00
Kevin Grittner	604e99396d	Add opaque declaration of HTAB to tqual.h. Commit `b89e151054` added the ResolveCminCmaxDuringDecoding declaration to tqual.h, which uses an HTAB parameter, without declaring HTAB. It accidentally fails to fail to build with current sources because a declaration happens to be included, directly or indirectly, in all source files that currently use tqual.h before tqual.h is first included, but we shouldn't count on that. Since an opaque declaration is enough here, just use that, as was done in snapmgr.h. Backpatch to 9.4, where the HTAB reference was added to tqual.h.	2015-06-27 09:55:06 -05:00
Tom Lane	5d1ff6bd55	Fix the logic for putting relations into the relcache init file. Commit `f3b5565dd4` was a couple of bricks shy of a load; specifically, it missed putting pg_trigger_tgrelid_tgname_index into the relcache init file, because that index is not used by any syscache. However, we have historically nailed that index into cache for performance reasons. The upshot was that load_relcache_init_file always decided that the init file was busted and silently ignored it, resulting in a significant hit to backend startup speed. To fix, reinstantiate RelationIdIsInInitFile() as a wrapper around RelationSupportsSysCache(), which can know about additional relations that should be in the init file despite being unknown to syscache.c. Also install some guards against future mistakes of this type: make write_relcache_init_file Assert that all nailed relations get written to the init file, and make load_relcache_init_file emit a WARNING if it takes the "wrong number of nailed relations" exit path. Now that we remove the init files during postmaster startup, that case should never occur in the field, even if we are starting a minor-version update that added or removed rels from the nailed set. So the warning shouldn't ever be seen by end users, but it will show up in the regression tests if somebody breaks this logic. Back-patch to all supported branches, like the previous commit.	2015-06-25 14:39:05 -04:00
Kevin Grittner	870681017a	Fix typo in comment. Backpatch to 9.4 to minimize possible conflicts.	2015-06-10 17:03:56 -05:00
Tom Lane	f3b5565dd4	Use a safer method for determining whether relcache init file is stale. When we invalidate the relcache entry for a system catalog or index, we must also delete the relcache "init file" if the init file contains a copy of that rel's entry. The old way of doing this relied on a specially maintained list of the OIDs of relations present in the init file: we made the list either when reading the file in, or when writing the file out. The problem is that when writing the file out, we included only rels present in our local relcache, which might have already suffered some deletions due to relcache inval events. In such cases we correctly decided not to overwrite the real init file with incomplete data --- but we still used the incomplete initFileRelationIds list for the rest of the current session. This could result in wrong decisions about whether the session's own actions require deletion of the init file, potentially allowing an init file created by some other concurrent session to be left around even though it's been made stale. Since we don't support changing the schema of a system catalog at runtime, the only likely scenario in which this would cause a problem in the field involves a "vacuum full" on a catalog concurrently with other activity, and even then it's far from easy to provoke. Remarkably, this has been broken since 2002 (in commit `7863404417`), but we had never seen a reproducible test case until recently. If it did happen in the field, the symptoms would probably involve unexpected "cache lookup failed" errors to begin with, then "could not open file" failures after the next checkpoint, as all accesses to the affected catalog stopped working. Recovery would require manually removing the stale "pg_internal.init" file. To fix, get rid of the initFileRelationIds list, and instead consult syscache.c's list of relations used in catalog caches to decide whether a relation is included in the init file. This should be a tad more efficient anyway, since we're replacing linear search of a list with ~100 entries with a binary search. It's a bit ugly that the init file contents are now so directly tied to the catalog caches, but in practice that won't make much difference. Back-patch to all supported branches.	2015-06-07 15:32:09 -04:00
Andrew Dunstan	37def42245	Rename jsonb_replace to jsonb_set and allow it to add new values The function is given a fourth parameter, which defaults to true. When this parameter is true, if the last element of the path is missing in the original json, jsonb_set creates it in the result and assigns it the new value. If it is false then the function does nothing unless all elements of the path are present, including the last. Based on some original code from Dmitry Dolgov, heavily modified by me. Catalog version bumped.	2015-05-31 20:34:10 -04:00
Tom Lane	da33a3894e	Revert exporting of internal GUC variable "data_directory". This undoes a poorly-thought-out choice in commit `970a18687f`, namely to export guc.c's internal variable data_directory. The authoritative variable so far as C code is concerned is DataDir; there is no reason for anything except specific bits of GUC code to look at the GUC variable. After yesterday's commits fixing the fsync-on-restart patch, the only remaining misuse of data_directory was in AlterSystemSetConfigFile(), which would be much better off just using a relative path anyhow: it's less code and it doesn't break if the DBA moves the data directory of a running system, which is a case we've taken some pains over in the past. This is mostly cosmetic, so no need for a back-patch (and I'd be hesitant to remove a global variable in stable branches anyway).	2015-05-29 11:57:33 -04:00
Tom Lane	91e79260f6	Remove no-longer-required function declarations. Remove a bunch of "extern Datum foo(PG_FUNCTION_ARGS);" declarations that are no longer needed now that PG_FUNCTION_INFO_V1(foo) provides that. Some of these were evidently missed in commit `e7128e8dbb`, but others were cargo-culted in in code added since then. Possibly that can be blamed in part on the fact that we'd not fixed relevant documentation examples, which I've now done.	2015-05-24 12:20:23 -04:00
Bruce Momjian	807b9e0dff	pgindent run for 9.5	2015-05-23 21:35:49 -04:00
Andrew Dunstan	5302760a50	Unpack jbvBinary objects passed to pushJsonbValue pushJsonbValue was accepting jbvBinary objects passed as WJB_ELEM or WJB_VALUE data. While this succeeded, when those objects were later encountered in attempting to convert the result to Jsonb, errors occurred. With this change we ghuarantee that a JSonbValue constructed from calls to pushJsonbValue does not contain any jbvBinary objects. This cures a problem observed with jsonb_delete. This means callers of pushJsonbValue no longer need to perform this unpacking themselves. A subsequent patch will perform some cleanup in that area. The error was not triggered by any 9.4 code, but this is a publicly visible routine, and so the error could be exercised by third party code, therefore backpatch to 9.4. Bug report from Peter Geoghegan, fix by me.	2015-05-22 10:21:41 -04:00
Heikki Linnakangas	fa60fb63e5	Fix more typos in comments. Patch by CharSyam, plus a few more I spotted with grep.	2015-05-20 19:45:43 +03:00
Tom Lane	4db485e75b	Put back a backwards-compatible version of sampling support functions. Commit `83e176ec18` removed the longstanding support functions for block sampling without any consideration of the impact this would have on third-party FDWs. The new API is not notably more functional for FDWs than the old, so forcing them to change doesn't seem like a good thing. We can provide the old API as a wrapper (more or less) around the new one for a minimal amount of extra code.	2015-05-18 18:34:37 -04:00
Andres Freund	f3d3118532	Support GROUPING SETS, CUBE and ROLLUP. This SQL standard functionality allows to aggregate data by different GROUP BY clauses at once. Each grouping set returns rows with columns grouped by in other sets set to NULL. This could previously be achieved by doing each grouping as a separate query, conjoined by UNION ALLs. Besides being considerably more concise, grouping sets will in many cases be faster, requiring only one scan over the underlying data. The current implementation of grouping sets only supports using sorting for input. Individual sets that share a sort order are computed in one pass. If there are sets that don't share a sort order, additional sort & aggregation steps are performed. These additional passes are sourced by the previous sort step; thus avoiding repeated scans of the source data. The code is structured in a way that adding support for purely using hash aggregation or a mix of hashing and sorting is possible. Sorting was chosen to be supported first, as it is the most generic method of implementation. Instead of, as in an earlier versions of the patch, representing the chain of sort and aggregation steps as full blown planner and executor nodes, all but the first sort are performed inside the aggregation node itself. This avoids the need to do some unusual gymnastics to handle having to return aggregated and non-aggregated tuples from underlying nodes, as well as having to shut down underlying nodes early to limit memory usage. The optimizer still builds Sort/Agg node to describe each phase, but they're not part of the plan tree, but instead additional data for the aggregation node. They're a convenient and preexisting way to describe aggregation and sorting. The first (and possibly only) sort step is still performed as a separate execution step. That retains similarity with existing group by plans, makes rescans fairly simple, avoids very deep plans (leading to slow explains) and easily allows to avoid the sorting step if the underlying data is sorted by other means. A somewhat ugly side of this patch is having to deal with a grammar ambiguity between the new CUBE keyword and the cube extension/functions named cube (and rollup). To avoid breaking existing deployments of the cube extension it has not been renamed, neither has cube been made a reserved keyword. Instead precedence hacking is used to make GROUP BY cube(..) refer to the CUBE grouping sets feature, and not the function cube(). To actually group by a function cube(), unlikely as that might be, the function name has to be quoted. Needs a catversion bump because stored rules may change. Author: Andrew Gierth and Atri Sharma, with contributions from Andres Freund Reviewed-By: Andres Freund, Noah Misch, Tom Lane, Svenne Krap, Tomas Vondra, Erik Rijkers, Marti Raudsepp, Pavel Stehule Discussion: CAOeZVidmVRe2jU6aMk_5qkxnB7dfmPROzM7Ur8JPW5j8Y5X-Lw@mail.gmail.com	2015-05-16 03:46:31 +02:00
Alvaro Herrera	26df7066cc	Move strategy numbers to include/access/stratnum.h For upcoming BRIN opclasses, it's convenient to have strategy numbers defined in a single place. Since there's nothing appropriate, create it. The StrategyNumber typedef now lives there, as well as existing strategy numbers for B-trees (from skey.h) and R-tree-and-friends (from gist.h). skey.h is forced to include stratnum.h because of the StrategyNumber typedef, but gist.h is not; extensions that currently rely on gist.h for rtree strategy numbers might need to add a new A few .c files can stop including skey.h and/or gist.h, which is a nice side benefit. Per discussion: https://www.postgresql.org/message-id/20150514232132.GZ2523@alvh.no-ip.org Authored by Emre Hasegeli and Álvaro. (It's not clear to me why bootscanner.l has any #include lines at all.)	2015-05-15 17:03:16 -03:00
Simon Riggs	f6d208d6e5	TABLESAMPLE, SQL Standard and extensible Add a TABLESAMPLE clause to SELECT statements that allows user to specify random BERNOULLI sampling or block level SYSTEM sampling. Implementation allows for extensible sampling functions to be written, using a standard API. Basic version follows SQLStandard exactly. Usable concrete use cases for the sampling API follow in later commits. Petr Jelinek Reviewed by Michael Paquier and Simon Riggs	2015-05-15 14:37:10 -04:00
Heikki Linnakangas	35fcb1b3d0	Allow GiST distance function to return merely a lower-bound. The distance function can now set *recheck = false, like index quals. The executor will then re-check the ORDER BY expressions, and use a queue to reorder the results on the fly. This makes it possible to do kNN-searches on polygons and circles, which don't store the exact value in the index, but just a bounding box. Alexander Korotkov and me	2015-05-15 14:26:51 +03:00
Simon Riggs	83e176ec18	Separate block sampling functions Refactoring ahead of tablesample patch Requested and reviewed by Michael Paquier Petr Jelinek	2015-05-15 04:02:54 +02:00
Peter Eisentraut	a486e35706	Add pg_settings.pending_restart column with input from David G. Johnston, Robert Haas, Michael Paquier	2015-05-14 20:08:51 -04:00
Tom Lane	1dc5ebc907	Support "expanded" objects, particularly arrays, for better performance. This patch introduces the ability for complex datatypes to have an in-memory representation that is different from their on-disk format. On-disk formats are typically optimized for minimal size, and in any case they can't contain pointers, so they are often not well-suited for computation. Now a datatype can invent an "expanded" in-memory format that is better suited for its operations, and then pass that around among the C functions that operate on the datatype. There are also provisions (rudimentary as yet) to allow an expanded object to be modified in-place under suitable conditions, so that operations like assignment to an element of an array need not involve copying the entire array. The initial application for this feature is arrays, but it is not hard to foresee using it for other container types like JSON, XML and hstore. I have hopes that it will be useful to PostGIS as well. In this initial implementation, a few heuristics have been hard-wired into plpgsql to improve performance for arrays that are stored in plpgsql variables. We would like to generalize those hacks so that other datatypes can obtain similar improvements, but figuring out some appropriate APIs is left as a task for future work. (The heuristics themselves are probably not optimal yet, either, as they sometimes force expansion of arrays that would be better left alone.) Preliminary performance testing shows impressive speed gains for plpgsql functions that do element-by-element access or update of large arrays. There are other cases that get a little slower, as a result of added array format conversions; but we can hope to improve anything that's annoyingly bad. In any case most applications should see a net win. Tom Lane, reviewed by Andres Freund	2015-05-14 12:08:49 -04:00
Andrew Dunstan	c6947010ce	Additional functions and operators for jsonb jsonb_pretty(jsonb) produces nicely indented json output. jsonb \|\| jsonb concatenates two jsonb values. jsonb - text removes a key and its associated value from the json jsonb - int removes the designated array element jsonb - text[] removes a key and associated value or array element at the designated path jsonb_replace(jsonb,text[],jsonb) replaces the array element designated by the path or the value associated with the key designated by the path with the given value. Original work by Dmitry Dolgov, adapted and reworked for PostgreSQL core by Andrew Dunstan, reviewed and tidied up by Petr Jelinek.	2015-05-12 15:52:45 -04:00
Alvaro Herrera	b488c580ae	Allow on-the-fly capture of DDL event details This feature lets user code inspect and take action on DDL events. Whenever a ddl_command_end event trigger is installed, DDL actions executed are saved to a list which can be inspected during execution of a function attached to ddl_command_end. The set-returning function pg_event_trigger_ddl_commands can be used to list actions so captured; it returns data about the type of command executed, as well as the affected object. This is sufficient for many uses of this feature. For the cases where it is not, we also provide a "command" column of a new pseudo-type pg_ddl_command, which is a pointer to a C structure that can be accessed by C code. The struct contains all the info necessary to completely inspect and even reconstruct the executed command. There is no actual deparse code here; that's expected to come later. What we have is enough infrastructure that the deparsing can be done in an external extension. The intention is that we will add some deparsing code in a later release, as an in-core extension. A new test module is included. It's probably insufficient as is, but it should be sufficient as a starting point for a more complete and future-proof approach. Authors: Álvaro Herrera, with some help from Andres Freund, Ian Barwick, Abhijit Menon-Sen. Reviews by Andres Freund, Robert Haas, Amit Kapila, Michael Paquier, Craig Ringer, David Steele. Additional input from Chris Browne, Dimitri Fontaine, Stephen Frost, Petr Jelínek, Tom Lane, Jim Nasby, Steven Singer, Pavel Stěhule. Based on original work by Dimitri Fontaine, though I didn't use his code. Discussion: https://www.postgresql.org/message-id/m2txrsdzxa.fsf@2ndQuadrant.fr https://www.postgresql.org/message-id/20131108153322.GU5809@eldon.alvh.no-ip.org https://www.postgresql.org/message-id/20150215044814.GL3391@alvh.no-ip.org	2015-05-11 19:14:31 -03:00
Andrew Dunstan	cb9fa802b3	Add new OID alias type regnamespace Catalog version bumped Kyotaro HORIGUCHI	2015-05-09 13:36:52 -04:00
Andrew Dunstan	0c90f6769d	Add new OID alias type regrole The new type has the scope of whole the database cluster so it doesn't behave the same as the existing OID alias types which have database scope, concerning object dependency. To avoid confusion constants of the new type are prohibited from appearing where dependencies are made involving it. Also, add a note to the docs about possible MVCC violation and optimization issues, which are general over the all reg* types. Kyotaro Horiguchi	2015-05-09 13:06:49 -04:00

1 2 3 4 5 ...

1799 commits