Commit graph

2258 commits

Alexander Korotkov
6df7a9698b Multirange datatypes
Multiranges are basically sorted arrays of non-overlapping ranges with
set-theoretic operations defined over them.

Since v14, each range type automatically gets a corresponding multirange
datatype.  There are both manual and automatic mechanisms for naming multirange
types.  One can specify a multirange type name using the multirange_type_name
attribute in CREATE TYPE.  Otherwise, a multirange type name is generated
automatically: if the range type name contains "range", we change that to
"multirange"; otherwise, we add "_multirange" to the end.

Implementation of multiranges comes with a space-efficient internal
representation format, which avoids extra padding and duplicated storage of
oids.  Altogether this format allows fetching a particular range by its index
in O(n).

Statistics gathering and selectivity estimation are implemented for
multiranges.  For this purpose, a stored multirange is approximated as its
union range without gaps.  This area will likely need improvement in the
future.

Catversion is bumped.

Discussion: https://postgr.es/m/CALNJ-vSUpQ_Y%3DjXvTxt1VYFztaBSsWVXeF1y6gTYQ4bOiWDLgQ%40mail.gmail.com
Discussion: https://postgr.es/m/a0b8026459d1e6167933be2104a6174e7d40d0ab.camel%40j-davis.com#fe7218c83b08068bfffb0c5293eceda0
Author: Paul Jungwirth, revised by me
Reviewed-by: David Fetter, Corey Huinker, Jeff Davis, Pavel Stehule
Reviewed-by: Alvaro Herrera, Tom Lane, Isaac Morland, David G. Johnston
Reviewed-by: Zhihong Yu, Alexander Korotkov
2020-12-20 07:20:33 +03:00
Tom Lane
b3817f5f77 Improve hash_create()'s API for some added robustness.
Invent a new flag bit HASH_STRINGS to specify C-string hashing, which
was formerly the default; and add assertions insisting that exactly
one of the bits HASH_STRINGS, HASH_BLOBS, and HASH_FUNCTION be set.
This is in hopes of preventing recurrences of the type of oversight
fixed in commit a1b8aa1e4 (i.e., mistakenly omitting HASH_BLOBS).

Also, when HASH_STRINGS is specified, insist that the keysize be
more than 8 bytes.  This is a heuristic, but it should catch
accidental use of HASH_STRINGS for integer or pointer keys.
(Nearly all existing use-cases set the keysize to NAMEDATALEN or
more, so there's little reason to think this restriction should
be problematic.)

Tweak hash_create() to insist that the HASH_ELEM flag be set, and
remove the defaults it had for keysize and entrysize.  Since those
defaults were undocumented and basically useless, no callers
omitted HASH_ELEM anyway.

Also, remove memset's zeroing the HASHCTL parameter struct from
those callers that had one.  This has never been really necessary,
and while it wasn't a bad coding convention it was confusing that
some callers did it and some did not.  We might as well save a few
cycles by standardizing on "not".
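
Under the new rules, a typical call site might look like this (a sketch;
the entry struct is illustrative):

HASHCTL     ctl;            /* no memset(&ctl, 0, sizeof(ctl)) needed */
HTAB       *htab;

ctl.keysize = sizeof(Oid);
ctl.entrysize = sizeof(MyEntry);    /* MyEntry is a hypothetical struct */

/* HASH_ELEM is now required; HASH_BLOBS requests binary-key hashing */
htab = hash_create("My hash table", 128, &ctl,
                   HASH_ELEM | HASH_BLOBS);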

Also improve the documentation for hash_create().

In passing, improve reinit.c's usage of a hash table by storing
the key as a binary Oid rather than a string; and, since that's
a temporary hash table, allocate it in CurrentMemoryContext for
neatness.

Discussion: https://postgr.es/m/590625.1607878171@sss.pgh.pa.us
2020-12-15 11:38:53 -05:00
Tom Lane
c7aba7c14e Support subscripting of arbitrary types, not only arrays.
This patch generalizes the subscripting infrastructure so that any
data type can be subscripted, if it provides a handler function to
define what that means.  Traditional variable-length (varlena) arrays
all use array_subscript_handler(), while the existing fixed-length
types that support subscripting use raw_array_subscript_handler().
It's expected that other types that want to use subscripting notation
will define their own handlers.  (This patch provides no such new
features, though; it only lays the foundation for them.)

To do this, move the parser's semantic processing of subscripts
(including coercion to whatever data type is required) into a
method callback supplied by the handler.  On the execution side,
replace the ExecEvalSubscriptingRef* layer of functions with direct
calls to callback-supplied execution routines.  (Thus, essentially
no new run-time overhead should be caused by this patch.  Indeed,
there is room to remove some overhead by supplying specialized
execution routines.  This patch does a little bit in that line,
but more could be done.)

Additional work is required here and there to remove formerly
hard-wired assumptions about the result type, collation, etc
of a SubscriptingRef expression node; and to remove assumptions
that the subscript values must be integers.

One useful side-effect of this is that we now have a less squishy
mechanism for identifying whether a data type is a "true" array:
instead of wiring in weird rules about typlen, we can look to see
if pg_type.typsubscript == F_ARRAY_SUBSCRIPT_HANDLER.  For this
to be bulletproof, we have to forbid user-defined types from using
that handler directly; but there seems no good reason for them to
do so.
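
For instance, the "true array" test can now be written along these lines
(a sketch using the ordinary syscache API; error handling omitted):

/* Sketch: decide whether typid is a "true" array type */
HeapTuple   tup = SearchSysCache1(TYPEOID, ObjectIdGetDatum(typid));
Form_pg_type typform = (Form_pg_type) GETSTRUCT(tup);
bool        is_true_array =
    (typform->typsubscript == F_ARRAY_SUBSCRIPT_HANDLER);

ReleaseSysCache(tup);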

This patch also removes assumptions that the number of subscripts
is limited to MAXDIM (6), or indeed has any hard-wired limit.
That limit still applies to types handled by array_subscript_handler
or raw_array_subscript_handler, but to discourage other dependencies
on this constant, I've moved it from c.h to utils/array.h.

Dmitry Dolgov, reviewed at various times by Tom Lane, Arthur Zakirov,
Peter Eisentraut, Pavel Stehule

Discussion: https://postgr.es/m/CA+q6zcVDuGBv=M0FqBYX8DPebS3F_0KQ6OVFobGJPM507_SZ_w@mail.gmail.com
Discussion: https://postgr.es/m/CA+q6zcVovR+XY4mfk-7oNk-rF91gH0PebnNfuUjuuDsyHjOcVA@mail.gmail.com
2020-12-09 12:40:37 -05:00
Peter Eisentraut
8b069ef5dc Change get_constraint_index() to use pg_constraint.conindid
It was still using a scan of pg_depend instead of using the conindid
column that has been added since.

Since it is now just a catalog lookup wrapper and not related to
pg_depend, move from pg_depend.c to lsyscache.c.

Reviewed-by: Matthias van de Meent <boekewurm+postgres@gmail.com>
Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Reviewed-by: Michael Paquier <michael@paquier.xyz>
Discussion: https://www.postgresql.org/message-id/flat/4688d55c-9a2e-9a5a-d166-5f24fe0bf8db%40enterprisedb.com
2020-12-09 15:41:45 +01:00
Michael Paquier
4f48a6fbe2 Change SHA2 implementation based on OpenSSL to use EVP digest routines
The use of low-level hash routines is not recommended by upstream
OpenSSL since 2000, and pgcrypto already switched to EVP as of 5ff4a67.
This takes advantage of the refactoring done in 87ae969 that has
introduced the allocation and free routines for cryptographic hashes.

Since 1.1.0, OpenSSL does not publish the contents of the cryptohash
contexts, forcing any consumers to rely on OpenSSL for all allocations.
Hence, the resource owner callback mechanism gains a new set of routines
to track and free cryptohash contexts when using OpenSSL, preventing any
risks of leaks in the backend.  Nothing is needed in the frontend thanks
to the refactoring of 87ae969, and the resowner knowledge is isolated
into cryptohash_openssl.c.
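
For reference, the EVP calling sequence used in place of the low-level SHA2
routines looks like this (standard OpenSSL API; error handling omitted,
data/datalen stand in for the caller's input):

#include <openssl/evp.h>

EVP_MD_CTX *ctx = EVP_MD_CTX_new();     /* context allocated by OpenSSL */
unsigned char digest[EVP_MAX_MD_SIZE];
unsigned int digestlen;

EVP_DigestInit_ex(ctx, EVP_sha256(), NULL);
EVP_DigestUpdate(ctx, data, datalen);
EVP_DigestFinal_ex(ctx, digest, &digestlen);
EVP_MD_CTX_free(ctx);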

Note that this also fixes a failure with SCRAM authentication when using
FIPS in OpenSSL, but as there have been few complaints about this
problem and as this causes an ABI breakage, no backpatch is done.

Author: Michael Paquier
Reviewed-by: Daniel Gustafsson, Heikki Linnakangas
Discussion: https://postgr.es/m/20200924025314.GE7405@paquier.xyz
Discussion: https://postgr.es/m/20180911030250.GA27115@paquier.xyz
2020-12-04 10:49:23 +09:00
Tom Lane
2432b1a040 Avoid spamming the client with multiple ParameterStatus messages.
Up to now, we sent a ParameterStatus message to the client immediately
upon any change in the active value of any GUC_REPORT variable.  This
was only barely okay when the feature was designed; now that we have
things like function SET clauses, there are very plausible use-cases
where a GUC_REPORT variable might change many times within a query
--- and even end up back at its original value, perhaps.  Fortunately
most of our GUC_REPORT variables are unlikely to be changed often;
but there are proposals in play to enlarge that set, or even make it
user-configurable.

Hence, let's fix things to not generate more than one ParameterStatus
message per variable per query, and to not send any message at all
unless the end-of-query value is different from what we last reported.
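
Conceptually, the end-of-query reporting then reduces to something like
this (a sketch of the idea with hypothetical field and function names, not
the patch itself):

/* at end of query, for each GUC_REPORT variable (names hypothetical) */
if (var->changed_this_query &&
    strcmp(var->current_value, var->last_reported_value) != 0)
{
    send_parameter_status(var->name, var->current_value);
    pfree(var->last_reported_value);
    var->last_reported_value = pstrdup(var->current_value);
}
var->changed_this_query = false;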

Discussion: https://postgr.es/m/5708.1601145259@sss.pgh.pa.us
2020-11-25 11:40:44 -05:00
David Rowley
b0727ae99b Tidy up definitions of pg_attribute_hot and pg_attribute_cold
1fa22a43a was a quick fix for a portability problem I introduced in
697e1d02f.  It added a few more cases to the preprocessor logic than
I'd have liked.  Andres Freund and Dagfinn Ilmari Mannsåker suggested a
better way to do this.

In passing, also adjust the only current usage of these macros so that the
macro comes before the function's return type in the declaration of the
function.  This now matches what the definition of the function does.
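
That is, the declaration style changes as follows (illustrative function):

/* before: the macro sat after the return type */
extern void pg_attribute_cold report_failure(const char *msg);

/* after: macro before the return type, matching the definition */
extern pg_attribute_cold void report_failure(const char *msg);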

Discussion: https://postgr.es/m/20200625163553.lt6wocbjhklp5pl4@alap3.anarazel.de
Discussion: https://postgr.es/m/87pn43bmok.fsf@wibble.ilmari.org
2020-11-25 10:52:50 +13:00
Heikki Linnakangas
0a2bc5d61e Move per-agg and per-trans duplicate finding to the planner.
This has the advantage that the cost estimates for aggregates can count
the number of calls to transition and final functions correctly.

Bump catalog version, because views can contain Aggrefs.

Reviewed-by: Andres Freund
Discussion: https://www.postgresql.org/message-id/b2e3536b-1dbc-8303-c97e-89cb0b4a9a48%40iki.fi
2020-11-24 10:45:00 +02:00
Tom Lane
789b938bf2 Centralize logic for skipping useless ereport/elog calls.
While ereport() and elog() themselves are quite cheap when the
error message level is too low to be printed, some places need to do
substantial work before they can call those macros at all.  To allow
optimizing away such setup work when nothing is to be printed, make
elog.c export a new function message_level_is_interesting(elevel)
that reports whether ereport/elog will do anything.  Make use of that
in various places that had ad-hoc direct tests of log_min_messages etc.
Also teach ProcSleep to use it to avoid some work.  (There may well
be other places that could usefully use this; I didn't search hard.)
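
A typical use looks like this (sketch; the helper building the message is
hypothetical):

/* skip expensive setup work when the message would be discarded */
if (message_level_is_interesting(DEBUG2))
{
    char       *details = build_debug_details();    /* hypothetical */

    elog(DEBUG2, "state: %s", details);
    pfree(details);
}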

Within elog.c, refactor a little bit to avoid having duplicate copies
of the policy-setting logic.  When that code was written, we weren't
relying on the availability of inline functions; so it had some
duplications in the name of efficiency, which I got rid of.

Alvaro Herrera and Tom Lane

Discussion: https://postgr.es/m/129515.1606166429@sss.pgh.pa.us
2020-11-23 19:10:46 -05:00
David Rowley
913ec71d68 Improve compiler code layout in elog/ereport ERROR calls
Here we use a bit of preprocessor trickery to coax supporting compilers
into laying out their generated code so that the code that's in the same
branch as elog(ERROR)/ereport(ERROR) calls is moved away from the hot
path.  Effectively, this reduces the size of the hot code meaning that it
can sit on fewer cache lines.

Performance improvements of between 10-15% have been seen on highly CPU
bound workloads using pgbench's TPC-b benchmark.

What's achieved here is very similar to putting the error condition inside
an unlikely() macro.  For example:

if (unlikely(x < 0))
    elog(ERROR, "invalid x value");

Now there's no need to make use of unlikely() here, as the common macro
used by elog and ereport will now see that elevel is >= ERROR and make use
of a pg_attribute_cold marked version of errstart().

When elevel < ERROR or if it cannot be determined to be constant, the
original behavior is maintained.
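
Greatly simplified, the idea in the common macro is this (a sketch, not the
exact macro text):

/* Sketch: constant elevel >= ERROR selects a pg_attribute_cold errstart */
#define EREPORT_SKETCH(elevel, rest) \
    do { \
        if (__builtin_constant_p(elevel) && (elevel) >= ERROR ? \
            errstart_cold(elevel, TEXTDOMAIN) : \
            errstart(elevel, TEXTDOMAIN)) \
            rest; \
    } while (0)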

Author: David Rowley
Reviewed-by: Andres Freund, Peter Eisentraut
Discussion: https://postgr.es/m/CAApHDvrVpasrEzLL2er7p9iwZFZ%3DJj6WisePcFeunwfrV0js_A%40mail.gmail.com
2020-11-24 12:04:42 +13:00
Tom Lane
8597a48d01 Fix FPeq() and friends to get the right answers for infinities.
"FPeq(infinity, infinity)" returned false, on account of getting NaN
when it subtracts the two inputs.  Fix that by adding a separate
check for exact equality.

FPle() and FPge() similarly got the wrong answer for two like-signed
infinities.  In those cases, we can just rearrange the comparisons
to avoid potentially subtracting infinities.

While the sibling functions FPne() etc accidentally gave the right
answers even with the internal NaN results, it seems best to make
similar adjustments to them to avoid depending on this.

FPeq() has to be converted to an inline function to avoid double
evaluations of its arguments, and I did the same for the others
just for consistency.
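
The fixed comparison is essentially:

/* exact equality handles like-signed infinities; EPSILON handles the rest */
static inline bool
FPeq(double A, double B)
{
    return A == B || fabs(A - B) <= EPSILON;
}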

In passing, make the handling of NaN cases in line_eq() and
point_eq_point() simpler and easier to reason about, and perhaps
faster.

This results in just one visible regression test change: slope()
now gives DBL_MAX for two inputs of (inf,1e300), which is consistent
with what it does for (1e300,inf), so that seems like a bug fix.

Discussion: https://postgr.es/m/CAGf+fX70rWFOk5cd00uMfa__0yP+vtQg5ck7c2Onb-Yczp0URA@mail.gmail.com
2020-11-21 16:46:43 -05:00
Peter Eisentraut
a378ba49a5 Add pg_nodiscard decorations to some functions
Especially for the list API such as lappend() forgetting to assign the
return value is a common problem.
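
For example (a sketch of a decorated declaration and the mistake it
catches):

/* the declaration gains the decoration ... */
extern pg_nodiscard List *lappend(List *list, void *datum);

/* ... so forgetting the assignment now draws a compiler warning: */
lappend(mylist, item);          /* warning: result ignored */
mylist = lappend(mylist, item); /* correct usage */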

Reviewed-by: Michael Paquier <michael@paquier.xyz>
Discussion: https://www.postgresql.org/message-id/flat/e3753562-99cd-b65f-5aca-687dfd1ec2fc@2ndquadrant.com
2020-11-11 11:00:27 +01:00
Tom Lane
ec29427ce2 Fix and simplify some usages of TimestampDifference().
Introduce TimestampDifferenceMilliseconds() to simplify callers
that would rather have the difference in milliseconds, instead of
the select()-oriented seconds-and-microseconds format.  This gets
rid of at least one integer division per call, and it eliminates
some apparently-easy-to-mess-up arithmetic.
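
At a typical call site the change looks like this (sketch):

long        secs;
int         usecs;
long        delay_ms;

/* before: select()-oriented output plus manual arithmetic */
TimestampDifference(start_time, stop_time, &secs, &usecs);
delay_ms = secs * 1000 + usecs / 1000;

/* after: one call, no arithmetic to get wrong */
delay_ms = TimestampDifferenceMilliseconds(start_time, stop_time);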

Two of these call sites were in fact wrong:

* pg_prewarm's autoprewarm_main() forgot to multiply the seconds
by 1000, thus ending up with a delay 1000X shorter than intended.
That doesn't quite make it a busy-wait, but close.

* postgres_fdw's pgfdw_get_cleanup_result() thought it needed to compute
microseconds not milliseconds, thus ending up with a delay 1000X longer
than intended.  Somebody along the way had noticed this problem but
misdiagnosed the cause, and imposed an ad-hoc 60-second limit rather
than fixing the units.  This was relatively harmless in context, because
we don't care that much about exactly how long this delay is; still,
it's wrong.

There are a few more callers of TimestampDifference() that don't
have a direct need for seconds-and-microseconds, but can't use
TimestampDifferenceMilliseconds() either because they do need
microsecond precision or because they might possibly deal with
intervals long enough to overflow 32-bit milliseconds.  It might be
worth inventing another API to improve that, but that seems outside
the scope of this patch; so those callers are untouched here.

Given the fact that we are fixing some bugs, and the likelihood
that future patches might want to back-patch code that uses this
new API, back-patch to all supported branches.

Alexey Kondratov and Tom Lane

Discussion: https://postgr.es/m/3b1c053a21c07c1ed5e00be3b2b855ef@postgrespro.ru
2020-11-10 22:51:54 -05:00
Tom Lane
fac83dbd6f Remove underflow error in float division with infinite divisor.
float4_div and float8_div correctly produced zero for zero divided
by infinity, but threw an underflow error for nonzero finite values
divided by infinity.  This seems wrong; at the very least it's
inconsistent with the behavior recently implemented for numeric
infinities.  Remove the error and allow zero to be returned.

This patch also removes a useless isinf() test from the overflow
checks in these functions (non-Inf divided by Inf can't produce Inf).

Extracted from a larger patch; this seems significant outside the
context of geometric operators, so it deserves its own commit.

Kyotaro Horiguchi

Discussion: https://postgr.es/m/CAGf+fX70rWFOk5cd00uMfa__0yP+vtQg5ck7c2Onb-Yczp0URA@mail.gmail.com
2020-11-04 18:11:15 -05:00
Thomas Munro
257836a755 Track collation versions for indexes.
Record the current version of dependent collations in pg_depend when
creating or rebuilding an index.  When accessing the index later, warn
that the index may be corrupted if the current version doesn't match.

Thanks to Douglas Doole, Peter Eisentraut, Christoph Berg, Laurenz Albe,
Michael Paquier, Robert Haas, Tom Lane and others for very helpful
discussion.

Author: Thomas Munro <thomas.munro@gmail.com>
Author: Julien Rouhaud <rjuju123@gmail.com>
Reviewed-by: Peter Eisentraut <peter.eisentraut@2ndquadrant.com> (earlier versions)
Discussion: https://postgr.es/m/CAEepm%3D0uEQCpfq_%2BLYFBdArCe4Ot98t1aR4eYiYTe%3DyavQygiQ%40mail.gmail.com
2020-11-03 01:19:50 +13:00
Tom Lane
3db322eaab Prevent internal overflows in date-vs-timestamp and related comparisons.
The date-vs-timestamp, date-vs-timestamptz, and timestamp-vs-timestamptz
comparators all worked by promoting the first type to the second and
then doing a simple same-type comparison.  This works fine, except
when the conversion result is out of range, in which case we throw an
entirely avoidable error.  The sources of such failures are
(a) type date can represent dates much farther in the future than
the timestamp types can;
(b) timezone rotation might cause a just-in-range timestamp value to
become a just-out-of-range timestamptz value.

Up to now we just ignored these corner-case issues, but now we have
an actual user complaint (bug #16657 from Huss EL-Sheikh), so let's
do something about it.

It turns out that commit 52ad1e659 already built all the necessary
infrastructure to support error-free comparisons, but neglected to
actually use it in the main-line code paths.  Fix that, do a little
bit of code style review, and remove the now-duplicate logic in
jsonpath_exec.c.

Back-patch to v13 where 52ad1e659 came in.  We could take this back
further by back-patching said infrastructure, but given the small
number of complaints so far, I don't feel a great need to.

Discussion: https://postgr.es/m/16657-cde2f876d8cc7971@postgresql.org
2020-10-07 17:10:26 -04:00
Tom Lane
97b6144826 Make postgres.bki use the same literal-string syntax as postgresql.conf.
The BKI file's string quoting conventions were previously quite weird,
perhaps as a result of repurposing a function built to scan
single-quoted strings to scan double-quoted ones.  Change to use the
same rules as we use in GUC files, allowing some simplifications in
genbki.pl and initdb.c.

While at it, completely remove the backend's scanstr() function, which
was essentially a duplicate of the string dequoting code in guc-file.l.
Instead export that one (under a less generic name than it had) and let
bootscanner.l use it.  Now we can clarify that scansup.c exists only to
support the main lexer. We could alternatively have removed GUC_scanstr,
but this way seems better since the previous arrangement could mislead
a reader into thinking that scanstr() had something to do with the main
lexer's handling of string literals.  Maybe it did once, but if so it
was a long time ago.

This patch does not bump catversion, since the initially-installed
catalog contents don't change.  Note however that successful initdb
after applying this patch will require up-to-date postgres.bki as well
as postgres and initdb executables.

In passing, remove a bunch of very-long-obsolete #include's in
bootparse.y and bootscanner.l.

John Naylor

Discussion: https://postgr.es/m/CACPNZCtDpd18T0KATTmCggO2GdVC4ow86ypiq5ENff1VnauL8g@mail.gmail.com
2020-10-04 16:09:55 -04:00
Robert Haas
f5ea92e8d6 Expose oldSnapshotControl definition via new header.
This makes it possible for code outside snapmgr.c to examine the
contents of this data structure. This commit does not add any code
which actually does so; a subsequent commit will make that change.

Patch by me, reviewed by Thomas Munro, Dilip Kumar, Hamid Akhtar.

Discussion: http://postgr.es/m/CA+TgmoY=aqf0zjTD+3dUWYkgMiNDegDLFjo+6ze=Wtpik+3XqA@mail.gmail.com
2020-09-24 13:33:09 -04:00
Thomas Munro
be0a666665 Remove large fill factor support from dynahash.c.
Since ancient times we have had support for a fill factor (maximum load
factor) to be set for a dynahash hash table, but:

1.  It was an integer, whereas for in-memory hash tables interesting
load factor targets are probably somewhere near the 0.75-1.0 range.

2.  It was implemented in a way that performed an expensive division
operation that regularly showed up in profiles.

3.  We are not aware of anyone ever having used a non-default value.

Therefore, remove support, effectively fixing it at 1.

Author: Jakub Wartak <Jakub.Wartak@tomtom.com>
Reviewed-by: Alvaro Herrera <alvherre@2ndquadrant.com>
Reviewed-by: Tomas Vondra <tomas.vondra@2ndquadrant.com>
Reviewed-by: Thomas Munro <thomas.munro@gmail.com>
Reviewed-by: David Rowley <dgrowleyml@gmail.com>
Discussion: https://postgr.es/m/VI1PR0701MB696044FC35013A96FECC7AC8F62D0%40VI1PR0701MB6960.eurprd07.prod.outlook.com
2020-09-19 11:40:39 +12:00
Heikki Linnakangas
16fa9b2b30 Add support for building GiST index by sorting.
This adds a new optional support function to the GiST access method:
sortsupport. If it is defined, the GiST index is built by sorting all data
to the order defined by the sortsupport's comparator function, and packing
the tuples in that order to GiST pages. This is similar to how B-tree
index build works, and is much faster than inserting the tuples one by
one. The resulting index is smaller too, because the pages are packed more
tightly, up to 'fillfactor'.  The normal build method works by splitting
pages, which tends to lead to more wasted space.

The quality of the resulting index depends on how good the opclass-defined
sort order is. A good order preserves locality of the input data.

As the first user of this facility, add 'sortsupport' function to the
point_ops opclass. It sorts the points in Z-order (aka Morton Code), by
interleaving the bits of the X and Y coordinates.
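
Bit interleaving of that kind looks roughly like this (a sketch for two
32-bit keys; the actual comparator works on the coordinates' bit patterns):

/* interleave x into even bit positions and y into odd ones */
static uint64
zorder(uint32 x, uint32 y)
{
    uint64      z = 0;

    for (int i = 0; i < 32; i++)
    {
        z |= ((uint64) ((x >> i) & 1)) << (2 * i);
        z |= ((uint64) ((y >> i) & 1)) << (2 * i + 1);
    }
    return z;
}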

Author: Andrey Borodin
Reviewed-by: Pavel Borisov, Thomas Munro
Discussion: https://www.postgresql.org/message-id/1A36620E-CAD8-4267-9067-FB31385E7C0D%40yandex-team.ru
2020-09-17 11:33:40 +03:00
Jeff Davis
0758964963 logtape.c: do not preallocate for tapes when sorting
The preallocation logic is only useful for HashAgg, so disable it when
sorting.

Also, adjust an out-of-date comment.

Reviewed-by: Peter Geoghegan
Discussion: https://postgr.es/m/CAH2-Wzn_o7tE2+hRVvwSFghRb75AJ5g-nqGzDUqLYMexjOAe=g@mail.gmail.com
Backpatch-through: 13
2020-09-11 17:10:02 -07:00
Peter Eisentraut
0aa8f76408 Expose internal function for converting int64 to numeric
Existing callers had to take complicated detours via
DirectFunctionCall1().  This simplifies a lot of code.
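
A typical call site changes like this:

/* before: detour through the fmgr interface */
num = DatumGetNumeric(DirectFunctionCall1(int8_numeric,
                                          Int64GetDatum(val)));

/* after: call the exposed function directly */
num = int64_to_numeric(val);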

Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://www.postgresql.org/message-id/flat/42b73d2d-da12-ba9f-570a-420e0cce19d9@phystech.edu
2020-09-09 20:16:28 +02:00
Andres Freund
623a9ba79b snapshot scalability: cache snapshots using a xact completion counter.
Previous commits made it faster/more scalable to compute snapshots. But not
building a snapshot is still faster. Now that GetSnapshotData() does not
maintain RecentGlobal* anymore, that is actually not too hard:

This commit introduces xactCompletionCount, which tracks the number of
top-level transactions with xids (i.e. which may have modified the database)
that completed in some form since the start of the server.

We can avoid rebuilding the snapshot's contents whenever the current
xactCompletionCount is the same as it was when the snapshot was
originally built.  Currently this check happens while holding
ProcArrayLock.  While it's likely possible to perform the check without
acquiring ProcArrayLock, it seems better to do that separately and later,
as some careful analysis is required.  Even with the lock this is a
significant win on its own.
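
Simplified, the fast path looks like this (a sketch; field names as in the
patch, details elided):

/* sketch of the cached-snapshot check in GetSnapshotData() */
LWLockAcquire(ProcArrayLock, LW_SHARED);
if (ShmemVariableCache->xactCompletionCount ==
    snapshot->snapXactCompletionCount)
{
    /*
     * No xid-bearing transaction completed since the snapshot was built;
     * its contents are still valid and need not be rebuilt.
     */
    LWLockRelease(ProcArrayLock);
    return snapshot;
}
/* ... otherwise fall through and rebuild the snapshot contents ... */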

On a smaller two-socket machine this gains another ~1.03x; on a larger
machine the effect is roughly double (though an earlier patch version was
tested there).  If we were able to safely avoid the lock there'd be another
significant gain on top of that.

Author: Andres Freund <andres@anarazel.de>
Reviewed-By: Robert Haas <robertmhaas@gmail.com>
Reviewed-By: Thomas Munro <thomas.munro@gmail.com>
Reviewed-By: David Rowley <dgrowleyml@gmail.com>
Discussion: https://postgr.es/m/20200301083601.ews6hz5dduc3w2se@alap3.anarazel.de
2020-08-17 21:08:30 -07:00
Andres Freund
dc7420c2c9 snapshot scalability: Don't compute global horizons while building snapshots.
To make GetSnapshotData() more scalable, it cannot look at each proc's
xmin: While snapshot contents do not need to change whenever a read-only
transaction commits or a snapshot is released, a proc's xmin is modified in
those cases. The frequency of xmin modifications leads to, particularly on
higher core count systems, many cache misses inside GetSnapshotData(), despite
the data underlying a snapshot not changing. That is the most
significant source of GetSnapshotData() scaling poorly on larger systems.

Without accessing xmins, GetSnapshotData() cannot calculate accurate horizons /
thresholds as it has so far. But we don't really have to: The horizons don't
actually change that much between GetSnapshotData() calls. Nor are the horizons
actually used every time a snapshot is built.

The trick this commit introduces is to delay computation of accurate
horizons until their use, and to use horizon boundaries to determine
whether accurate horizons need to be computed.

The use of RecentGlobal[Data]Xmin to decide whether a row version could be
removed has been replaced with new GlobalVisTest* functions.  These use two
thresholds to determine whether a row can be pruned:
1) definitely_needed, indicating that rows deleted by XIDs >= definitely_needed
   are definitely still visible.
2) maybe_needed, indicating that rows deleted by XIDs < maybe_needed can
   definitely be removed.
GetSnapshotData() updates definitely_needed to be the xmin of the computed
snapshot.

When testing whether a row can be removed (with GlobalVisTestIsRemovableXid())
and the tested XID falls in between the two (i.e. XID >= maybe_needed && XID <
definitely_needed) the boundaries can be recomputed to be more accurate. As it
is not cheap to compute accurate boundaries, we limit the number of times that
happens in short succession.  As the boundaries used by
GlobalVisTestIsRemovableXid() are never reset (with maybe_needed updated by
GetSnapshotData()), it is likely that further tests can benefit from an earlier
computation of accurate horizons.
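
Simplified to plain TransactionIds, the removability test is (a sketch; the
real code keeps the boundaries as FullTransactionIds):

/* sketch of the two-boundary removability test */
if (TransactionIdPrecedes(xid, maybe_needed))
    return true;            /* definitely removable */
if (TransactionIdFollowsOrEquals(xid, definitely_needed))
    return false;           /* definitely still needed */
/* in between: recompute accurate horizons (rate-limited) and retest */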

To avoid regressing performance when old_snapshot_threshold is set (as that
requires an accurate horizon to be computed), heap_page_prune_opt() doesn't
unconditionally call TransactionIdLimitedForOldSnapshots() anymore. Both the
computation of the limited horizon, and the triggering of errors (with
SetOldSnapshotThresholdTimestamp()) is now only done when necessary to remove
tuples.

This commit just removes the accesses to PGXACT->xmin from
GetSnapshotData(), but other members of PGXACT residing in the same
cache line are accessed. Therefore this in itself does not result in a
significant improvement. Subsequent commits will take advantage of the
fact that GetSnapshotData() now does not need to access xmins anymore.

Note: This contains a workaround in heap_page_prune_opt() to keep the
snapshot_too_old tests working. While that workaround is ugly, the tests
currently are not meaningful, and it seems best to address them separately.

Author: Andres Freund <andres@anarazel.de>
Reviewed-By: Robert Haas <robertmhaas@gmail.com>
Reviewed-By: Thomas Munro <thomas.munro@gmail.com>
Reviewed-By: David Rowley <dgrowleyml@gmail.com>
Discussion: https://postgr.es/m/20200301083601.ews6hz5dduc3w2se@alap3.anarazel.de
2020-08-12 16:03:49 -07:00
Peter Eisentraut
1784f278a6 Replace remaining StrNCpy() by strlcpy()
They are equivalent, except that StrNCpy() zero-fills the entire
destination buffer instead of providing just one trailing zero.  For
all but a tiny number of callers, that's just overhead rather than
being desirable.

Remove StrNCpy() as it is now unused.

In some cases, namestrcpy() is the more appropriate function to use.
While we're here, simplify the API of namestrcpy(): Remove the return
value, don't check for NULL input.  Nothing was using that anyway.
Also, remove a few unused name-related functions.
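
The mechanical replacement is simply:

/* before: zero-fills all of buf beyond the copied string */
StrNCpy(buf, src, sizeof(buf));

/* after: same truncation-safe copy, just one trailing NUL */
strlcpy(buf, src, sizeof(buf));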

Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://www.postgresql.org/message-id/flat/44f5e198-36f6-6cdb-7fa9-60e34784daae%402ndquadrant.com
2020-08-10 23:20:37 +02:00
David Rowley
6ee3b5fb99 Use int64 instead of long in incremental sort code
64-bit Windows has 4-byte long values, which are not suitable for tracking
disk space usage in the incremental sort code. Let's just make all these
fields int64s.

Author: James Coleman
Discussion: https://postgr.es/m/CAApHDvpky%2BUhof8mryPf5i%3D6e6fib2dxHqBrhp0Qhu0NeBhLJw%40mail.gmail.com
Backpatch-through: 13, where the incremental sort code was added
2020-08-02 14:24:46 +12:00
Amit Kapila
c55040ccd0 WAL Log invalidations at command end with wal_level=logical.
When wal_level=logical, write invalidations at command end into WAL so
that decoding can use this information.

This patch is required to allow the streaming of in-progress transactions
in logical decoding.  The actual work to allow streaming will be committed
as a separate patch.

We still add the invalidations to the cache and write them to WAL at
commit time in RecordTransactionCommit(). This uses the existing
XLOG_INVALIDATIONS xlog record type, from the RM_STANDBY_ID resource
manager (see LogStandbyInvalidations for details).

So existing code relying on those invalidations (e.g. redo) does not need
to be changed.

The invalidations written at command end use a new xlog record type
XLOG_XACT_INVALIDATIONS, from RM_XACT_ID resource manager. See
LogLogicalInvalidations for details.

These new xlog records are ignored by existing redo procedures, which
still rely on the invalidations written to commit records.

The invalidations are decoded and accumulated in the top-level transaction, and then
executed during replay.  This obviates the need to decode the
invalidations as part of a commit record.

Bump XLOG_PAGE_MAGIC, since this introduces XLOG_XACT_INVALIDATIONS.

Author: Dilip Kumar, Tomas Vondra, Amit Kapila
Reviewed-by: Amit Kapila
Tested-by: Neha Sharma and Mahendra Singh Thalor
Discussion: https://postgr.es/m/688b0b7f-2f6c-d827-c27b-216a8e3ea700@2ndquadrant.com
2020-07-23 08:34:48 +05:30
Tom Lane
a57d312a77 Support infinity and -infinity in the numeric data type.
Add infinities that behave the same as they do in the floating-point
data types.  Aside from any intrinsic usefulness these may have,
this closes an important gap in our ability to convert floating
values to numeric and/or replace float-based APIs with numeric.

The new values are represented by bit patterns that were formerly
not used (although old code probably would take them for NaNs).
So there shouldn't be any pg_upgrade hazard.

Patch by me, reviewed by Dean Rasheed and Andrew Gierth

Discussion: https://postgr.es/m/606717.1591924582@sss.pgh.pa.us
2020-07-22 19:19:44 -04:00
Tom Lane
4fb6aeb4f6 Make floating-point "NaN / 0" return NaN instead of raising an error.
This is more consistent with the IEEE 754 spec and our treatment of
NaNs elsewhere; in particular, the case has always acted that way in
"numeric" arithmetic.

Noted by Dean Rasheed.

Discussion: https://postgr.es/m/3421746.1594927785@sss.pgh.pa.us
2020-07-20 19:44:45 -04:00
Fujii Masao
d05b172a76 Add generic_plans and custom_plans fields into pg_prepared_statements.
There was no easy way to find how many times generic and custom plans
have been executed for a prepared statement.  This commit exposes those
counts in the pg_prepared_statements view.

Author: Atsushi Torikoshi, Kyotaro Horiguchi
Reviewed-by: Tatsuro Yamada, Masahiro Ikeda, Fujii Masao
Discussion: https://postgr.es/m/CACZ0uYHZ4M=NZpofH6JuPHeX=__5xcDELF8hT8_2T+R55w4RQw@mail.gmail.com
2020-07-20 11:55:50 +09:00
Michael Paquier
2a10fdc430 Eliminate cache lookup errors in SQL functions for object addresses
When using the following functions, users could see various errors of the
type "cache lookup failed for OID XXX", raised with elog(), which should
only be used for internal errors:
* pg_describe_object()
* pg_identify_object()
* pg_identify_object_as_address()

The set of APIs managing object addresses for all object types is made
smarter by gaining a new argument "missing_ok" that allows any caller to
control if an error is raised or not on an undefined object.  The SQL
functions listed above are changed to handle the case where an object is
missing.

Regression tests are added for all object types for the cases where
these are undefined.  Before this commit, these cases failed with cache
lookup errors, and now they basically return NULL (minus the name of the
object type requested).

Author: Michael Paquier
Reviewed-by: Aleksander Alekseev, Dmitry Dolgov, Daniel Gustafsson,
Álvaro Herrera, Kyotaro Horiguchi
Discussion: https://postgr.es/m/CAB7nPqSZxrSmdHK-rny7z8mi=EAFXJ5J-0RbzDw6aus=wB5azQ@mail.gmail.com
2020-07-15 09:03:10 +09:00
Michael Paquier
b8401c32ba Fix some header identifications
The following header files missed the shot:
- jsonfuncs.h, as of ce0425b.
- jsonapi.h, as of beb4699.
- llvmjit_emit.h as of 7ec0d80.
- partdesc.h, as of 1bb5e78.

Author: Jesse Zhang
Discussion: https://postgr.es/m/CAGf+fX4-8xULEOz09DE2dZGjT+q8VJ--rqfTpvcFwc+A4fc-3Q@mail.gmail.com
2020-07-14 13:39:45 +09:00
Andres Freund
a9a4a7ad56 code: replace most remaining uses of 'master'.
Author: Andres Freund
Reviewed-By: David Steele
Discussion: https://postgr.es/m/20200615182235.x7lch5n6kcjq4aue@alap3.anarazel.de
2020-07-08 13:24:35 -07:00
Andres Freund
5e7bbb5286 code: replace 'master' with 'primary' where appropriate.
Also changed "in the primary" to "on the primary", and added a few
"the" before "primary".

Author: Andres Freund
Reviewed-By: David Steele
Discussion: https://postgr.es/m/20200615182235.x7lch5n6kcjq4aue@alap3.anarazel.de
2020-07-08 12:57:23 -07:00
Michael Paquier
aa38434824 Refactor routines for name lookups of procedures and operators
This introduces a new set of extended routines for procedure and
operator name lookups, with a flag bitmask argument that can modify the
result.  The following options are available:
- Force schema qualification, ignoring search_path.  This is similar to
the existing option for format_{operator|procedure}_qualified().
- Force NULL as result instead of a numeric OID for an undefined
object.  This option is new.

This is a refactoring similar to 1185c78, that will be used for a future
patch to improve the SQL functions providing information using object
addresses for undefined objects.

Author: Michael Paquier
Reviewed-by: Aleksander Alekseev, Dmitry Dolgov, Daniel Gustafsson,
Álvaro Herrera
Discussion: https://postgr.es/m/CAB7nPqSZxrSmdHK-rny7z8mi=EAFXJ5J-0RbzDw6aus=wB5azQ@mail.gmail.com
2020-07-06 13:06:08 +09:00
Michael Paquier
1185c78294 Add new flag to format_type_extended() to get NULL for undefined type
If the type scanned is undefined, the type format routines have two
behaviors, depending on whether FORMAT_TYPE_ALLOW_INVALID is passed by the
caller:
- Issue a cache lookup error
- Return an undefined type name "???", "???[]" or "-"

The current interface is not really helpful for callers that want to format
a type name properly while still making sure that the type is defined,
since the strings generated for an undefined type could match an actual
type name, even if that should not be a problem in practice.  To counter
that, add a new flag called FORMAT_TYPE_INVALID_AS_NULL that returns a NULL
result instead of "???" or "-", without generating an error.  This flag
will be used in a
follow-up patch improving the set of SQL functions showing information
for object addresses when it comes to undefined objects.

Author: Michael Paquier
Reviewed-by: Aleksander Alekseev, Dmitry Dolgov, Daniel Gustafsson,
Álvaro Herrera
Discussion: https://postgr.es/m/CAB7nPqSZxrSmdHK-rny7z8mi=EAFXJ5J-0RbzDw6aus=wB5azQ@mail.gmail.com
2020-07-06 12:12:11 +09:00
David Rowley
dad75eb4a8 Have pg_itoa, pg_ltoa and pg_lltoa return the length of the string
Core by no means makes excessive use of these functions, but quite a large
number of those usages do require the caller to call strlen() on the
returned string.  This is quite wasteful since these functions do already
have a good idea of the length of the string, so we might as well just
have them return that.
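
A typical caller can then drop its strlen() call (sketch):

char        buf[16];
int         len;

/* before: the length had to be recomputed */
pg_ltoa(value, buf);
len = strlen(buf);

/* after: the function returns the length it already knows */
len = pg_ltoa(value, buf);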

Reviewed-by: Andrew Gierth
Discussion: https://postgr.es/m/CAApHDvrm2A5x2uHYxsqriO2cUaGcFvND%2BksC9e7Tjep0t2RK_A%40mail.gmail.com
2020-06-13 12:32:00 +12:00
David Rowley
9a7fccd9ea Add missing extern keyword for a couple of numutils functions
In passing, also remove a few surplus empty lines from pg_ltoa and
pg_ulltoa_n in numutils.c

Reported-by: Andrew Gierth
Discussion: https://postgr.es/m/87y2ou3xuh.fsf@news-spur.riddles.org.uk
Backpatch-through: 13, where these changes were introduced
2020-06-13 11:27:25 +12:00
Peter Eisentraut
b1d32d3e32 Unify drop-by-OID functions
There are a number of Remove${Something}ById() functions that are
essentially identical in structure and only different in which catalog
they are working on.  Refactor this to be one generic function.  The
information about which oid column, index, etc. to use was already
available in ObjectProperty for most catalogs, in a few cases it was
easily added.

Reviewed-by: Pavel Stehule <pavel.stehule@gmail.com>
Reviewed-by: Robert Haas <robertmhaas@gmail.com>
Discussion: https://www.postgresql.org/message-id/flat/331d9661-1743-857f-1cbb-d5728bcd62cb%402ndquadrant.com
2020-06-09 09:39:46 +02:00
Tom Lane
0c882e52a8 Improve ineq_histogram_selectivity's behavior for non-default orderings.
ineq_histogram_selectivity() can be invoked in situations where the
ordering we care about is not that of the column's histogram.  We could
be considering some other collation, or even more drastically, the
query operator might not agree at all with what was used to construct
the histogram.  (We'll get here for anything using scalarineqsel-based
estimators, so that's quite likely to happen for extension operators.)

Up to now we just ignored this issue and assumed we were dealing with
an operator/collation whose sort order exactly matches the histogram,
possibly resulting in junk estimates if the binary search gets confused.
It's past time to improve that, since the use of nondefault collations
is increasing.  What we can do is verify that the given operator and
collation match what's recorded in pg_statistic, and use the existing
code only if so.  When they don't match, instead execute the operator
against each histogram entry, and take the fraction of successes as our
selectivity estimate.  This gives an estimate that is probably good to
about 1/histogram_size, with no assumptions about ordering.  (The quality
of the estimate is likely to degrade near the ends of the value range,
since the two orderings probably don't agree on what is an extremal value;
but this is surely going to be more reliable than what we did before.)
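
The fallback estimate is conceptually just this loop (a sketch; the
variable names opproc, values, nvalues, and constval are illustrative):

/* fraction of histogram entries satisfying "entry OP constant" */
int         nmatch = 0;
double      selec;

for (int i = 0; i < nvalues; i++)
{
    if (DatumGetBool(FunctionCall2Coll(&opproc, collation,
                                       values[i], constval)))
        nmatch++;
}
selec = (double) nmatch / (double) nvalues;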

At some point we might further improve matters by storing more than one
histogram calculated according to different orderings.  But this code
would still be good fallback logic when no matches exist, so that is
not an argument for not doing this.

While here, also improve get_variable_range() to deal more honestly
with non-default collations.

This isn't back-patchable, because it requires adding another argument
to ineq_histogram_selectivity, and because it might have significant
impact on the estimation results for extension operators relying on
scalarineqsel --- mostly for the better, one hopes, but in any case
destabilizing plan choices in back branches is best avoided.

Per investigation of a report from James Lucas.

Discussion: https://postgr.es/m/CAAFmbbOvfi=wMM=3qRsPunBSLb8BFREno2oOzSBS=mzfLPKABw@mail.gmail.com
2020-06-05 16:55:27 -04:00
Tom Lane
044c99bc56 Use query collation, not column's collation, while examining statistics.
Commit 5e0928005 changed the planner so that, instead of blindly using
DEFAULT_COLLATION_OID when invoking operators for selectivity estimation,
it would use the collation of the column whose statistics we're
considering.  This was recognized as still being not quite the right
thing, but it seemed like a good incremental improvement.  However,
shortly thereafter we introduced nondeterministic collations, and that
creates cases where operators can fail if they're passed the wrong
collation.  We don't want planning to fail in cases where the query itself
would work, so this means that we *must* use the query's collation when
invoking operators for estimation purposes.

The only real problem this creates is in ineq_histogram_selectivity, where
the binary search might produce a garbage answer if we perform comparisons
using a different collation than the column's histogram is ordered with.
However, when the query's collation is significantly different from the
column's default collation, the estimate we previously generated would be
pretty irrelevant anyway; so it's not clear that this will result in
noticeably worse estimates in practice.  (A follow-on patch will improve
this situation in HEAD, but it seems too invasive for back-patch.)

The patch requires changing the signatures of mcv_selectivity and allied
functions, which are exported and very possibly are used by extensions.
In HEAD, I just did that, but an API/ABI break of this sort isn't
acceptable in stable branches.  Therefore, in v12 the patch introduces
"mcv_selectivity_ext" and so on, with signatures matching HEAD, and makes
the old functions into wrappers that assume DEFAULT_COLLATION_OID should
be used.  That does not match the prior behavior, but it should avoid risk
of failure in most cases.  (In practice, I think most extension datatypes
aren't collation-aware, so the change probably doesn't matter to them.)

Per report from James Lucas.  Back-patch to v12 where the problem was
introduced.

Discussion: https://postgr.es/m/CAAFmbbOvfi=wMM=3qRsPunBSLb8BFREno2oOzSBS=mzfLPKABw@mail.gmail.com
2020-06-05 16:18:50 -04:00
Tom Lane
a9632830bb Reject "23:59:60.nnn" in datetime input.
It's intentional that we don't allow values greater than 24 hours,
while we do allow "24:00:00" as well as "23:59:60" as inputs.
However, the range check was miscoded in such a way that it would
accept "23:59:60.nnn" with a nonzero fraction.  For time or timetz,
the stored result would then be greater than "24:00:00" which would
fail dump/reload, not to mention possibly confusing other operations.

Fix by explicitly calculating the result and making sure it does not
exceed 24 hours.  (This calculation is redundant with what will happen
later in tm2time or tm2timetz.  Maybe someday somebody will find that
annoying enough to justify refactoring to avoid the duplication; but
that seems too invasive for a back-patched bug fix, and the cost is
probably unmeasurable anyway.)
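
The check is conceptually this (a sketch using the usual timestamp.h unit
macros; the real code reports a field-overflow error):

/* compute the result explicitly and reject anything past 24:00:00 */
if (tm->tm_hour * USECS_PER_HOUR +
    tm->tm_min * USECS_PER_MINUTE +
    tm->tm_sec * USECS_PER_SEC + fsec > USECS_PER_DAY)
    return DTERR_FIELD_OVERFLOW;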

Note that this change also rejects such input as the time portion
of a timestamp(tz) value.

Back-patch to v10.  The bug is far older, but to change this pre-v10
we'd need to ensure that the logic behaves sanely with float timestamps,
which is possibly nontrivial due to roundoff considerations.
Doesn't really seem worth troubling with.

Per report from Christoph Berg.

Discussion: https://postgr.es/m/20200520125807.GB296739@msg.df7cb.de
2020-06-04 16:42:23 -04:00
Tom Lane
5cbfce562f Initial pgindent and pgperltidy run for v13.
Includes some manual cleanup of places that pgindent messed up,
most of which weren't per project style anyway.

Notably, it seems some people didn't absorb the style rules of
commit c9d297751, because there were a bunch of new occurrences
of function calls with a newline just after the left paren, all
with faulty expectations about how the rest of the call would get
indented.
2020-05-14 13:06:50 -04:00
Tom Lane
9d25e1aa31 Clean up cpluspluscheck violation.
"operator" is a reserved word in C++, so per project conventions,
don't use it as an identifier in header files.

My oversight in commit a80818605.
2020-04-21 11:21:15 -04:00
Michael Paquier
8128b0c152 Fix collection of typos and grammar mistakes in the tree, volume 2
This fixes some comments and documentation new as of Postgres 13, and is
a follow-up of the work done in dd0f37e.

Author: Justin Pryzby
Discussion: https://postgr.es/m/20200408165653.GF2228@telsasoft.com
2020-04-14 14:45:43 +09:00
Alexander Korotkov
1aac32df89 Revert 0f5ca02f53
0f5ca02f53 introduces 3 new keywords.  That seems like too much for a
relatively small feature.  Given that we are now past feature freeze, it is
too late to discuss new syntax.  So, revert.

Discussion: https://postgr.es/m/28209.1586294824%40sss.pgh.pa.us
2020-04-08 11:37:27 +03:00
Alexander Korotkov
0f5ca02f53 Implement waiting for given lsn at transaction start
This commit adds the following optional clause to the BEGIN and START TRANSACTION
commands.

  WAIT FOR LSN lsn [ TIMEOUT timeout ]

The new clause postpones transaction start until the given lsn is applied
on the standby.  This allows the user to be sure that changes previously
made on the primary will be visible on the standby.

A new shared memory struct is used to track the awaited lsn per backend.
The recovery process wakes up the backend once the required lsn is applied.

Author: Ivan Kartyshov, Anna Akenteva
Reviewed-by: Craig Ringer, Thomas Munro, Robert Haas, Kyotaro Horiguchi
Reviewed-by: Masahiko Sawada, Ants Aasma, Dmitry Ivanov, Simon Riggs
Reviewed-by: Amit Kapila, Alexander Korotkov
Discussion: https://postgr.es/m/0240c26c-9f84-30ea-fca9-93ab2df5f305%40postgrespro.ru
2020-04-07 23:51:10 +03:00
Tom Lane
26a944cf29 Adjust bytea get_bit/set_bit to use int8 not int4 for bit numbering.
Since the existing bit number argument can't exceed INT32_MAX, it's
not possible for these functions to manipulate bits beyond the first
256MB of a bytea value.  Lift that restriction by redeclaring the
bit number arguments as int8 (which requires a catversion bump,
hence is not back-patchable).

The similarly-named functions for bit/varbit don't really have a
problem because we restrict those types to at most VARBITMAXLEN bits;
hence leave them alone.

While here, extend the encode/decode functions in utils/adt/encode.c
to allow dealing with values wider than 1GB.  This is not a live bug
or restriction in current usage, because no input could be more than
1GB, and since none of the encoders can expand a string more than 4X,
the result size couldn't overflow uint32.  But it might be desirable
to support more in future, so make the input length values size_t
and the potential-output-length values uint64.

Also add some test cases to improve the miserable code coverage
of these functions.

Movead Li, editorialized some by me; also reviewed by Ashutosh Bapat

Discussion: https://postgr.es/m/20200312115135445367128@highgo.ca
2020-04-07 15:57:58 -04:00
Tom Lane
c7654f6a37 Fix representation of SORT_TYPE_STILL_IN_PROGRESS.
It turns out that the code did indeed rely on a zeroed
TuplesortInstrumentation.sortMethod field to indicate
"this worker never did anything", although it seems the
issue only comes up during certain race-condition-y cases.

Hence, rearrange the TuplesortMethod enum to restore
SORT_TYPE_STILL_IN_PROGRESS to having the value zero,
and add some comments reinforcing that that isn't optional.

Also future-proof a loop over the possible values of the enum.
sizeof(bits32) happened to be the correct limit value,
but only by purest coincidence.

Per buildfarm and local investigation.

Discussion: https://postgr.es/m/12222.1586223974@sss.pgh.pa.us
2020-04-06 22:22:13 -04:00
Thomas Munro
aeec457de8 Add SQL type xid8 to expose FullTransactionId to users.
Similar to xid, but 64 bits wide.  This new type is suitable for use in
various system views and administration functions.

Reviewed-by: Fujii Masao <masao.fujii@oss.nttdata.com>
Reviewed-by: Takao Fujii <btfujiitkp@oss.nttdata.com>
Reviewed-by: Yoshikazu Imai <imai.yoshikazu@fujitsu.com>
Reviewed-by: Mark Dilger <mark.dilger@enterprisedb.com>
Discussion: https://postgr.es/m/20190725000636.666m5mad25wfbrri%40alap3.anarazel.de
2020-04-07 12:03:59 +12:00