Commit graph

29 commits

Author SHA1 Message Date
Nathan Bossart
f894acb24a Show size of DSAs and dshashes in pg_dsm_registry_allocations.
Presently, this view reports NULL for the size of DSAs and dshash
tables because 1) the current backend might not be attached to them
and 2) the registry doesn't save the pointers to the dsa_area or
dshash_table in local memory.  Also, the view doesn't show
partially-initialized entries to avoid ambiguity, since those
entries would report a NULL size as well.

This commit introduces a function that looks up the size of a DSA
given its handle (transiently attaching to the control segment if
needed) and teaches pg_dsm_registry_allocations to use it to show
the size of successfully-initialized DSA and dshash entries.
Furthermore, the view now reports partially-initialized entries
with a NULL size.

Reviewed-by: Rahila Syed <rahilasyed90@gmail.com>
Reviewed-by: Robert Haas <robertmhaas@gmail.com>
Reviewed-by: Chao Li <li.evan.chao@gmail.com>
Discussion: https://postgr.es/m/aSeEDeznAsHR1_YF%40nathan
2025-12-02 10:29:45 -06:00
Nathan Bossart
c6abf24ebf Fix misspelling of "tranche" in dsa.h.
Oversight in commit bb952c8c8b.

Discussion: https://postgr.es/m/aKOWzsCPgrsoEG1Q%40nathan
2025-08-19 10:43:15 -05:00
Nathan Bossart
fe07100e82 Add GetNamedDSA() and GetNamedDSHash().
Presently, the dynamic shared memory (DSM) registry only provides
GetNamedDSMSegment(), which allocates a fixed-size segment.  To use
the DSM registry for more sophisticated things like dynamic shared
memory areas (DSAs) or a hash table backed by a DSA (dshash), users
need to create a DSM segment that stores various handles and LWLock
tranche IDs and to write fairly complicated initialization code.
Furthermore, there is likely little variation in this
initialization code between libraries.

This commit introduces functions that simplify allocating a DSA or
dshash within the DSM registry.  These functions are very similar
to GetNamedDSMSegment().  Notable differences include the lack of
an initialization callback parameter and the prohibition of calling
the functions more than once for a given entry in each backend
(which should be trivially avoidable in most circumstances).  While
at it, this commit bumps the maximum DSM registry entry name length
from 63 bytes to 127 bytes.

Also note that even though one could presumably detach/destroy the
DSAs and dshashes created in the registry, such use-cases are not
yet well-supported, if for no other reason than the associated DSM
registry entries cannot be removed.  Adding such support is left as
a future exercise.

The test_dsm_registry test module contains tests for the new
functions and also serves as a complete usage example.

Reviewed-by: Dagfinn Ilmari Mannsåker <ilmari@ilmari.org>
Reviewed-by: Sami Imseih <samimseih@gmail.com>
Reviewed-by: Florents Tselai <florents.tselai@gmail.com>
Reviewed-by: Rahila Syed <rahilasyed90@gmail.com>
Discussion: https://postgr.es/m/aEC8HGy2tRQjZg_8%40nathan
2025-07-02 11:50:52 -05:00
Tom Lane
041e8b95b8 Get rid of our dependency on type "long" for memory size calculations.
Consistently use "Size" (or size_t, or in some places int64 or double)
as the type for variables holding memory allocation sizes.  In most
places variables' data types were fine already, but we had an ancient
habit of computing bytes from kilobytes-units GUCs with code like
"work_mem * 1024L".  That risks overflow on Win64 where they did not
make "long" as wide as "size_t".  We worked around that by restricting
such GUCs' ranges, so you couldn't set work_mem et al higher than 2GB
on Win64.  This patch removes that restriction, after replacing such
calculations with "work_mem * (Size) 1024" or variants of that.

It should be noted that this patch was constructed by searching
outwards from the GUCs that have MAX_KILOBYTES as upper limit.
So I can't positively guarantee there are no other places doing
memory-size arithmetic in int or long variables.  I do however feel
pretty confident that increasing MAX_KILOBYTES on Win64 is safe now.
Also, nothing in our code should be dealing in multiple-gigabyte
allocations without authorization from a relevant GUC, so it seems
pretty likely that this search caught everything that could be at
risk of overflow.

Author: Vladlen Popolitov <v.popolitov@postgrespro.ru>
Co-authored-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/1a01f0-66ec2d80-3b-68487680@27595217
2025-01-31 13:52:40 -05:00
Bruce Momjian
50e6eb731d Update copyright for 2025
Backpatch-through: 13
2025-01-01 11:21:55 -05:00
Thomas Munro
962da900ac Use <stdint.h> and <inttypes.h> for c.h integers.
Redefine our exact width types with standard C99 types and macros,
including int64_t, INT64_MAX, INT64_C(), PRId64 etc.  We were already
using <stdint.h> types in a few places.

One complication is that Windows' <inttypes.h> uses format strings like
"%I64d", "%I32", "%I" for PRI*64, PRI*32, PTR*PTR, instead of mapping to
other standardized format strings like "%lld" etc as seen on other known
systems.  Teach our snprintf.c to understand them.

This removes a lot of configure clutter, and should also allow 64-bit
numbers and other standard types to be used in localized messages
without casting.

Reviewed-by: Peter Eisentraut <peter@eisentraut.org>
Discussion: https://postgr.es/m/ME3P282MB3166F9D1F71F787929C0C7E7B6312%40ME3P282MB3166.AUSP282.PROD.OUTLOOK.COM
2024-12-04 15:05:38 +13:00
Masahiko Sawada
bb952c8c8b Allow specifying initial and maximum segment sizes for DSA.
Previously, the DSA segment size always started with 1MB and grew up
to DSA_MAX_SEGMENT_SIZE. It was inconvenient in certain scenarios,
such as when the caller desired a soft constraint on the total DSA
segment size, limiting it to less than 1MB.

This commit introduces the capability to specify the initial and
maximum DSA segment sizes when creating a DSA area, providing more
flexibility and control over memory usage.

Reviewed-by: John Naylor, Tomas Vondra
Discussion: https://postgr.es/m/CAD21AoAYGGC1ePjVX0H%2Bpp9rH%3D9vuPK19nNOiu12NprdV5TVJA%40mail.gmail.com
2024-03-27 11:43:29 +09:00
John Naylor
ee1b30f128 Add template for adaptive radix tree
This implements a radix tree data structure based on the design in
"The Adaptive Radix Tree: ARTful Indexing for Main-Memory Databases"
by Viktor Leis, Alfons Kemper, and ThomasNeumann, 2013. The main
technique that makes it adaptive is using several different node types,
each with a different capacity of elements, and a different algorithm
for accessing them. The nodes start small and grow/shrink as needed.

The main advantage over hash tables is efficient sorted iteration and
better memory locality when successive keys are lexicographically
close together. The implementation currently assumes 64-bit integer
keys, and traversing the tree is in general slower than a linear
probing hash table, so this is not a general-purpose associative array.

The paper describes two other techniques not implemented here,
namely "path compression" and "lazy expansion". These can further
reduce memory usage and speed up traversal, but the former would add
significant complexity and the latter requires storing the full key
with the value. We do trivially compress the path when leading bytes
of the key are zeros, however.

For value storage, we use "combined pointer/value slots", as
recommended in the paper. Values of size equal or smaller than the the
platform's pointer type are stored in the array of child pointers in
the last level node, while larger values are each stored in a separate
allocation. This is for now fixed at compile time, but it would be
fairly trivial to allow determining at runtime how variable-length
values are stored.

One innovation in our implementation compared to the ART paper is
decoupling the notion of node "size class" from "kind". The size
classes within a given node kind have the same underlying type, but
a variable capacity for children, so we can introduce additional node
sizes with little additional code.

To enable different use cases to specialize for different value types
and for shared/local memory, we use macro-templatized code generation
in the same manner as simplehash.h and sort_template.h.

Future commits will use this infrastructure for storing TIDs.

Patch by Masahiko Sawada and John Naylor, but a substantial amount of
credit is due to Andres Freund, whose proof-of-concept was a valuable
source of coding idioms and awareness of performance pitfalls, and
who reviewed earlier versions.

Discussion: https://postgr.es/m/CAD21AoAfOZvmfR0j8VmZorZjL7RhTiQdVttNuC4W-Shdc2a-AA%40mail.gmail.com
2024-03-07 12:40:11 +07:00
Bruce Momjian
29275b1d17 Update copyright for 2024
Reported-by: Michael Paquier

Discussion: https://postgr.es/m/ZZKTDPxBBMt3C0J9@paquier.xyz

Backpatch-through: 12
2024-01-03 20:49:05 -05:00
Tom Lane
3b4ac33254 Avoid type cheats for invalid dsa_handles and dshash_table_handles.
Invent separate macros for "invalid" values of these types, so that
we needn't embed knowledge of their representations into calling code.
These are all zeroes anyway ATM, so this is not fixing any live bug,
but it makes the code cleaner and more future-proof.

I (tgl) also chose to move DSM_HANDLE_INVALID into dsm_impl.h,
since it seems like it should live beside the typedef for dsm_handle.

Hou Zhijie, Nathan Bossart, Kyotaro Horiguchi, Tom Lane

Discussion: https://postgr.es/m/OS0PR01MB5716860B1454C34E5B179B6694C99@OS0PR01MB5716.jpnprd01.prod.outlook.com
2023-01-25 11:48:38 -05:00
Bruce Momjian
c8e1ba736b Update copyright for 2023
Backpatch-through: 11
2023-01-02 15:00:37 -05:00
Bruce Momjian
27b77ecf9f Update copyright for 2022
Backpatch-through: 10
2022-01-07 19:04:57 -05:00
Bruce Momjian
ca3b37487b Update copyright for 2021
Backpatch-through: 9.5
2021-01-02 13:06:25 -05:00
Bruce Momjian
7559d8ebfa Update copyrights for 2020
Backpatch-through: update all files in master, backpatch legal files through 9.4
2020-01-01 12:21:45 -05:00
Tom Lane
79b94716e7 Remove unreferenced function declarations.
These seem to be leftovers from old patches, perhaps.

Masahiko Sawada

Discussion: https://postgr.es/m/CAD21AoDuAYsRb3Q9aobkFZ6DZMWxsyg4HOmgkwgeWNfSkTwGxw@mail.gmail.com
2019-07-05 19:28:45 -04:00
Tom Lane
8255c7a5ee Phase 2 pgindent run for v12.
Switch to 2.1 version of pg_bsd_indent.  This formats
multiline function declarations "correctly", that is with
additional lines of parameter declarations indented to match
where the first line's left parenthesis is.

Discussion: https://postgr.es/m/CAEepm=0P3FeTXRcU5B2W3jv3PgRVZ-kGUXLGfd42FFhUROO3ug@mail.gmail.com
2019-05-22 13:04:48 -04:00
Bruce Momjian
97c39498e5 Update copyright for 2019
Backpatch-through: certain files through 9.4
2019-01-02 12:44:25 -05:00
Thomas Munro
f025bd2ddd Use size_t consistently in dsa.{ch}.
Takeshi Ideriha complained that there is a mixture of Size and size_t
in dsa.c and corresponding header.  Let's use size_t.  Back-patch to 10
where dsa.c landed, to make future back-patching easy.

Discussion: https://postgr.es/m/4E72940DA2BF16479384A86D54D0988A6F19ABD9%40G01JPEXMBKW04
2018-09-22 00:40:13 +12:00
Bruce Momjian
9d4649ca49 Update copyright for 2018
Backpatch-through: certain files through 9.3
2018-01-02 23:30:12 -05:00
Tom Lane
c7b8998ebb Phase 2 of pgindent updates.
Change pg_bsd_indent to follow upstream rules for placement of comments
to the right of code, and remove pgindent hack that caused comments
following #endif to not obey the general rule.

Commit e3860ffa4d wasn't actually using
the published version of pg_bsd_indent, but a hacked-up version that
tried to minimize the amount of movement of comments to the right of
code.  The situation of interest is where such a comment has to be
moved to the right of its default placement at column 33 because there's
code there.  BSD indent has always moved right in units of tab stops
in such cases --- but in the previous incarnation, indent was working
in 8-space tab stops, while now it knows we use 4-space tabs.  So the
net result is that in about half the cases, such comments are placed
one tab stop left of before.  This is better all around: it leaves
more room on the line for comment text, and it means that in such
cases the comment uniformly starts at the next 4-space tab stop after
the code, rather than sometimes one and sometimes two tabs after.

Also, ensure that comments following #endif are indented the same
as comments following other preprocessor commands such as #else.
That inconsistency turns out to have been self-inflicted damage
from a poorly-thought-through post-indent "fixup" in pgindent.

This patch is much less interesting than the first round of indent
changes, but also bulkier, so I thought it best to separate the effects.

Discussion: https://postgr.es/m/E1dAmxK-0006EE-1r@gemulon.postgresql.org
Discussion: https://postgr.es/m/30527.1495162840@sss.pgh.pa.us
2017-06-21 15:19:25 -04:00
Robert Haas
d9528604cc Remove inclusion of postgres.h from a few header files.
Thomas Munro, per project policy articuled by Andres Freund and
Tom Lane.

Discussion: http://postgr.es/m/CAEepm=2zCoeq3QxVwhS5DFeUh=yU6z81pbWMgfOB8OzyiBwxzw@mail.gmail.com
2017-03-08 08:18:12 -05:00
Robert Haas
16be2fd100 Make dsa_allocate interface more like MemoryContextAlloc.
A new function dsa_allocate_extended now takes flags which indicate
that huge allocations should be permitted, that out-of-memory
conditions should not throw an error, and/or that the returned memory
should be zero-filled, just like MemoryContextAllocateExtended.

Commit 9acb85597f, which added
dsa_allocate0, was broken because it failed to account for the
possibility that dsa_allocate() might return InvalidDsaPointer.
This fixes that problem along the way.

Thomas Munro, with some comment changes by me.

Discussion: http://postgr.es/m/CA+Tgmobt7CcF_uQP2UQwWmu4K9qCHehMJP9_9m1urwP8hbOeHQ@mail.gmail.com
2017-02-19 13:59:53 +05:30
Robert Haas
9acb85597f Add new function dsa_allocate0.
This does the same thing as dsa_allocate, except that the memory
is guaranteed to be zero-filled on return.

Dilip Kumar, adjusted by me.
2017-02-16 12:57:03 -05:00
Robert Haas
175ff6598e Fix possible crash reading pg_stat_activity.
With the old code, a backend that read pg_stat_activity without ever
having executed a parallel query might see a backend in the midst of
executing one waiting on a DSA LWLock, resulting in a crash.  The
solution is for backends to register the tranche at startup time, not
the first time a parallel query is executed.

Report by Andreas Seltenreich.  Patch by me, reviewed by Thomas Munro.
2017-01-05 12:27:09 -05:00
Bruce Momjian
1d25779284 Update copyright via script for 2017 2017-01-03 13:48:53 -05:00
Robert Haas
88f626f868 Fix more DSA problems uncovered by the buildfarm.
On 32-bit systems, don't try to use 64-bit DSA pointers, because the
computation of DSA_MAX_SEGMENT_SIZE overflows Size.

Cast 1 to Size before shifting it, so that the compiler doesn't
produce a result of the wrong width.

In passing, change one use of size_t to Size.
2016-12-05 10:38:08 -05:00
Robert Haas
670b3bc8f5 Try to fix some DSA-related compiler warnings.
Commit 13df76a537 was overconfident
about how portable %016lx is.  Some compilers complain because they
need %016llx, while platforms where DSA pointers are only 32 bits
get unhappy about using a 64-bit format for a 32-bit quantity.

Thomas Munro, per an off-list suggestion from me.
2016-12-05 10:01:08 -05:00
Robert Haas
767a9039d7 Fix thinko in b3427dade1. 2016-12-02 15:06:41 -05:00
Robert Haas
13df76a537 Introduce dynamic shared memory areas.
Programmers discovered decades ago that it was useful to have a simple
interface for allocating and freeing memory, which is why malloc() and
free() were invented.  Unfortunately, those handy tools don't work
with dynamic shared memory segments because those are specific to
PostgreSQL and are not necessarily mapped at the same address in every
cooperating process.  So invent our own allocator instead.  This makes
it possible for processes cooperating as part of parallel query
execution to allocate and free chunks of memory without having to
reserve them prior to the start of execution.  It could also be used
for longer lived objects; for example, we could consider storing data
for pg_stat_statements or the stats collector in shared memory using
these interfaces, rather than writing them to files.  Basically,
anything that needs shared memory but can't predict in advance how
much it's going to need might find this useful.

Thomas Munro and Robert Haas.  The original code (of mine) on which
Thomas based his work was actually designed to be a new backend-local
memory allocator for PostgreSQL, but that hasn't gone anywhere - or
not yet, anyway.  Thomas took that work and performed major
refactoring and extensive modifications to make it work with dynamic
shared memory, including the addition of appropriate locking.

Discussion: CA+TgmobkeWptGwiNa+SGFWsTLzTzD-CeLz0KcE-y6LFgoUus4A@mail.gmail.com
Discussion: CAEepm=1z5WLuNoJ80PaCvz6EtG9dN0j-KuHcHtU6QEfcPP5-qA@mail.gmail.com
2016-12-02 12:34:36 -05:00