postgresql/src/include/utils
John Naylor ef3c3cf6d0 Perform radix sort on SortTuples with pass-by-value Datums
Radix sort can be much faster than quicksort, but for our purposes it
is limited to sequences of unsigned bytes. To make tuples with other
types amenable to this technique, several features of tuple comparison
must be accounted for, i.e. the sort key must be "normalized":

1. Signedness -- It's possible to modify a signed integer such that
it can be compared as unsigned. For example, a signed char has range
-128 to 127. If we cast that to unsigned char and add 128, the range
of values becomes 0 to 255 while preserving order.

2. Direction -- SQL allows specification of ASC or DESC. The
descending case is easily handled by taking the complement of the
unsigned representation.

3. NULL values -- NULLS FIRST and NULLS LAST must work correctly.
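
As a rough illustration of points 1 and 2, here is a minimal sketch,
not code from this commit. The helper names normalize_int8() and
normalize_key() are hypothetical, and a 64-bit signed key is assumed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* The 8-bit example from the text: cast to unsigned char and add 128. */
static inline uint8_t
normalize_int8(int8_t value)
{
	uint8_t		key = (uint8_t) ((uint8_t) value + 128);

	/* Adding 128 modulo 256 is the same as flipping the sign bit. */
	assert(key == (uint8_t) ((uint8_t) value ^ 0x80));
	return key;
}

/*
 * Hypothetical helper: map a signed 64-bit key to an unsigned key whose
 * unsigned order matches the requested sort direction.
 */
static inline uint64_t
normalize_key(int64_t value, bool descending)
{
	/* Flipping the sign bit is equivalent to adding 2^63 modulo 2^64. */
	uint64_t	key = (uint64_t) value ^ (UINT64_C(1) << 63);

	/* For DESC, the complement reverses the unsigned order. */
	return descending ? ~key : key;
}
```

With keys normalized this way, a sort that only compares unsigned
bytes yields the requested ASC or DESC order. NULL handling is not
part of the key here.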

This commit only handles the case where datum1 is a pass-by-value
Datum (possibly abbreviated) that compares like an ordinary
integer. (Abbreviations of values of type "numeric" are a convenient
counterexample.) First, tuples are partitioned by nullness in the
correct NULL ordering. Then the NOT NULL tuples are sorted with radix
sort on datum1. For tiebreaks on subsequent sortkeys (including the
first sort key if abbreviated), we divert to the usual qsort.
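
The partitioning step might look roughly like the sketch below. It is
an assumption-laden illustration, not the committed code:
DemoSortTuple is a simplified stand-in for SortTuple, and
partition_by_nullness() is a hypothetical name.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified stand-in for PostgreSQL's SortTuple; not the real definition. */
typedef struct DemoSortTuple
{
	uint64_t	datum1;			/* pass-by-value first key */
	bool		isnull1;		/* is the first key NULL? */
} DemoSortTuple;

/*
 * Move NULL tuples to the front (nulls_first) or to the back of the array.
 * Returns the number of tuples in the leading group, which is also the
 * index where the trailing group starts.  The partition is not stable.
 */
static size_t
partition_by_nullness(DemoSortTuple *tuples, size_t n, bool nulls_first)
{
	size_t		lo = 0;
	size_t		hi = n;

	assert(tuples != NULL || n == 0);

	while (lo < hi)
	{
		/* A tuple stays in the leading group if its nullness matches. */
		if (tuples[lo].isnull1 == nulls_first)
			lo++;
		else
		{
			/* Otherwise swap it into the shrinking trailing group. */
			DemoSortTuple tmp = tuples[lo];

			hi--;
			tuples[lo] = tuples[hi];
			tuples[hi] = tmp;
		}
	}
	return lo;
}
```

The caller would then sort only the NOT NULL range: the trailing
group for NULLS FIRST, or the leading group for NULLS LAST.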

ORDER BY queries on pre-warmed buffers are up to 2x faster on high
cardinality inputs with radix sort than the sort specializations added
by commit 697492434, so get rid of them. It's sufficient to fall back
to qsort_tuple() for small arrays. Moderately low cardinality inputs
show more modest improvements. Our qsort is strongly optimized for very
low cardinality inputs, but radix sort is usually equal or very close
in those cases.

The changes to the regression tests are caused by under-specified sort
orders, e.g. "SELECT a, b from mytable order by a;". For unstable
sorts, such as our qsort and this in-place radix sort, there is no
guarantee of the order of "b" within each group of "a".

The implementation is taken from ska_byte_sort() (Boost licensed),
which is similar to American flag sort (an in-place radix sort) with
modifications to make it better suited for modern pipelined CPUs.
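
For readers unfamiliar with the technique, the following is a minimal
sketch of a plain American flag sort on normalized 64-bit keys. It is
neither ska_byte_sort() nor the committed code: flag_sort() and
key_byte() are hypothetical names, and the sketch omits the CPU-
oriented modifications as well as the qsort fallback for small arrays.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Byte of "key" at "level", where level 0 is the most significant byte. */
static inline unsigned
key_byte(uint64_t key, int level)
{
	return (unsigned) ((key >> (8 * (7 - level))) & 0xFF);
}

/* Hypothetical demo routine: sort keys[0..n-1] ascending, in place. */
static void
flag_sort(uint64_t *keys, size_t n, int level)
{
	size_t		count[256] = {0};
	size_t		next[256];		/* next unfilled slot in each bucket */
	size_t		end[256];		/* one past the last slot of each bucket */
	size_t		pos = 0;

	assert(level >= 0 && level < 8);
	if (n < 2)
		return;

	/* Pass 1: count how many keys fall into each bucket. */
	for (size_t i = 0; i < n; i++)
		count[key_byte(keys[i], level)]++;

	/* Turn the counts into bucket boundaries. */
	for (unsigned b = 0; b < 256; b++)
	{
		next[b] = pos;
		pos += count[b];
		end[b] = pos;
	}

	/* Pass 2: swap each key into its bucket until every bucket is full. */
	for (unsigned b = 0; b < 256; b++)
	{
		while (next[b] < end[b])
		{
			unsigned	dest = key_byte(keys[next[b]], level);

			if (dest == b)
				next[b]++;
			else
			{
				uint64_t	tmp = keys[next[b]];

				keys[next[b]] = keys[next[dest]];
				keys[next[dest]] = tmp;
				next[dest]++;
			}
		}
	}

	/* Recurse into each bucket on the next less significant byte. */
	if (level < 7)
	{
		for (unsigned b = 0; b < 256; b++)
		{
			if (count[b] > 1)
				flag_sort(keys + (end[b] - count[b]), count[b], level + 1);
		}
	}
}
```

A call such as flag_sort(keys, n, 0) starts at the most significant
byte. Like the sort described above, this sketch is not stable.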

The technique of normalization described above can also be extended
to the case of multiple keys. That is left for future work. (Thanks
to Peter Geoghegan for the suggestion to look into this area.)

Reviewed-by: Chengpeng Yan <chengpeng_yan@outlook.com>
Reviewed-by: zengman <zengman@halodbtech.com>
Reviewed-by: ChangAo Chen <cca5507@qq.com>
Reviewed-by: Álvaro Herrera <alvherre@kurilemu.de>
Reviewed-by: Chao Li <li.evan.chao@gmail.com> (earlier version)
Discussion: https://postgr.es/m/CANWCAZYzx7a7E9AY16Jt_U3+GVKDADfgApZ-42SYNiig8dTnFA@mail.gmail.com
2026-02-14 13:50:06 +07:00
.gitignore Fix build inconsistency due to the generation of wait-event code 2026-02-02 08:02:39 +09:00
acl.h Update copyright for 2026 2026-01-01 13:24:10 -05:00
aclchk_internal.h Update copyright for 2026 2026-01-01 13:24:10 -05:00
array.h Update copyright for 2026 2026-01-01 13:24:10 -05:00
arrayaccess.h Refactor att_align_nominal() to improve performance. 2026-02-02 14:39:50 -05:00
ascii.h Update copyright for 2026 2026-01-01 13:24:10 -05:00
attoptcache.h Update copyright for 2026 2026-01-01 13:24:10 -05:00
backend_progress.h Update copyright for 2026 2026-01-01 13:24:10 -05:00
backend_status.h Add backendType to PGPROC, replacing isRegularBackend 2026-02-04 13:06:04 +02:00
builtins.h Guard against unexpected dimensions of oidvector/int2vector. 2026-02-09 09:57:43 -05:00
bytea.h Update copyright for 2026 2026-01-01 13:24:10 -05:00
cash.h Convert *GetDatum() and DatumGet*() macros to inline functions 2022-09-27 20:50:21 +02:00
catcache.h Clarify where various catcache.h dlist_nodes are used 2026-01-06 14:39:36 +13:00
combocid.h Update copyright for 2026 2026-01-01 13:24:10 -05:00
conffiles.h Update copyright for 2026 2026-01-01 13:24:10 -05:00
date.h Remove #include <math.h> where not needed 2026-01-15 19:09:47 +01:00
datetime.h Update copyright for 2026 2026-01-01 13:24:10 -05:00
datum.h Update copyright for 2026 2026-01-01 13:24:10 -05:00
dsa.h Update copyright for 2026 2026-01-01 13:24:10 -05:00
elog.h Update copyright for 2026 2026-01-01 13:24:10 -05:00
evtcache.h Update copyright for 2026 2026-01-01 13:24:10 -05:00
expandeddatum.h Update copyright for 2026 2026-01-01 13:24:10 -05:00
expandedrecord.h Update copyright for 2026 2026-01-01 13:24:10 -05:00
float.h Update copyright for 2026 2026-01-01 13:24:10 -05:00
fmgrtab.h Update copyright for 2026 2026-01-01 13:24:10 -05:00
formatting.h Update copyright for 2026 2026-01-01 13:24:10 -05:00
freepage.h Rename AssertVariableIsOfType to StaticAssertVariableIsOfType 2026-02-03 08:45:24 +01:00
funccache.h Update copyright for 2026 2026-01-01 13:24:10 -05:00
geo_decls.h Update copyright for 2026 2026-01-01 13:24:10 -05:00
guc.h Cleanup for log_min_messages changes in 38e0190ced 2026-02-11 16:38:18 +01:00
guc_hooks.h Allow log_min_messages to be set per process type 2026-02-09 13:23:10 +01:00
guc_tables.h Update copyright for 2026 2026-01-01 13:24:10 -05:00
help_config.h Update copyright for 2026 2026-01-01 13:24:10 -05:00
hsearch.h Update copyright for 2026 2026-01-01 13:24:10 -05:00
index_selfuncs.h Update copyright for 2026 2026-01-01 13:24:10 -05:00
inet.h Update copyright for 2026 2026-01-01 13:24:10 -05:00
injection_point.h Update copyright for 2026 2026-01-01 13:24:10 -05:00
inval.h Update copyright for 2026 2026-01-01 13:24:10 -05:00
json.h Update copyright for 2026 2026-01-01 13:24:10 -05:00
jsonb.h Update copyright for 2026 2026-01-01 13:24:10 -05:00
jsonfuncs.h Update copyright for 2026 2026-01-01 13:24:10 -05:00
jsonpath.h Update copyright for 2026 2026-01-01 13:24:10 -05:00
logtape.h Update copyright for 2026 2026-01-01 13:24:10 -05:00
lsyscache.h Update copyright for 2026 2026-01-01 13:24:10 -05:00
memdebug.h Update copyright for 2026 2026-01-01 13:24:10 -05:00
memutils.h Fix accidentally cast away qualifiers 2026-01-26 16:02:31 +01:00
memutils_internal.h Update copyright for 2026 2026-01-01 13:24:10 -05:00
memutils_memorychunk.h Update copyright for 2026 2026-01-01 13:24:10 -05:00
meson.build Fix build inconsistency due to the generation of wait-event code 2026-02-02 08:02:39 +09:00
multirangetypes.h Update copyright for 2026 2026-01-01 13:24:10 -05:00
numeric.h Update copyright for 2026 2026-01-01 13:24:10 -05:00
palloc.h Update copyright for 2026 2026-01-01 13:24:10 -05:00
partcache.h Update copyright for 2026 2026-01-01 13:24:10 -05:00
pg_crc.h Update copyright for 2026 2026-01-01 13:24:10 -05:00
pg_locale.h Clean up ICU includes. 2026-01-06 17:19:51 -08:00
pg_locale_c.h Update copyright for 2026 2026-01-01 13:24:10 -05:00
pg_lsn.h Update copyright for 2026 2026-01-01 13:24:10 -05:00
pg_rusage.h Update copyright for 2026 2026-01-01 13:24:10 -05:00
pgstat_internal.h Update copyright for 2026 2026-01-01 13:24:10 -05:00
pgstat_kind.h Update copyright for 2026 2026-01-01 13:24:10 -05:00
pidfile.h Update copyright for 2026 2026-01-01 13:24:10 -05:00
plancache.h Update copyright for 2026 2026-01-01 13:24:10 -05:00
portal.h Update copyright for 2026 2026-01-01 13:24:10 -05:00
ps_status.h Speedup and increase usability of set proc title functions 2023-02-20 16:18:27 +13:00
queryenvironment.h Update copyright for 2026 2026-01-01 13:24:10 -05:00
rangetypes.h Update copyright for 2026 2026-01-01 13:24:10 -05:00
regproc.h Update copyright for 2026 2026-01-01 13:24:10 -05:00
rel.h Allow Boolean reloptions to have ternary values 2026-01-21 20:06:01 +01:00
relcache.h Update copyright for 2026 2026-01-01 13:24:10 -05:00
relfilenumbermap.h Update copyright for 2026 2026-01-01 13:24:10 -05:00
relmapper.h Update copyright for 2026 2026-01-01 13:24:10 -05:00
relptr.h Rename AssertVariableIsOfType to StaticAssertVariableIsOfType 2026-02-03 08:45:24 +01:00
reltrigger.h Update copyright for 2026 2026-01-01 13:24:10 -05:00
resowner.h Update copyright for 2026 2026-01-01 13:24:10 -05:00
rls.h Update copyright for 2026 2026-01-01 13:24:10 -05:00
ruleutils.h Update copyright for 2026 2026-01-01 13:24:10 -05:00
sampling.h Update copyright for 2026 2026-01-01 13:24:10 -05:00
selfuncs.h Update copyright for 2026 2026-01-01 13:24:10 -05:00
sharedtuplestore.h Update copyright for 2026 2026-01-01 13:24:10 -05:00
skipsupport.h Update copyright for 2026 2026-01-01 13:24:10 -05:00
snapmgr.h Update copyright for 2026 2026-01-01 13:24:10 -05:00
snapshot.h Update copyright for 2026 2026-01-01 13:24:10 -05:00
sortsupport.h Perform radix sort on SortTuples with pass-by-value Datums 2026-02-14 13:50:06 +07:00
spccache.h Update copyright for 2026 2026-01-01 13:24:10 -05:00
syscache.h Update copyright for 2026 2026-01-01 13:24:10 -05:00
timeout.h Update copyright for 2026 2026-01-01 13:24:10 -05:00
timestamp.h Update copyright for 2026 2026-01-01 13:24:10 -05:00
tuplesort.h Perform radix sort on SortTuples with pass-by-value Datums 2026-02-14 13:50:06 +07:00
tuplestore.h Update copyright for 2026 2026-01-01 13:24:10 -05:00
typcache.h Update copyright for 2026 2026-01-01 13:24:10 -05:00
tzparser.h Update copyright for 2026 2026-01-01 13:24:10 -05:00
usercontext.h Perform logical replication actions as the table owner. 2023-04-04 11:25:23 -04:00
uuid.h Update copyright for 2026 2026-01-01 13:24:10 -05:00
varbit.h Improve type handling of varlena structures 2026-02-11 07:33:24 +09:00
varlena.h Update copyright for 2026 2026-01-01 13:24:10 -05:00
wait_classes.h Update copyright for 2026 2026-01-01 13:24:10 -05:00
wait_event.h Update copyright for 2026 2026-01-01 13:24:10 -05:00
xid8.h Update copyright for 2026 2026-01-01 13:24:10 -05:00
xml.h Improve type handling of varlena structures 2026-02-11 07:33:24 +09:00