postgresql/src/include
David Rowley c456e39113 Optimize tuple deformation
This commit includes various optimizations to improve the performance of
tuple deformation.

We now precalculate CompactAttribute's attcacheoff, which allows us to
remove the code from the deform routines which was setting the
attcacheoff.  Setting the attcacheoff is now handled by
TupleDescFinalize(), which must be called before the TupleDesc is used for
anything.  Having TupleDescFinalize() means we can store the first
attribute in the TupleDesc which does not have an offset cached.  That
allows us to add a dedicated deforming loop to deform all attributes up
to the final one with an attcacheoff set, or up to the first NULL
attribute, whichever comes first.
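
As a rough illustration of the precalculation described above, the sketch
below caches offsets for the leading run of fixed-width attributes and
reports the first attribute that has no cached offset.  DemoAttr and
demo_finalize() are hypothetical names invented for this example; this is
not the body of TupleDescFinalize(), and it deliberately simplifies by
stopping at the first variable-width attribute.

```c
#include <stdint.h>

/* Hypothetical stand-in for the few attribute fields this sketch needs. */
typedef struct DemoAttr
{
    int16_t attlen;      /* > 0 for fixed width, -1 for variable width */
    uint8_t attalignby;  /* required alignment in bytes (power of two) */
    int32_t attcacheoff; /* cached byte offset, or -1 if not cacheable */
} DemoAttr;

/*
 * Assign offsets to the leading fixed-width attributes and mark the rest
 * as uncached.  Returns the index of the first attribute with no cached
 * offset, or natts if every attribute has one.
 */
static int
demo_finalize(DemoAttr *attrs, int natts)
{
    int32_t off = 0;
    int     i = 0;
    int     first_uncached;

    /* Leading fixed-width attributes: align, record, advance. */
    while (i < natts && attrs[i].attlen > 0)
    {
        int32_t align = attrs[i].attalignby;

        off = (off + align - 1) & ~(align - 1);
        attrs[i].attcacheoff = off;
        off += attrs[i].attlen;
        i++;
    }
    first_uncached = i;

    /* Everything from here on depends on the tuple's actual data. */
    for (; i < natts; i++)
        attrs[i].attcacheoff = -1;

    return first_uncached;
}
```

A deform loop can then walk attributes below the returned index using only
the cached offsets, and fall back to a slower loop for the remainder.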

Here we also improve tuple deformation performance of tuples with NULLs.
Previously, if the HEAP_HASNULL bit was set in the tuple's t_infomask,
deforming would, one-by-one, check each and every bit in the NULL bitmap
to see if it was zero.  Now, we process the NULL bitmap 1 byte at a time
rather than 1 bit at a time to find the attnum with the first NULL.  We
can now deform the tuple without checking for NULLs up to just before that
attribute.
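
The byte-at-a-time scan could look roughly like the sketch below.  It
assumes the heap tuple convention that a set bit in the NULL bitmap means
"not NULL", so any byte other than 0xFF contains a NULL (or padding).
first_null_attnum() is a hypothetical helper written for this example, not
the function added by the commit.

```c
#include <stdint.h>

/*
 * Return the zero-based number of the first NULL attribute, or natts if
 * none of the first natts attributes is NULL.  A set bit means "not NULL".
 */
static int
first_null_attnum(const uint8_t *bits, int natts)
{
    int nbytes = (natts + 7) / 8;

    for (int i = 0; i < nbytes; i++)
    {
        if (bits[i] != 0xFF)
        {
            /* Invert so that NULLs become set bits; inv is non-zero here. */
            uint8_t inv = (uint8_t) ~bits[i];
            int     bit = 0;
            int     attnum;

            while ((inv & 1) == 0)
            {
                inv >>= 1;
                bit++;
            }

            /* Padding bits past natts in the last byte are not attributes. */
            attnum = i * 8 + bit;
            return attnum < natts ? attnum : natts;
        }
    }
    return natts;
}
```

Whole bytes of 0xFF are skipped with a single comparison, so the per-bit
work only happens in the one byte that holds the first NULL.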

We also record the maximum attribute number which is guaranteed to exist
in the tuple, that is, the highest attribute which has a NOT NULL
constraint and isn't an atthasmissing attribute.  When deforming only
attributes prior to the guaranteed attnum, we've no need to access the
tuple's natts count.  As an
additional optimization, we only count fixed-width columns when
calculating the maximum guaranteed column, as this eliminates the need to
emit code to fetch byref types in the deformation loop for guaranteed
attributes.
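
One way to read the rule above is as a count over the leading attributes,
stopping at the first one that could be absent or NULL.  The sketch below
follows that reading; DemoCol and demo_guaranteed_natts() are hypothetical
names, and treating "fixed-width" as attlen > 0 is an assumption of this
example rather than something the commit message spells out.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical per-column properties used only by this sketch. */
typedef struct DemoCol
{
    bool    attnotnull;    /* column has a NOT NULL constraint */
    bool    atthasmissing; /* column may be absent from older tuples */
    int16_t attlen;        /* > 0 for fixed width, -1 for variable width */
} DemoCol;

/*
 * Count the leading columns that every valid tuple must physically
 * contain: NOT NULL, no missing value, and (by assumption) fixed width.
 */
static int
demo_guaranteed_natts(const DemoCol *cols, int natts)
{
    int n = 0;

    while (n < natts &&
           cols[n].attnotnull &&
           !cols[n].atthasmissing &&
           cols[n].attlen > 0)
        n++;

    return n;
}
```

A caller deforming no more than this many attributes would not need to
consult the tuple's own attribute count or NULL bitmap, provided the slot
has opted in as described in the next paragraph.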

Some locations in the code deform tuples that have yet to go through NOT
NULL constraint validation.  We're unable to perform the guaranteed
attribute optimization when that's the case.  This optimization is opt-in
via the TupleTableSlot using the TTS_FLAG_OBEYS_NOT_NULL_CONSTRAINTS
flag.

This commit also adds a more efficient way of populating the isnull
array by using a bit-wise SWAR trick which performs multiplication on the
inverse of the tuple's bitmap byte and masks out all but the lowest bit
of each boolean's byte.  This results in considerably more efficient code
than determining the NULLness via att_isnull().  This method processes 8
isnull elements at once, which means we need to round the tts_isnull
array size up to the next multiple of 8 bytes.  The palloc code
does this anyway, but the round-up needed to be formalized so as not to
overwrite the sentinel byte in MEMORY_CONTEXT_CHECKING builds.  Doing
this also allows the NULL-checking deforming loop to more efficiently
check the isnull array, rather than doing the bit-wise processing for each
attribute that att_isnull() does.
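
To make the SWAR idea concrete, the sketch below expands one bitmap byte
into eight 0/1 bytes using multiplication followed by a mask of each
byte's lowest bit.  It handles the byte as two 4-bit halves so that the
partial products cannot collide; whether the commit uses this exact
constant or split is not stated here, so treat spread_isnull8() and
store_isnull8() as hypothetical helpers rather than the committed code.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Expand one NULL bitmap byte (set bit = not NULL) into a 64-bit word
 * whose byte k is 1 if attribute k is NULL and 0 otherwise.
 */
static uint64_t
spread_isnull8(uint8_t bitmap_byte)
{
    uint8_t  inv = (uint8_t) ~bitmap_byte;  /* set bit now means NULL */

    /*
     * 0x204081 has bits 0, 7, 14 and 21 set, so multiplying a 4-bit value
     * by it moves bit k to bit position 8 * k.  The mask then discards
     * every other partial product.
     */
    uint32_t lo = ((uint32_t) (inv & 0x0F) * 0x204081u) & 0x01010101u;
    uint32_t hi = ((uint32_t) (inv >> 4) * 0x204081u) & 0x01010101u;

    return (uint64_t) lo | ((uint64_t) hi << 32);
}

/*
 * Endian-neutral store of the eight flags.  The destination must have
 * room for 8 elements, which is why the array size is rounded up.
 */
static void
store_isnull8(bool *isnull, uint64_t spread)
{
    for (int k = 0; k < 8; k++)
        isnull[k] = (bool) ((spread >> (8 * k)) & 1);
}
```

For example, a bitmap byte of 0xA5 has attributes 0, 2, 5 and 7 present,
so the flags for attributes 1, 3, 4 and 6 come out as true.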

The level of performance improvement from these changes seems to vary
depending on the CPU architecture.  Apple's M chips seem particularly
fond of the changes, with some of the tested deform-heavy queries running
over twice as fast as before.  With x86-64, the speedups aren't quite as
large.  For tables containing only a small number of columns, the
speedups are smaller.

Author: David Rowley <dgrowleyml@gmail.com>
Reviewed-by: Chao Li <li.evan.chao@gmail.com>
Reviewed-by: Andres Freund <andres@anarazel.de>
Reviewed-by: John Naylor <johncnaylorls@gmail.com>
Reviewed-by: Amit Langote <amitlangote09@gmail.com>
Reviewed-by: Zsolt Parragi <zsolt.parragi@percona.com>
Reviewed-by: Álvaro Herrera <alvherre@kurilemu.de>
Reviewed-by: Junwang Zhao <zhjwpku@gmail.com>
Discussion: https://postgr.es/m/CAApHDvpoFjaj3%2Bw_jD5uPnGazaw41A71tVJokLDJg2zfcigpMQ%40mail.gmail.com
2026-03-16 11:46:00 +13:00
access Optimize tuple deformation 2026-03-16 11:46:00 +13:00
archive Update copyright for 2026 2026-01-01 13:24:10 -05:00
backup Update copyright for 2026 2026-01-01 13:24:10 -05:00
bootstrap Simplify creation of built-in functions with non-default ACLs. 2026-03-05 17:43:09 -05:00
catalog Move fake LSN infrastructure out of GiST. 2026-03-13 19:38:17 -04:00
commands Optimize COPY FROM (FORMAT {text,csv}) using SIMD. 2026-03-13 11:07:32 -05:00
common Use fallthrough attribute instead of comment 2026-02-19 08:51:12 +01:00
datatype Update copyright for 2026 2026-01-01 13:24:10 -05:00
executor Optimize tuple deformation 2026-03-16 11:46:00 +13:00
fe_utils pg_dumpall: Fix handling of conflicting options. 2026-03-06 14:00:04 -06:00
foreign CREATE SUBSCRIPTION ... SERVER. 2026-03-06 08:27:56 -08:00
jit Fix typos and inconsistencies in code and comments 2026-01-05 09:19:15 +09:00
lib Make use of pg_popcount() in more places. 2026-02-23 09:26:00 -06:00
libpq Don't include proc.h in shm_mq.h 2026-02-27 10:53:47 +01:00
mb Replace pg_mblen() with bounds-checked versions. 2026-02-09 12:44:04 +13:00
nodes Change copyObject() to use typeof_unqual 2026-03-13 07:06:57 +01:00
optimizer Convert NOT IN sublinks to anti-joins when safe 2026-03-12 09:45:18 +09:00
parser Introduce the REPACK command 2026-03-10 19:56:39 +01:00
partitioning Update copyright for 2026 2026-01-01 13:24:10 -05:00
pch Update copyright for 2026 2026-01-01 13:24:10 -05:00
port Fix some -Wcast-qual warnings 2026-02-27 21:57:33 +01:00
portability instrumentation: Drop INSTR_TIME_SET_CURRENT_LAZY macro 2026-02-26 10:39:29 -05:00
postmaster Fix some -Wcast-qual warnings 2026-02-27 21:57:33 +01:00
regex Update copyright for 2026 2026-01-01 13:24:10 -05:00
replication Avoid including utils/timestamp.h in conflict.h. 2026-02-23 10:19:05 +05:30
rewrite Update copyright for 2026 2026-01-01 13:24:10 -05:00
snowball Update to latest Snowball sources. 2026-01-05 15:22:37 -05:00
statistics Add support for "mcv" in pg_restore_extended_stats() 2026-01-29 12:14:08 +09:00
storage bufmgr: Remove the, now obsolete, BM_JUST_DIRTIED 2026-03-11 14:58:29 -04:00
tcop Introduce the REPACK command 2026-03-10 19:56:39 +01:00
tsearch Replace pg_mblen() with bounds-checked versions. 2026-02-09 12:44:04 +13:00
utils Fix bug due to confusion about what IsMVCCSnapshot means 2026-03-13 13:53:19 -04:00
.gitignore Use <stdint.h> and <inttypes.h> for c.h integers. 2024-12-04 15:05:38 +13:00
c.h Make typeof and typeof_unqual fallback definitions work on C++11 2026-03-15 07:36:27 +01:00
fmgr.h Improve type handling of varlena structures 2026-02-11 07:33:24 +09:00
funcapi.h Update copyright for 2026 2026-01-01 13:24:10 -05:00
getopt_long.h Update copyright for 2026 2026-01-01 13:24:10 -05:00
Makefile Fix build inconsistency due to the generation of wait-event code 2026-02-02 08:02:39 +09:00
meson.build Ensure that all three build methods install the same set of files. 2026-02-16 15:20:15 -05:00
miscadmin.h Add password expiration warnings. 2026-02-11 10:36:15 -06:00
pg_config.h.in Change copyObject() to use typeof_unqual 2026-03-13 07:06:57 +01:00
pg_config_manual.h Update copyright for 2026 2026-01-01 13:24:10 -05:00
pg_getopt.h Update copyright for 2026 2026-01-01 13:24:10 -05:00
pg_trace.h Update copyright for 2026 2026-01-01 13:24:10 -05:00
pgstat.h Don't include wait_event.h in pgstat.h 2026-03-06 16:24:58 +01:00
pgtar.h Update copyright for 2026 2026-01-01 13:24:10 -05:00
pgtime.h Update copyright for 2026 2026-01-01 13:24:10 -05:00
port.h strnlen() is now required 2026-01-08 08:51:20 +01:00
postgres.h Remove Int8GetDatum function 2026-03-11 10:46:08 +01:00
postgres_ext.h Move pg_int64 back to postgres_ext.h 2025-09-16 10:48:56 +02:00
postgres_fe.h Update copyright for 2026 2026-01-01 13:24:10 -05:00
varatt.h Improve type handling of varlena structures 2026-02-11 07:33:24 +09:00
windowapi.h Update copyright for 2026 2026-01-01 13:24:10 -05:00