postgresql/src/backend
Noah Misch 14b1fd6176 Fix SUBSTRING() for toasted multibyte characters.
Commit 1e7fe06c10 changed pg_mbstrlen_with_len() to ereport(ERROR) if
the input ends in an incomplete character.  Most callers want that.
text_substring() does
not.  It detoasts the most bytes it could possibly need to get the
requested number of characters.  For example, to extract up to 2 chars
from UTF8, it needs to detoast 8 bytes.  In a string of 3-byte UTF8
chars, 8 bytes spans 2 complete chars and 1 partial char.
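
The arithmetic in that example can be checked with a minimal sketch.  The
helper names are hypothetical and do not exist in the PostgreSQL tree; the
sketch assumes a maximum UTF8 character length of 4 bytes and a string
made only of 3-byte characters.

```c
#include <stddef.h>

/*
 * Hypothetical helper: the most bytes that nchars characters can occupy
 * when no character is longer than max_char_len bytes.
 */
static size_t
worst_case_bytes(size_t nchars, size_t max_char_len)
{
    return nchars * max_char_len;
}

/*
 * Hypothetical helper: how many complete fixed-width characters fit in a
 * prefix of prefix_bytes bytes.
 */
static size_t
complete_chars(size_t prefix_bytes, size_t char_width)
{
    return prefix_bytes / char_width;
}

/*
 * Hypothetical helper: bytes left over after the last complete character.
 * A nonzero result means the prefix ends in a partial character.
 */
static size_t
leftover_bytes(size_t prefix_bytes, size_t char_width)
{
    return prefix_bytes % char_width;
}
```

With 2 requested chars and a 4-byte maximum, worst_case_bytes() gives 8.
For 3-byte chars, complete_chars(8, 3) is 2 and leftover_bytes(8, 3) is 2,
so the 8-byte prefix ends with 2 bytes of a third, incomplete char.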

Fix this by replacing this pg_mbstrlen_with_len() call with a string
traversal that differs by stopping upon finding as many chars as the
substring could need.  This also makes SUBSTRING() stop raising an
encoding error if the incomplete char is past the end of the substring.
This is consistent with the general philosophy of the above commit,
which was to raise errors on a just-in-time basis.  Before the above
commit, SUBSTRING() never raised an encoding error.
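
The idea of a bounded traversal can be illustrated with a simplified
sketch.  This is not the code from the commit: the function names are
hypothetical, only UTF8 is handled, continuation bytes are not validated,
and an incomplete character is reported with a return value of -1 rather
than ereport(ERROR).

```c
#include <stddef.h>

/*
 * Byte length of a UTF-8 character, judged from its lead byte only.
 * Simplified for illustration: an invalid lead byte counts as 1 byte.
 */
static size_t
utf8_char_len(unsigned char lead)
{
    if (lead < 0x80)
        return 1;
    if ((lead & 0xE0) == 0xC0)
        return 2;
    if ((lead & 0xF0) == 0xE0)
        return 3;
    if ((lead & 0xF8) == 0xF0)
        return 4;
    return 1;
}

/*
 * Count characters in the first len bytes of s, stopping as soon as
 * limit characters have been found.  Returns -1 if an incomplete
 * character is reached before the limit; an incomplete character that
 * lies past the limit is never examined.
 */
static long
count_chars_bounded(const char *s, size_t len, size_t limit)
{
    size_t pos = 0;
    long   nchars = 0;

    while (pos < len && (size_t) nchars < limit)
    {
        size_t clen = utf8_char_len((unsigned char) s[pos]);

        if (clen > len - pos)
            return -1;          /* character runs past the fetched bytes */
        pos += clen;
        nchars++;
    }
    return nchars;
}
```

Given the first 8 bytes of a string of 3-byte chars, a limit of 2 returns
2 without touching the trailing partial char, while a limit of 3 reaches
the partial char and returns -1.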

SUBSTRING() has long been detoasting enough for one more char than
needed, because it did not distinguish exclusive and inclusive end
position.  For avoidance of doubt, stop detoasting extra.
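
One way such an off-by-one can arise is sketched below.  This is an
assumption made for illustration, not a reading of the actual source:
positions are taken to be 1-based, and both function names are
hypothetical.

```c
#include <stddef.h>

/*
 * Characters needed from the start of the string to cover a substring
 * that begins at 1-based position start and is length chars long.
 * The last char included is at position start + length - 1.
 */
static size_t
chars_needed(size_t start, size_t length)
{
    return start + length - 1;
}

/*
 * The same computation with the exclusive end position (start + length)
 * mistaken for an inclusive one, which asks for one char too many.
 */
static size_t
chars_needed_with_extra(size_t start, size_t length)
{
    return start + length;
}
```

For a substring of 2 chars starting at position 1, chars_needed() gives 2
and chars_needed_with_extra() gives 3, which is the "one more char than
needed" pattern described above.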

Back-patch to v14, like the above commit.  For applications using
SUBSTRING() on non-ASCII column values, consider applying this to your
copy of any of the February 12, 2026 releases.

Reported-by: SATŌ Kentarō <ranvis@gmail.com>
Reviewed-by: Thomas Munro <thomas.munro@gmail.com>
Bug: #19406
Discussion: https://postgr.es/m/19406-9867fddddd724fca@postgresql.org
Backpatch-through: 14
2026-02-14 12:16:21 -08:00
access Guard against unexpected dimensions of oidvector/int2vector. 2026-02-09 09:57:44 -05:00
bootstrap Rethink definition of pg_attribute.attcompression. 2021-05-27 13:24:27 -04:00
catalog Harden _int_matchsel() against being attached to the wrong operator. 2026-02-09 10:14:22 -05:00
commands Harden _int_matchsel() against being attached to the wrong operator. 2026-02-09 10:14:22 -05:00
executor Fix bogus ctid requirement for dummy-root partitioned targets 2026-01-23 10:23:10 +09:00
foreign Restrict accesses to non-system views and foreign tables during pg_dump. 2024-08-05 06:05:23 -07:00
jit jit: Add missing inline pass for LLVM >= 17. 2026-01-22 16:06:06 +13:00
lib Accommodate very large dshash tables. 2024-12-17 15:24:45 -06:00
libpq Fix build breakage on Solaris-alikes with late-model GCC. 2025-07-23 15:44:29 -04:00
main Fix elog(FATAL) before PostmasterMain() or just after fork(). 2024-12-10 13:52:02 -08:00
nodes Build whole-row Vars the same way during parsing and planning. 2025-03-12 11:47:19 -04:00
optimizer Allow indexscans on partial hash indexes with implied quals. 2025-11-27 13:09:59 -05:00
parser Fix possible incorrect column reference in ERROR message 2026-01-09 11:04:39 +13:00
partitioning Fix creation of partition descriptor during concurrent detach+drop 2024-08-12 18:17:56 -04:00
po Translation updates 2026-02-08 15:11:05 +01:00
port Don't treat EINVAL from semget() as a hard failure. 2025-08-13 11:59:47 -04:00
postmaster Fix snapshot handling bug in recent BRIN fix 2025-11-04 20:31:43 +01:00
regex Fix recently-exposed portability issue in regex optimization. 2024-11-17 14:14:06 -05:00
replication Fix error message related to end TLI in backup manifest 2026-01-18 17:25:04 +09:00
rewrite Avoid rewriting data-modifying CTEs more than once. 2025-11-29 12:34:45 +00:00
snowball Avoid null pointer dereference crash after OOM in Snowball stemmers. 2025-02-18 21:24:12 -05:00
statistics Fix security checks in selectivity estimation functions. 2025-08-11 09:12:09 +01:00
storage Fix segfault from releasing locks in detached DSM segments 2026-01-16 13:20:11 +09:00
tcop Check for CREATE privilege on the schema in CREATE STATISTICS. 2025-11-10 09:00:00 -06:00
tsearch Require superuser to install a non-built-in selectivity estimator. 2026-02-09 10:07:31 -05:00
utils Fix SUBSTRING() for toasted multibyte characters. 2026-02-14 12:16:21 -08:00
.gitignore Add .gitignore entries for AIX-specific intermediate build artifacts. 2015-07-08 20:44:22 -04:00
common.mk Remove PARTIAL_LINKING build mode. 2018-03-30 17:33:04 -07:00
Makefile Use sort_template.h for qsort_tuple() and qsort_ssup(). 2021-03-03 17:02:32 +13:00
nls.mk Translation updates 2021-09-20 16:23:13 +02:00