postgresql/src/common
Michael Paquier 13c8adf90e Fix buffer overrun in unicode string normalization with empty input
PostgreSQL 13 and newer versions are directly impacted by this through
the SQL function normalize(): given an empty string as input, a call of
the normalization function would write one byte past its allocation
when recomposing the string with NFC or NFKC.  Older versions (v10~v12)
are not directly affected, as the only code path using normalization
there is SASLprep in SCRAM authentication, which forbids empty strings.
Let's make the code more robust there anyway, so that any out-of-core
callers of this function are covered.

The fix is simple: add a fast-exit path if the decomposed string is
found to be empty.  This can only happen for an empty input string, as
at its lowest level a codepoint is decomposed as itself if it has no
entry in the decomposition table or if it has a decomposition size of 0.
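
As an illustration only, the fast-exit idea could be sketched as below.
This is not the code from unicode_norm.c: sketch_decomposed_size() and
sketch_normalize() are hypothetical names, the decomposition is reduced
to "every codepoint decomposes to itself", and the recomposition stage
is an assumed placeholder that seeds its output with the first
codepoint and therefore expects at least one to exist.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for a decomposition-size lookup: in this sketch
 * every codepoint decomposes to itself, so it always contributes 1. */
static size_t
sketch_decomposed_size(uint32_t code)
{
    (void) code;
    return 1;
}

/*
 * Normalize a zero-terminated array of codepoints (sketch).
 * Returns a malloc'd, zero-terminated array, or NULL if out of memory.
 */
static uint32_t *
sketch_normalize(const uint32_t *input)
{
    size_t          decomp_size = 0;
    size_t          target_pos;
    size_t          i = 0;
    const uint32_t *p;
    uint32_t       *decomp;
    uint32_t       *recomp;

    assert(input != NULL);

    /* First pass: compute the size of the decomposed string. */
    for (p = input; *p; p++)
        decomp_size += sketch_decomposed_size(*p);

    decomp = malloc((decomp_size + 1) * sizeof(uint32_t));
    if (decomp == NULL)
        return NULL;

    /* Second pass: "decompose" (here, a plain copy) and terminate. */
    for (p = input; *p; p++)
        decomp[i++] = *p;
    decomp[decomp_size] = 0;

    /* Fast-exit path: an empty decomposed string needs no recomposition. */
    if (decomp_size == 0)
        return decomp;

    /*
     * Placeholder recomposition.  It seeds the output with decomp[0] and
     * starts writing at position 1, so without the fast exit above an
     * empty input would place the terminator at index 1 of a one-element
     * allocation.
     */
    recomp = malloc((decomp_size + 1) * sizeof(uint32_t));
    if (recomp == NULL)
    {
        free(decomp);
        return NULL;
    }
    recomp[0] = decomp[0];
    target_pos = 1;
    for (i = 1; i < decomp_size; i++)
        recomp[target_pos++] = decomp[i];
    recomp[target_pos] = 0;

    free(decomp);
    return recomp;
}
```

With this shape, an input holding only the terminator returns a
one-element array containing just the terminator, and the recomposition
stage never runs on it.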

Some tests are added to cover this issue in v13~.  Note that an empty
string has always been considered as normalized (grammar "IS NF[K]{C,D}
NORMALIZED", through the SQL function is_normalized()) for all the
operations allowed (NFC, NFD, NFKC and NFKD) since this feature was
introduced in 2991ac5.  This behavior is unchanged, but some tests are
added in v13~ to check it.

While at it, I have also checked "make normalization-check" in
src/common/unicode/ (works in 13~, and breaks in older stable branches
independently of this commit).

The release notes should just mention this commit for v13~.

Reported-by: Matthijs van der Vleuten
Discussion: https://postgr.es/m/17277-0c527a373794e802@postgresql.org
Backpatch-through: 10
2021-11-11 15:01:54 +09:00
..
unicode Initial pgindent and pgperltidy run for v13. 2020-05-14 13:06:50 -04:00
.gitignore Replace the data structure used for keyword lookup. 2019-01-06 17:02:57 -05:00
archive.c Move routine building restore_command to src/common/ 2020-03-24 12:13:36 +09:00
base64.c Update copyrights for 2020 2020-01-01 12:21:45 -05:00
checksum_helper.c Add checksum helper functions. 2020-04-03 11:52:43 -04:00
config_info.c Simplify passing of configure arguments to pg_config 2020-02-10 19:23:41 +01:00
controldata_utils.c Update copyrights for 2020 2020-01-01 12:21:45 -05:00
d2s.c Update copyrights for 2020 2020-01-01 12:21:45 -05:00
d2s_full_table.h Update copyrights for 2020 2020-01-01 12:21:45 -05:00
d2s_intrinsics.h Update copyrights for 2020 2020-01-01 12:21:45 -05:00
digit_table.h Change floating-point output format for improved performance. 2019-02-13 15:20:33 +00:00
encnames.c Rationalize code placement between wchar.c, encnames.c, and mbutils.c. 2020-01-16 18:08:21 -05:00
exec.c Add -c/--restore-target-wal to pg_rewind 2020-04-01 10:57:03 +09:00
f2s.c Update copyrights for 2020 2020-01-01 12:21:45 -05:00
fe_memutils.c Update copyrights for 2020 2020-01-01 12:21:45 -05:00
file_perm.c Update copyrights for 2020 2020-01-01 12:21:45 -05:00
file_utils.c Change client-side fsync_fname() to report errors fatally 2020-02-24 16:51:26 +01:00
hashfn.c Dial back -Wimplicit-fallthrough to level 3 2020-05-13 15:31:14 -04:00
ip.c Update copyrights for 2020 2020-01-01 12:21:45 -05:00
jsonapi.c Fix incautious handling of possibly-miscoded strings in client code. 2021-06-07 14:15:25 -04:00
keywords.c Update copyrights for 2020 2020-01-01 12:21:45 -05:00
kwlookup.c Update copyrights for 2020 2020-01-01 12:21:45 -05:00
link-canary.c Update copyrights for 2020 2020-01-01 12:21:45 -05:00
logging.c Fix command-line colorization on Windows with VT100-compatible environments 2020-03-02 15:45:34 +09:00
Makefile Move frontend-side archive APIs from src/common/ to src/fe_utils/ 2020-06-11 15:48:56 +09:00
md5.c Update copyrights for 2020 2020-01-01 12:21:45 -05:00
pg_lzcompress.c Second thoughts on TOAST decompression. 2020-11-02 11:25:18 -05:00
pgfnames.c Update copyrights for 2020 2020-01-01 12:21:45 -05:00
protocol_openssl.c Move OpenSSL routines for min/max protocol setting to src/common/ 2020-01-17 10:06:17 +09:00
psprintf.c Update copyrights for 2020 2020-01-01 12:21:45 -05:00
relpath.c Add declaration-level assertions for compile-time checks 2020-02-03 14:48:42 +09:00
restricted_token.c Improve error messages after LoadLibrary() 2020-04-13 10:24:46 +02:00
rmtree.c Update copyrights for 2020 2020-01-01 12:21:45 -05:00
ryu_common.h Update copyrights for 2020 2020-01-01 12:21:45 -05:00
saslprep.c Add support for other normal forms to Unicode normalization API 2020-03-24 10:02:46 +01:00
scram-common.c Initial pgindent and pgperltidy run for v13. 2020-05-14 13:06:50 -04:00
sha2.c Update copyrights for 2020 2020-01-01 12:21:45 -05:00
sha2_openssl.c Update copyrights for 2020 2020-01-01 12:21:45 -05:00
string.c Update copyrights for 2020 2020-01-01 12:21:45 -05:00
stringinfo.c Update copyrights for 2020 2020-01-01 12:21:45 -05:00
unicode_norm.c Fix buffer overrun in unicode string normalization with empty input 2021-11-11 15:01:54 +09:00
username.c Update copyrights for 2020 2020-01-01 12:21:45 -05:00
wait_error.c Update copyrights for 2020 2020-01-01 12:21:45 -05:00
wchar.c Fix incautious handling of possibly-miscoded strings in client code. 2021-06-07 14:15:25 -04:00