postgresql/src
Tom Lane a53e0ea782 In the Snowball dictionary, don't try to stem excessively-long words.
If the input word exceeds 1000 bytes, don't pass it to the stemmer;
just return it as-is after case folding.  Such an input is surely
not a word in any human language, so whatever the stemmer might
do to it would be pretty dubious in the first place.  Adding this
restriction protects us against a known recursion-to-stack-overflow
problem in the Turkish stemmer, and it seems like good insurance
against any other safety or performance issues that may exist in
the Snowball stemmers.  (I note, for example, that they contain no
CHECK_FOR_INTERRUPTS calls, so we really don't want them running
for a long time.)  The threshold of 1000 bytes is arbitrary.

An alternative definition could have been to treat such words as
stopwords, but that seems like a bigger break from the old behavior.

Per report from Egor Chindyaskin and Alexander Lakhin.
Thanks to Olly Betts for the recommendation to fix it this way.

Discussion: https://postgr.es/m/1661334672.728714027@f473.i.mail.ru
2022-08-31 10:42:05 -04:00
..
backend In the Snowball dictionary, don't try to stem excessively-long words. 2022-08-31 10:42:05 -04:00
bin pg_upgrade: Fix some minor code issues 2022-08-13 00:15:59 +02:00
common Inhibit mingw CRT's auto-globbing of command line arguments 2022-04-25 15:50:07 -04:00
fe_utils Clean up assorted failures under clang's -fsanitize=undefined checks. 2022-03-03 18:13:24 -05:00
include Repair rare failure of MULTIEXPR_SUBLINK subplans in inherited updates. 2022-08-27 12:11:20 -04:00
interfaces Add missing bad-PGconn guards in libpq entry points. 2022-08-15 15:40:07 -04:00
makefiles Select CFLAGS_SL at configure time, not in platform-specific Makefiles. 2019-10-21 12:32:35 -04:00
pl Translation updates 2022-08-08 12:39:52 +02:00
port Stamp 12.12. 2022-08-08 16:47:33 -04:00
template On NetBSD, force dynamic symbol resolution at postmaster start. 2022-08-30 17:29:13 -04:00
test Repair rare failure of MULTIEXPR_SUBLINK subplans in inherited updates. 2022-08-27 12:11:20 -04:00
timezone Update time zone data files to tzdata release 2022a. 2022-05-05 14:55:17 -04:00
tools Improve setup of environment values for commands in MSVC's vcregress.pl 2022-05-11 10:22:37 +09:00
tutorial tutorial: land height is "elevation", not "altitude" 2021-03-10 20:25:18 -05:00
.gitignore Convert cvsignore to gitignore, and add .gitignore for build targets. 2010-09-22 12:57:04 +02:00
DEVELOPERS Replace a couple of references to files that no longer exist in the source 2009-05-04 08:08:47 +00:00
Makefile Fix partial-build problems introduced by having more generated headers. 2018-04-09 16:42:10 -04:00
Makefile.global.in Fix prove_installcheck to use correct paths when used with PGXS 2021-07-01 08:47:04 -04:00
Makefile.shlib Fix pkg-config files for static linking 2021-09-06 09:43:18 +02:00
nls-global.mk NLS: Fix backend gettext triggers 2019-09-23 09:05:50 +02:00