postgresql/src
Tom Lane fb12aefaaf Clean up formatting.c's logic for matching constant strings.
seq_search(), which is used to match input substrings to constants
such as month and day names, had a lot of bizarre and unnecessary
behaviors.  It was mostly possible to avert our eyes from that before,
but we don't want to duplicate those behaviors in the upcoming patch
to allow recognition of non-English month and day names.  So it's time
to clean this up.  In particular:

* seq_search scribbled on the input string, which is a pretty dangerous
thing to do, especially in the badly underdocumented way it was done here.
Fortunately the input string is a temporary copy, but that was being made
three subroutine levels away, making it something easy to break
accidentally.  The behavior is externally visible nonetheless, in the form
of odd case-folding in error reports about unrecognized month/day names.
The scribbling is evidently being done to save a few calls to pg_tolower,
but that's such a cheap function (at least for ASCII data) that it's
pretty pointless to worry about.  In HEAD I switched it to be
pg_ascii_tolower to ensure it is cheap in all cases; but there are corner
cases in Turkish where this'd change behavior, so leave it as pg_tolower
in the back branches.

* seq_search insisted on knowing the case form (all-upper, all-lower,
or initcap) of the constant strings, so that it didn't have to case-fold
them to perform case-insensitive comparisons.  This likewise seems like
excessive micro-optimization, given that pg_tolower is certainly very
cheap for ASCII data.  It seems unsafe to assume that we know the case
form that will come out of pg_locale.c for localized month/day names, so
it's better just to define the comparison rule as "downcase all strings
before comparing".  (The choice between downcasing and upcasing is
arbitrary so far as English is concerned, but it might not be in other
locales, so follow citext's lead here.)

* seq_search also had a parameter that'd cause it to report a match
after a maximum number of characters, even if the constant string were
longer than that.  This was not actually used because no caller passed
a value small enough to cut off a comparison.  Replicating that behavior
for localized month/day names seems expensive as well as useless, so
let's get rid of that too.

* from_char_seq_search used the maximum-length parameter to truncate
the input string in error reports about not finding a matching name.
This leads to rather confusing reports in many cases.  Worse, it is
outright dangerous if the input string isn't all-ASCII, because we
risk truncating the string in the middle of a multibyte character.
That'd lead either to delivering an illegible error message to the
client, or to encoding-conversion failures that obscure the actual
data problem.  Get rid of that in favor of truncating at whitespace
if any (a suggestion due to Alvaro Herrera).

In addition to fixing these things, I const-ified the input string
pointers of DCH_from_char and its subroutines, to make sure there
aren't any other scribbling-on-input problems.

The risk of generating a badly-encoded error message seems like
enough of a bug to justify back-patching, so patch all supported
branches.

Discussion: https://postgr.es/m/29432.1579731087@sss.pgh.pa.us
2020-01-23 13:42:10 -05:00
..
backend Clean up formatting.c's logic for matching constant strings. 2020-01-23 13:42:10 -05:00
bin Fix pg_dump's sigTermHandler() to use _exit() not exit(). 2020-01-20 12:57:17 -05:00
common Tolerate EINVAL when calling fsync() on a directory. 2019-02-24 23:51:54 +13:00
fe_utils Fix translation of special characters in psql's LaTeX output modes. 2018-11-26 17:32:51 -05:00
include Add GUC variables for stat tracking and timeout as PGDLLIMPORT 2020-01-21 13:47:01 +09:00
interfaces libpq should expose GSS-related parameters even when not implemented. 2019-12-20 15:34:08 -05:00
makefiles Select CFLAGS_SL at configure time, not in platform-specific Makefiles. 2019-10-21 12:32:36 -04:00
pl Fix possible loss of sync between rectypeid and underlying PLpgSQL_type. 2019-12-26 15:19:39 -05:00
port In pgwin32_open, loop after ERROR_ACCESS_DENIED only if we can't stat. 2019-12-21 17:39:36 -05:00
template Select CFLAGS_SL at configure time, not in platform-specific Makefiles. 2019-10-21 12:32:36 -04:00
test Clean up formatting.c's logic for matching constant strings. 2020-01-23 13:42:10 -05:00
timezone Update time zone data files to tzdata release 2019c. 2019-09-20 19:54:00 -04:00
tools Handle spaces in OpenSSL install location for MSVC 2019-10-04 15:39:19 -04:00
tutorial Update copyright for 2018 2018-01-02 23:30:12 -05:00
.gitignore Convert cvsignore to gitignore, and add .gitignore for build targets. 2010-09-22 12:57:04 +02:00
DEVELOPERS Replace a couple of references to files that no longer exist in the source 2009-05-04 08:08:47 +00:00
Makefile Fix partial-build problems introduced by having more generated headers. 2018-04-09 16:42:10 -04:00
Makefile.global.in Select CFLAGS_SL at configure time, not in platform-specific Makefiles. 2019-10-21 12:32:36 -04:00
Makefile.shlib Ensure static libraries have correct mod time even if ranlib messes it up. 2018-11-29 15:53:44 -05:00
nls-global.mk nls-global.mk: search build dir for source files, too 2016-06-07 18:55:18 -04:00