Commit 803716013d installed a safeguard against loading plpython2
and plpython3 at the same time, but asserted that both could still be
used in the same database, just not in the same session. However, that's
not actually all that practical because dumping and reloading will fail
(since both libraries necessarily get loaded into the restoring session).
pg_upgrade is even worse, because it checks for missing libraries by
loading every .so library mentioned in the entire installation into one
session, so that you can have only one across the whole cluster.
We can improve matters by not throwing the error immediately in _PG_init,
but only when and if we're asked to do something that requires calling
into libpython. This ameliorates both of the above situations, since
while execution of CREATE LANGUAGE, CREATE FUNCTION, etc will result in
loading plpython, it isn't asked to do anything interesting (at least
not if check_function_bodies is off, as it will be during a restore).
It's possible that this opens some corner-case holes in which a crash
could be provoked with sufficient effort. However, since plpython
only exists as an untrusted language, any such crash would require
superuser privileges, making it "don't do that" not a security issue.
To reduce the hazards in this area, the error is still FATAL when it
does get thrown.
Per a report from Paul Jones. Back-patch to 9.2, which is as far back
as the patch applies without work. (It could be made to work in 9.1,
but given the lack of previous complaints, I'm disinclined to expend
effort so far back. We've been pretty desultory about support for
Python 3 in 9.1 anyway.)
The error message wording for AttributeError has changed in Python 3.5.
For the plpython_error test, add a new expected file. In the
plpython_subtransaction test, we didn't really care what the exception
is, only that it is something coming from Python. So use a generic
exception instead, which has a message that doesn't vary across
versions.
PLyString_ToComposite() blithely overwrote proc->result.out.d, even though
for a composite result type the other union variant proc->result.out.r is
the one that should be valid. This could result in a crash if out.r had
in fact been filled in (proc->result.is_rowtype == 1) and then somebody
later attempted to use that data; as per bug #13579 from Paweł Michalak.
Just to add insult to injury, it didn't work for RECORD results anyway,
because record_in() would refuse the case.
Fix by doing the I/O function lookup in a local PLyTypeInfo variable,
as we were doing already in PLyObject_ToComposite(). This is not a great
technique because any fn_extra data allocated by the input function will
be leaked permanently (thanks to using TopMemoryContext as fn_mcxt).
But that's a pre-existing issue that is much less serious than a crash,
so leave it to be fixed separately.
This bug would be a potential security issue, except that plpython is
only available to superusers and the crash requires coding the function
in a way that didn't work before today's patches.
Add regression test cases covering all the supported methods of converting
composite results.
Back-patch to 9.1 where the faulty coding was introduced.
This reverts commit 16304a0134, except
for its changes in src/port/snprintf.c; as well as commit
cac18a76bb which is no longer needed.
Fujii Masao reported that the previous commit caused failures in psql on
OS X, since if one exits the pager program early while viewing a query
result, psql sees an EPIPE error from fprintf --- and the wrapper function
thought that was reason to panic. (It's a bit surprising that the same
does not happen on Linux.) Further discussion among the security list
concluded that the risk of other such failures was far too great, and
that the one-size-fits-all approach to error handling embodied in the
previous patch is unlikely to be workable.
This leaves us again exposed to the possibility of the type of failure
envisioned in CVE-2015-3166. However, that failure mode is strictly
hypothetical at this point: there is no concrete reason to believe that
an attacker could trigger information disclosure through the supposed
mechanism. In the first place, the attack surface is fairly limited,
since so much of what the backend does with format strings goes through
stringinfo.c or psprintf(), and those already had adequate defenses.
In the second place, even granting that an unprivileged attacker could
control the occurrence of ENOMEM with some precision, it's a stretch to
believe that he could induce it just where the target buffer contains some
valuable information. So we concluded that the risk of non-hypothetical
problems induced by the patch greatly outweighs the security risks.
We will therefore revert, and instead undertake closer analysis to
identify specific calls that may need hardening, rather than attempt a
universal solution.
We have kept the portion of the previous patch that improved snprintf.c's
handling of errors when it calls the platform's sprintf(). That seems to
be an unalloyed improvement.
Security: CVE-2015-3166
All known standard library implementations of these functions can fail
with ENOMEM. A caller neglecting to check for failure would experience
missing output, information exposure, or a crash. Check return values
within wrappers and code, currently just snprintf.c, that bypasses the
wrappers. The wrappers do not return after an error, so their callers
need not check. Back-patch to 9.0 (all supported versions).
Popular free software standard library implementations do take pains to
bypass malloc() in simple cases, but they risk ENOMEM for floating point
numbers, positional arguments, large field widths, and large precisions.
No specification demands such caution, so this commit regards every call
to a printf family function as a potential threat.
Injecting the wrappers implicitly is a compromise between patch scope
and design goals. I would prefer to edit each call site to name a
wrapper explicitly. libpq and the ECPG libraries would, ideally, convey
errors to the caller rather than abort(). All that would be painfully
invasive for a back-patched security fix, hence this compromise.
Security: CVE-2015-3166
This test previously used a data value containing U+0080, and would
therefore fail if the database encoding didn't have an equivalent to
that; which only about half of our supported server encodings do.
We could fall back to using some plain-ASCII character, but that seems
like it's losing most of the point of the test. Instead switch to using
U+00A0 (no-break space), which translates into all our supported encodings
except the four in the EUC_xx family.
Per buildfarm testing. Back-patch to 9.1, which is as far back as this
test is expected to succeed everywhere. (9.0 has the test, but without
back-patching some 9.1 code changes we could not expect to get consistent
results across platforms anyway.)
Back-patch commit d0765d50f4 into 9.3
and 9.2, which is as far back as we previously bothered to adjust the
regression tests for Python 3.3. Per gripe from Honza Horak.
As of Xcode 5.0, Apple isn't including the Python framework as part of the
SDK-level files, which means that linking to it might fail depending on
whether Xcode thinks you've selected a specific SDK version. According to
their Tech Note 2328, they've basically deprecated the framework method of
linking to libpython and are telling people to link to the shared library
normally. (I'm pretty sure this is in direct contradiction to the advice
they were giving a few years ago, but whatever.) Testing says that this
approach works fine at least as far back as OS X 10.4.11, so let's just
rip out the framework special case entirely. We do still need a special
case to decide that OS X provides a shared library at all, unfortunately
(I wonder why the distutils check doesn't work ...). But this is still
less of a special case than before, so it's fine.
Back-patch to all supported branches, since we'll doubtless be hearing
about this more as more people update to recent Xcode.
This was not changed in HEAD, but will be done later as part of a
pgindent run. Future pgindent runs will also do this.
Report by Tom Lane
Backpatch through all supported branches, but not HEAD
We must increment the refcount on "plntup" as soon as we have the
reference, not sometime later. Otherwise, if an error is thrown in
between, the Py_XDECREF(plntup) call in the PG_CATCH block removes a
refcount we didn't add, allowing the object to be freed even though
it's still part of the plpython function's parsetree.
This appears to be the cause of crashes seen on buildfarm member
prairiedog. It's a bit surprising that we've not seen it fail repeatably
before, considering that the regression tests have been exercising the
faulty code path since 2009.
The real-world impact is probably minimal, since it's unlikely anyone would
be provoking the "TD["new"] is not a dictionary" error in production, and
that's the only case that is actually wrong. Still, it's a bug affecting
the regression tests, so patch all supported branches.
In passing, remove dead variable "plstr", and demote "platt" to a local
variable inside the PG_TRY block, since we don't need to clean it up
in the PG_CATCH path.
The primary role of PL validators is to be called implicitly during
CREATE FUNCTION, but they are also normal functions that a user can call
explicitly. Add a permissions check to each validator to ensure that a
user cannot use explicit validator calls to achieve things he could not
otherwise achieve. Back-patch to 8.4 (all supported versions).
Non-core procedural language extensions ought to make the same two-line
change to their own validators.
Andres Freund, reviewed by Tom Lane and Noah Misch.
Security: CVE-2014-0061
Similar to 2cfb1c6f77, the order in which
dictionary elements are printed is not reliable. This reappeared in the
tests of the string representation of result objects. Reduce the test
case to one result set column so that there is no question of order.
plpgsql often just remembers SPI-result tuple tables in local variables,
and has no mechanism for freeing them if an ereport(ERROR) causes an escape
out of the execution function whose local variable it is. In the original
coding, that wasn't a problem because the tuple table would be cleaned up
when the function's SPI context went away during transaction abort.
However, once plpgsql grew the ability to trap exceptions, repeated
trapping of errors within a function could result in significant
intra-function-call memory leakage, as illustrated in bug #8279 from
Chad Wagner.
We could fix this locally in plpgsql with a bunch of PG_TRY/PG_CATCH
coding, but that would be tedious, probably slow, and prone to bugs of
omission; moreover it would do nothing for similar risks elsewhere.
What seems like a better plan is to make SPI itself responsible for
freeing tuple tables at subtransaction abort. This patch attacks the
problem that way, keeping a list of live tuple tables within each SPI
function context. Currently, such freeing is automatic for tuple tables
made within the failed subtransaction. We might later add a SPI call to
mark a tuple table as not to be freed this way, allowing callers to opt
out; but until someone exhibits a clear use-case for such behavior, it
doesn't seem worth bothering.
A very useful side-effect of this change is that SPI_freetuptable() can
now defend itself against bad calls, such as duplicate free requests;
this should make things more robust in many places. (In particular,
this reduces the risks involved if a third-party extension contains
now-redundant SPI_freetuptable() calls in error cleanup code.)
Even though the leakage problem is of long standing, it seems imprudent
to back-patch this into stable branches, since it does represent an API
semantics change for SPI users. We'll patch this in 9.3, but live with
the leakage in older branches.
If an error is thrown out of the datatype I/O functions called by this
function, we need to do subtransaction cleanup, which the previous coding
entirely failed to do. Fortunately, both existing callers of this function
already have proper cleanup logic, so re-throwing the exception is enough.
Also, postpone creation of the resultset tupdesc until after the I/O
conversions are complete, so that we won't leak memory in TopMemoryContext
when such an error happens.
With -Wtype-limits, gcc correctly points out that size_t can never be < 0.
Backpatch to 9.3 and 9.2. It's been like this forever, but in <= 9.1 you got
a lot other warnings with -Wtype-limits anyway (at least with my version of
gcc).
Andres Freund
Memory was allocated based on the sizeof a type that was not the type of
the pointer that the result was being assigned to. The types happen to
be of the same size, but it's still wrong.
This confused Cygwin's make because of the colon in the path. The
DLL isn't likely to change under us so preserving the dependency
doesn't gain us much, and it's useful to be able to do a native
Windows build with the Cygwin mingw toolset.
Noah Misch.
If a PL/Python function raises an SPIError (or one if its subclasses)
directly with python's raise statement, treat it the same as an SPIError
generated internally. In particular, if the user sets the sqlstate
attribute, preserve that.
Oskari Saarenmaa and Jan Urbański, reviewed by Karl O. Pinc.
plpython tried to use a single cache entry for a trigger function, but it
needs a separate cache entry for each table the trigger is applied to,
because there is table-dependent data in there. This was done correctly
before 9.1, but commit 46211da1b8 broke it
by simplifying the lookup key from "function OID and triggered table OID"
to "function OID and is-trigger boolean". Go back to using both OIDs
as the lookup key. Per bug report from Sandro Santilli.
Andres Freund
The PL/Python build on OS X was previously hardcoded to use the system
installation of Python, ignoring whatever was specified to configure.
Except that it would use the header files from configure, which could
lead to mismatches. It was not possible to build against a custom
Python installation.
Now, we check in configure how the specified Python installation was
built and use that, supporting framework and non-framework builds.
This reverts commit be0dfbad36.
The previous information that Py_RETURN_TRUE and Py_RETURN_FALSE are
supported in Python 2.3 is wrong. They require Python 2.4. Update the
comment about that.
This was used in a time when a shared libperl or libpython was difficult
to come by. That is obsolete, and the idea behind the flag was never
fully portable anyway and will likely fail on more modern CPU
architectures.
Currently, we are making mangled copies of plpython/{expected,sql} to
plpython/python3/{expected,sql}, and run the tests in
plpython/python3. This has the disadvantage that the regression.diffs
file, if any, ends up in plpython/python3, which is not the normal
location. If we instead make the mangled copies in
plpython/{expected,sql}/python3/, we can run the tests from the normal
directory, regression.diffs ends up the normal place, and the
pg_regress invocation also becomes a lot simpler. It's also more
obvious at run time what's going on, because the tests end up being
named "python3/something" in the test output.
Commit 2cfb1c6f77 fixed some issues caused
by Python 3.3 choosing to iterate through dict entries in a different order
than before. But here's another one: the test cases adjusted here made two
bad entries in a dict and expected the one complained of would always be
the same.
Possibly this should be back-patched further than 9.2, but there seems
little point unless the earlier fix is too.
This reduces unnecessary exposure of other headers through htup.h, which
is very widely included by many files.
I have chosen to move the function prototypes to the new file as well,
because that means htup.h no longer needs to include tupdesc.h. In
itself this doesn't have much effect in indirect inclusion of tupdesc.h
throughout the tree, because it's also required by execnodes.h; but it's
something to explore in the future, and it seemed best to do the htup.h
change now while I'm busy with it.
We used to convert the unicode object directly to a string in the server
encoding by calling Python's PyUnicode_AsEncodedString function. In other
words, we used Python's routines to do the encoding. However, that has a
few problems. First of all, it required keeping a mapping table of Python
encoding names and PostgreSQL encodings. But the real killer was that Python
doesn't support EUC_TW and MULE_INTERNAL encodings at all.
Instead, convert the Python unicode object to UTF-8, and use PostgreSQL's
encoding conversion functions to convert from UTF-8 to server encoding. We
were already doing the same in the other direction in PLyUnicode_FromString,
so this is more consistent, too.
Note: This makes SQL_ASCII to behave more leniently. We used to map
SQL_ASCII to Python's 'ascii', which on Python means strict 7-bit ASCII
only, so you got an error if the python string contained anything but pure
ASCII. You no longer get an error; you get the UTF-8 representation of the
string instead.
Backpatch to 9.0, where these conversions were introduced.
Jan Urbański
That caused the plpython_unicode regression test to fail on SQL_ASCII
encoding, as evidenced by the buildfarm. The reason is that with the patch,
you don't get the detail in the error message that you got before. That
detail is actually very informative, so rather than just adjust the expected
output, let's revert that part of the patch for now to make the buildfarm
green again, and figure out some other way to avoid the recursion of
PLy_elog() that doesn't lose the detail.
Windows encodings, "win1252" and so forth, are named differently in Python,
like "cp1252". Also, if the PyUnicode_AsEncodedString() function call fails
for some reason, use a plain ereport(), not a PLy_elog(), to report that
error. That avoids recursion and crash, if PLy_elog() tries to call
PLyUnicode_Bytes() again.
This fixes bug reported by Asif Naeem. Backpatch down to 9.0, before that
plpython didn't even try these conversions.
Jan Urbański, with minor comment improvements by me.
The string representation of ImportError changed. Remove printing
that; it's not necessary for the test.
The order in which members of a dict are printed changed. But this
was always implementation-dependent, so we have just been lucky for a
long time. Do the printing the hard way to ensure sorted order.