Fix datum_image_*()'s inability to detect sign-extension variations

Functions such as hash_numeric() are not careful to use the PG_RETURN_*()
macro that matches the function's return type as defined in pg_proc.
Because hash_numeric() is declared to return int32, a hashed value of 2^31
or greater should be represented as a negative number, but the 64-bit Datum
is not sign-extended, so it differs from the Datum that would result from
casting the value to int32 on a two's complement machine.  This isn't
harmless, as both datum_image_eq() and datum_image_hash() may receive a
Datum that has been formed into and deformed from a tuple in some cases,
and one that has not in other cases.  When formed into a tuple, the Datum
value is coerced into an integer according to the attlen specified by the
TupleDesc.  This can result in two Datums that should be equal being
classed as not equal, which could result in, among other problems, an error
such as:

ERROR:  could not find memoization table entry

Here we fix this by ensuring we cast the Datum value to a signed integer
according to the typLen specified in the datum_image_eq/datum_image_hash
function call before comparing or hashing.

Author: David Rowley <dgrowleyml@gmail.com>
Reported-by: Tender Wang <tndrwang@gmail.com>
Backpatch-through: 14
Discussion: https://postgr.es/m/CAHewXNmcXVFdB9_WwA8Ez0P+m_TQy_KzYk5Ri5dvg+fuwjD_yw@mail.gmail.com
David Rowley 2026-03-30 16:16:39 +13:00
parent f1298a4c20
commit d29808e35d

@@ -258,8 +258,13 @@ datumIsEqual(Datum value1, Datum value2, bool typByVal, int typLen)
 /*-------------------------------------------------------------------------
  * datum_image_eq
  *
- * Compares two datums for identical contents, based on byte images. Return
- * true if the two datums are equal, false otherwise.
+ * Compares two datums for identical contents when coerced to a signed integer
+ * of typLen bytes. Return true if the two datums are equal, false otherwise.
+ *
+ * The coercion is required as we're not always careful to use the correct
+ * PG_RETURN_* macro. If we didn't do this, a Datum that's been formed and
+ * deformed into a tuple may not have the same signed representation as the
+ * other datum value.
  *-------------------------------------------------------------------------
  */
 bool
@@ -271,7 +276,21 @@ datum_image_eq(Datum value1, Datum value2, bool typByVal, int typLen)
 	if (typByVal)
 	{
-		result = (value1 == value2);
+		switch (typLen)
+		{
+			case sizeof(char):
+				result = (DatumGetChar(value1) == DatumGetChar(value2));
+				break;
+			case sizeof(int16):
+				result = (DatumGetInt16(value1) == DatumGetInt16(value2));
+				break;
+			case sizeof(int32):
+				result = (DatumGetInt32(value1) == DatumGetInt32(value2));
+				break;
+			default:
+				result = (value1 == value2);
+				break;
+		}
 	}
 	else if (typLen > 0)
 	{
@@ -328,10 +347,11 @@ datum_image_eq(Datum value1, Datum value2, bool typByVal, int typLen)
 /*-------------------------------------------------------------------------
  * datum_image_hash
  *
- * Generate a hash value based on the binary representation of 'value'. Most
- * use cases will want to use the hash function specific to the Datum's type,
- * however, some corner cases require generating a hash value based on the
- * actual bits rather than the logical value.
+ * Generate a hash value based on the binary representation of 'value' when
+ * represented as a signed integer of typLen bytes. Most use cases will want
+ * to use the hash function specific to the Datum's type, however, some corner
+ * cases require generating a hash value based on the actual bits rather than
+ * the logical value.
  *-------------------------------------------------------------------------
  */
 uint32
@@ -341,7 +361,23 @@ datum_image_hash(Datum value, bool typByVal, int typLen)
 	uint32		result;
 	if (typByVal)
+	{
+		switch (typLen)
+		{
+			case sizeof(char):
+				value = CharGetDatum(DatumGetChar(value));
+				break;
+			case sizeof(int16):
+				value = Int16GetDatum(DatumGetInt16(value));
+				break;
+			case sizeof(int32):
+				value = Int32GetDatum(DatumGetInt32(value));
+				break;
+				/* Nothing needs done for 64-bit types */
+		}
 		result = hash_bytes((unsigned char *) &value, sizeof(Datum));
+	}
 	else if (typLen > 0)
 		result = hash_bytes((unsigned char *) DatumGetPointer(value), typLen);
 	else if (typLen == -1)