XLOG (and related) changes:
* Store two past checkpoint locations, not just one, in pg_control.
On startup, we fall back to the older checkpoint if the newer one
is unreadable. Also, a physical copy of the newest checkpoint record
is kept in pg_control for possible use in disaster recovery (ie,
complete loss of pg_xlog). Also add a version number for pg_control
itself. Remove archdir from pg_control; it ought to be a GUC
parameter, not a special case (not that it's implemented yet anyway).
* Suppress successive checkpoint records when nothing has been entered
in the WAL log since the last one. This is not so much to avoid I/O
as to make it actually useful to keep track of the last two
checkpoints. If the things are right next to each other then there's
not a lot of redundancy gained...
* Change CRC scheme to a true 64-bit CRC, not a pair of 32-bit CRCs
on alternate bytes. Polynomial borrowed from ECMA DLT1 standard.
* Fix XLOG record length handling so that it will work at BLCKSZ = 32k.
* Change XID allocation to work more like OID allocation. (This is of
dubious necessity, but I think it's a good idea anyway.)
* Fix a number of minor bugs, such as off-by-one logic for XLOG file
wraparound at the 4 gig mark.
* Add documentation and clean up some coding infelicities; move file
format declarations out to include files where planned contrib
utilities can get at them.
* Checkpoint will now occur every CHECKPOINT_SEGMENTS log segments or
every CHECKPOINT_TIMEOUT seconds, whichever comes first. It is also
possible to force a checkpoint by sending SIGUSR1 to the postmaster
(undocumented feature...)
* Defend against kill -9 postmaster by storing shmem block's key and ID
in postmaster.pid lockfile, and checking at startup to ensure that no
processes are still connected to old shmem block (if it still exists).
* Switch backends to accept SIGQUIT rather than SIGUSR1 for emergency
stop, for symmetry with postmaster and xlog utilities. Clean up signal
handling in bootstrap.c so that xlog utilities launched by postmaster
will react to signals better.
* Standalone bootstrap now grabs lockfile in target directory, as added
insurance against running it in parallel with live postmaster.
2001-03-12 20:17:06 -05:00
/*
 * pg_crc.h
 *
 * PostgreSQL CRC support
 *
 * See Ross Williams' excellent introduction
 * A PAINLESS GUIDE TO CRC ERROR DETECTION ALGORITHMS, available from
 * http://ross.net/crc/ or several other net sites.
 *
Switch to CRC-32C in WAL and other places.
The old algorithm was found to not be the usual CRC-32 algorithm, used by
Ethernet et al. We were using a non-reflected lookup table with code meant
for a reflected lookup table. That's a strange combination that AFAICS does
not correspond to any bit-wise CRC calculation, which makes it difficult to
reason about its properties. Although it has worked well in practice, seems
safer to use a well-known algorithm.
Since we're changing the algorithm anyway, we might as well choose a
different polynomial. The Castagnoli polynomial has better error-correcting
properties than the traditional CRC-32 polynomial, even if we had
implemented it correctly. Another reason for picking that is that some new
CPUs have hardware support for calculating CRC-32C, but not CRC-32, let
alone our strange variant of it. This patch doesn't add any support for such
hardware, but a future patch could now do that.
The old algorithm is kept around for tsquery and pg_trgm, which use the
values in indexes that need to remain compatible so that pg_upgrade works.
While we're at it, share the old lookup table for CRC-32 calculation
between hstore, ltree and core. They all use the same table, so might as
well.
2014-11-04 04:35:15 -05:00
 * We have three slightly different variants of a 32-bit CRC calculation:
 * CRC-32C (Castagnoli polynomial), CRC-32 (Ethernet polynomial), and a legacy
 * CRC-32 version that uses the lookup table in a funny way. They all consist
 * of four macros:
 *
 * INIT_<variant>(crc)
 *		Initialize a CRC accumulator
 *
 * COMP_<variant>(crc, data, len)
 *		Accumulate some (more) bytes into a CRC
 *
 * FIN_<variant>(crc)
 *		Finish a CRC calculation
 *
 * EQ_<variant>(c1, c2)
 *		Check for equality of two CRCs.
 *
 * The CRC-32C variant is in port/pg_crc32c.h.
 *
 * Portions Copyright (c) 1996-2025, PostgreSQL Global Development Group
 * Portions Copyright (c) 1994, Regents of the University of California
 *
 * src/include/utils/pg_crc.h
 */
#ifndef PG_CRC_H
#define PG_CRC_H

typedef uint32 pg_crc32;

/*
 * CRC-32, the same used e.g. in Ethernet.
 *
 * This is currently only used in ltree and hstore contrib modules. It uses
 * the same lookup table as the legacy algorithm below. New code should
 * use the Castagnoli version instead.
 */
#define INIT_TRADITIONAL_CRC32(crc) ((crc) = 0xFFFFFFFF)
#define FIN_TRADITIONAL_CRC32(crc) ((crc) ^= 0xFFFFFFFF)
#define COMP_TRADITIONAL_CRC32(crc, data, len) \
	COMP_CRC32_NORMAL_TABLE(crc, data, len, pg_crc32_table)
#define EQ_TRADITIONAL_CRC32(c1, c2) ((c1) == (c2))
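/*
 * Illustrative sketch, not part of the original header: how the
 * INIT / COMP / FIN pattern behaves when the input arrives in several
 * chunks. All demo_* names are hypothetical. The table is built locally
 * on the assumption that pg_crc32_table holds the customary entries for
 * the reflected Ethernet polynomial 0xEDB88320; this header does not show
 * the table contents, so treat that as an assumption.
 */

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for pg_crc32_table; filled by demo_build_table(). */
static uint32_t demo_table[256];

/* Build one entry per byte value for the polynomial 0xEDB88320 (assumed). */
static void
demo_build_table(void)
{
	for (uint32_t i = 0; i < 256; i++)
	{
		uint32_t	c = i;

		for (int k = 0; k < 8; k++)
			c = (c & 1) ? (0xEDB88320u ^ (c >> 1)) : (c >> 1);
		demo_table[i] = c;
	}
}

/* Same update step as COMP_CRC32_NORMAL_TABLE, written as a function. */
static uint32_t
demo_comp(uint32_t crc, const void *data, size_t len)
{
	const unsigned char *p = (const unsigned char *) data;

	assert(p != NULL || len == 0);
	while (len-- > 0)
		crc = demo_table[(crc ^ *p++) & 0xFF] ^ (crc >> 8);
	return crc;
}

/* INIT, then COMP over two chunks, then FIN. */
static uint32_t
demo_crc32_two_chunks(const char *a, size_t alen, const char *b, size_t blen)
{
	uint32_t	crc = 0xFFFFFFFFu;	/* INIT step */

	demo_build_table();
	crc = demo_comp(crc, a, alen);	/* COMP step, first chunk */
	crc = demo_comp(crc, b, blen);	/* COMP step, second chunk */
	return crc ^ 0xFFFFFFFFu;		/* FIN step */
}
```

/*
 * Under the stated assumption, feeding "1234" and then "56789" gives
 * 0xCBF43926, the standard CRC-32 check value for "123456789". Splitting
 * the input differently does not change the result, which is why the
 * accumulator can be carried across calls.
 */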

/* Sarwate's algorithm, for use with a "normal" lookup table */
#define COMP_CRC32_NORMAL_TABLE(crc, data, len, table) \
do { \
	const unsigned char *__data = (const unsigned char *) (data); \
	uint32		__len = (len); \
\
	while (__len-- > 0) \
	{ \
		int			__tab_index = ((int) (crc) ^ *__data++) & 0xFF; \
		(crc) = table[__tab_index] ^ ((crc) >> 8); \
	} \
} while (0)

/*
 * The CRC algorithm used for WAL et al in pre-9.5 versions.
 *
 * This closely resembles the normal CRC-32 algorithm, but is subtly
 * different. Using Williams' terms, we use the "normal" table, but with
 * "reflected" code. That's bogus, but it was like that for years before
 * anyone noticed. It does not correspond to any polynomial in a normal CRC
 * algorithm, so it's not clear what the error-detection properties of this
 * algorithm actually are.
 *
 * We still need to carry this around because it is used in a few on-disk
 * structures that need to be pg_upgradeable. It should not be used in new
 * code.
 */
#define INIT_LEGACY_CRC32(crc) ((crc) = 0xFFFFFFFF)
#define FIN_LEGACY_CRC32(crc) ((crc) ^= 0xFFFFFFFF)
#define COMP_LEGACY_CRC32(crc, data, len) \
	COMP_CRC32_REFLECTED_TABLE(crc, data, len, pg_crc32_table)
#define EQ_LEGACY_CRC32(c1, c2) ((c1) == (c2))

/*
 * Sarwate's algorithm, for use with a "reflected" lookup table (but in the
 * legacy algorithm, we actually use it on a "normal" table, see above)
 */
#define COMP_CRC32_REFLECTED_TABLE(crc, data, len, table) \
do { \
	const unsigned char *__data = (const unsigned char *) (data); \
	uint32		__len = (len); \
\
	while (__len-- > 0) \
	{ \
		int			__tab_index = ((int) ((crc) >> 24) ^ *__data++) & 0xFF; \
		(crc) = table[__tab_index] ^ ((crc) << 8); \
	} \
} while (0)
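/*
 * Illustrative sketch, not part of the original header: the two update
 * loops above, run over the same table, give different answers for the
 * same input. This is the mismatch the legacy comment describes. All
 * demo_* names are hypothetical, and the locally built table is assumed
 * to match pg_crc32_table (entries for the reflected polynomial
 * 0xEDB88320), which this header does not show.
 */

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for pg_crc32_table. */
static uint32_t demo_table[256];

static void
demo_build_table(void)
{
	for (uint32_t i = 0; i < 256; i++)
	{
		uint32_t	c = i;

		for (int k = 0; k < 8; k++)
			c = (c & 1) ? (0xEDB88320u ^ (c >> 1)) : (c >> 1);
		demo_table[i] = c;
	}
}

/* Loop body of COMP_CRC32_NORMAL_TABLE: index by low byte, shift right. */
static uint32_t
demo_traditional_crc32(const unsigned char *p, size_t len)
{
	uint32_t	crc = 0xFFFFFFFFu;

	assert(p != NULL || len == 0);
	demo_build_table();
	while (len-- > 0)
		crc = demo_table[(crc ^ *p++) & 0xFF] ^ (crc >> 8);
	return crc ^ 0xFFFFFFFFu;
}

/* Loop body of COMP_CRC32_REFLECTED_TABLE: index by high byte, shift left. */
static uint32_t
demo_legacy_crc32(const unsigned char *p, size_t len)
{
	uint32_t	crc = 0xFFFFFFFFu;

	assert(p != NULL || len == 0);
	demo_build_table();
	while (len-- > 0)
		crc = demo_table[((crc >> 24) ^ *p++) & 0xFF] ^ (crc << 8);
	return crc ^ 0xFFFFFFFFu;
}
```

/*
 * For a single zero byte, and under the table assumption above, the
 * traditional loop yields 0xD202EF8D (the usual CRC-32 of one zero byte)
 * while the legacy loop yields 0x2D02EF72. Values written by one variant
 * therefore cannot be checked with the other.
 */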

/*
 * Constant table for the CRC-32 polynomials. The same table is used by both
 * the traditional and legacy variants.
 */
extern PGDLLIMPORT const uint32 pg_crc32_table[256];
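/*
 * Illustrative sketch, not part of the original header: one way a
 * 256-entry table of this shape can be generated. The table contents are
 * defined elsewhere and are not shown here, so the polynomial 0xEDB88320
 * (the reflected form of the Ethernet polynomial) is an assumption, and
 * the demo_* names are hypothetical.
 */

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Fill a caller-supplied table with one precomputed value per byte. */
static void
demo_fill_crc32_table(uint32_t table[256])
{
	assert(table != NULL);
	for (uint32_t i = 0; i < 256; i++)
	{
		uint32_t	c = i;

		/* Process the byte one bit at a time, least significant bit first. */
		for (int k = 0; k < 8; k++)
			c = (c & 1) ? (0xEDB88320u ^ (c >> 1)) : (c >> 1);
		table[i] = c;
	}
}

/* Convenience accessor for this sketch: rebuilds the table on each call. */
static uint32_t
demo_table_entry(unsigned int idx)
{
	uint32_t	table[256];

	assert(idx < 256);
	demo_fill_crc32_table(table);
	return table[idx];
}
```

/*
 * With that polynomial, entry 0 is 0x00000000, entry 1 is 0x77073096 and
 * entry 255 is 0x2D02EF8D. Each entry is what the bitwise loop would
 * produce for one input byte, so the table-driven macros only need one
 * lookup per byte.
 */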

#endif							/* PG_CRC_H */