postgresql/src/include/catalog/catversion.h
Amit Kapila ce0fdbfe97 Allow multiple xacts during table sync in logical replication.
For the initial table data synchronization in logical replication, we use
a single transaction to copy the entire table and then synchronize the
position in the stream with the main apply worker.

There are multiple downsides of this approach: (a) We have to perform the
entire copy operation again if there is any error (network breakdown,
error in the database operation, etc.) while we synchronize the WAL
position between tablesync worker and apply worker; this will be onerous
especially for large copies, (b) Using a single transaction in the
synchronization-phase (where we can receive WAL from multiple
transactions) will have the risk of exceeding the CID limit, (c) The slot
will hold the WAL till the entire sync is complete because we never commit
till the end.

This patch solves all the above downsides by allowing multiple
transactions during the tablesync phase. The initial copy is done in a
single transaction and after that, we commit each transaction as we
receive it. To allow recovery after any error or crash, we use a permanent
slot and origin to track the progress. The slot and origin will be removed
once we finish the synchronization of the table. We also remove slot and
origin of tablesync workers if the user performs DROP SUBSCRIPTION .. or
ALTER SUBSCRIPTION .. REFRESH and some of the table syncs are still not
finished.

The commands ALTER SUBSCRIPTION ... REFRESH PUBLICATION and
ALTER SUBSCRIPTION ... SET PUBLICATION ... with refresh option as true
cannot be executed inside a transaction block because they can now drop
the slots for which we have no provision to rollback.

This will also open up the path for logical replication of 2PC
transactions on the subscriber side. Previously, we couldn't do that
because of the requirement of maintaining a single transaction in
tablesync workers.

Bump catalog version due to change of state in the catalog
(pg_subscription_rel).

Author: Peter Smith, Amit Kapila, and Takamichi Osumi
Reviewed-by: Ajin Cherian, Petr Jelinek, Hou Zhijie and Amit Kapila
Discussion: https://postgr.es/m/CAA4eK1KHJxaZS-fod-0fey=0tq3=Gkn4ho=8N4-5HWiCfu0H1A@mail.gmail.com
2021-02-12 07:41:51 +05:30

/*-------------------------------------------------------------------------
 *
 * catversion.h
 *    "Catalog version number" for PostgreSQL.
 *
 * The catalog version number is used to flag incompatible changes in
 * the PostgreSQL system catalogs. Whenever anyone changes the format of
 * a system catalog relation, or adds, deletes, or modifies standard
 * catalog entries in such a way that an updated backend wouldn't work
 * with an old database (or vice versa), the catalog version number
 * should be changed. The version number stored in pg_control by initdb
 * is checked against the version number compiled into the backend at
 * startup time, so that a backend can refuse to run in an incompatible
 * database.
 *
 * The point of this feature is to provide a finer grain of compatibility
 * checking than is possible from looking at the major version number
 * stored in PG_VERSION. It shouldn't matter to end users, but during
 * development cycles we usually make quite a few incompatible changes
 * to the contents of the system catalogs, and we don't want to bump the
 * major version number for each one. What we can do instead is bump
 * this internal version number. This should save some grief for
 * developers who might otherwise waste time tracking down "bugs" that
 * are really just code-vs-database incompatibilities.
 *
 * The rule for developers is: if you commit a change that requires
 * an initdb, you should update the catalog version number (as well as
 * notifying the pgsql-hackers mailing list, which has been the
 * informal practice for a long time).
 *
 * The catalog version number is placed here since modifying files in
 * include/catalog is the most common kind of initdb-forcing change.
 * But it could be used to protect any kind of incompatible change in
 * database contents or layout, such as altering tuple headers.
 *
 *
 * Portions Copyright (c) 1996-2021, PostgreSQL Global Development Group
 * Portions Copyright (c) 1994, Regents of the University of California
 *
 * src/include/catalog/catversion.h
 *
 *-------------------------------------------------------------------------
 */
#ifndef CATVERSION_H
#define CATVERSION_H

/*
 * We could use anything we wanted for version numbers, but I recommend
 * following the "YYYYMMDDN" style often used for DNS zone serial numbers.
 * YYYYMMDD are the date of the change, and N is the number of the change
 * on that day. (Hopefully we'll never commit ten independent sets of
 * catalog changes on the same day...)
 */

/*                          yyyymmddN */
#define CATALOG_VERSION_NO  202102121

#endif
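
The header comments describe two things that a short sketch may make more concrete: the "YYYYMMDDN" layout of the number, and the startup check that compares the value stored in pg_control against the value compiled into the backend. The code below is an illustration only, not PostgreSQL source. The names CatVersionParts, catversion_split and catversion_compatible are hypothetical, and the check is assumed to be a plain equality test, which is the simplest reading of the comment above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Value copied from the header above. */
#define CATALOG_VERSION_NO 202102121

/* Hypothetical helper type; not part of PostgreSQL. */
typedef struct CatVersionParts
{
	int year;   /* YYYY */
	int month;  /* MM */
	int day;    /* DD */
	int seq;    /* N: number of the change on that day */
} CatVersionParts;

/* Split a YYYYMMDDN value into its fields with decimal arithmetic. */
void
catversion_split(uint32_t version, CatVersionParts *out)
{
	assert(out != NULL);

	out->seq = (int) (version % 10);    /* last digit is N */
	version /= 10;
	out->day = (int) (version % 100);   /* next two digits are DD */
	version /= 100;
	out->month = (int) (version % 100); /* next two digits are MM */
	version /= 100;
	out->year = (int) version;          /* what remains is YYYY */
}

/*
 * Hypothetical compatibility check: the stored value (assumed to have been
 * read from pg_control by the caller) must equal the compiled-in value.
 */
bool
catversion_compatible(uint32_t stored_version)
{
	return stored_version == CATALOG_VERSION_NO;
}
```

With the value in this revision, catversion_split yields year 2021, month 2, day 12 and sequence number 1, which matches the commit date of 2021-02-12. A cluster whose stored number differs in any digit would be rejected by catversion_compatible, which is the "refuse to run in an incompatible database" behaviour the header describes.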