Introduce the REPACK command

REPACK absorbs the functionality of VACUUM FULL and CLUSTER in a single
command.  Because this functionality is completely different from
regular VACUUM, having it separate from VACUUM makes it easier for users
to understand; as for CLUSTER, the term is heavily overloaded in the
IT world and even in Postgres itself, so it's good that we can avoid it.

We retain those older commands, but de-emphasize them in the
documentation, in favor of REPACK; the difference between VACUUM FULL
and CLUSTER (namely, the fact that tuples are written in a specific
ordering) is neatly absorbed as two different modes of REPACK.

This allows us to introduce further functionality in the future that
works regardless of whether an ordering is being applied, such as (and
especially) a concurrent mode.

Author: Antonin Houska <ah@cybertec.at>
Reviewed-by: Mihail Nikalayeu <mihailnikalayeu@gmail.com>
Reviewed-by: Álvaro Herrera <alvherre@kurilemu.de>
Reviewed-by: Robert Treat <rob@xzilla.net>
Reviewed-by: Euler Taveira <euler@eulerto.com>
Reviewed-by: Matheus Alcantara <matheusssilv97@gmail.com>
Reviewed-by: Junwang Zhao <zhjwpku@gmail.com>
Reviewed-by: jian he <jian.universality@gmail.com>
Discussion: https://postgr.es/m/82651.1720540558@antos
Discussion: https://postgr.es/m/202507262156.sb455angijk6@alvherre.pgsql
Álvaro Herrera 2026-03-10 19:56:39 +01:00
parent a596d27d80
commit ac58465e06
26 changed files with 1668 additions and 560 deletions

@ -413,6 +413,14 @@ postgres 27093 0.0 0.0 30096 2752 ? Ss 11:34 0:00 postgres: ser
</entry>
</row>
<row>
<entry><structname>pg_stat_progress_repack</structname><indexterm><primary>pg_stat_progress_repack</primary></indexterm></entry>
<entry>One row for each backend running
<command>REPACK</command>, showing current progress. See
<xref linkend="repack-progress-reporting"/>.
</entry>
</row>
<row>
<entry><structname>pg_stat_progress_basebackup</structname><indexterm><primary>pg_stat_progress_basebackup</primary></indexterm></entry>
<entry>One row for each WAL sender process streaming a base backup,
@ -5796,9 +5804,9 @@ FROM pg_stat_get_backend_idset() AS backendid;
<productname>PostgreSQL</productname> has the ability to report the progress of
certain commands during command execution. Currently, the only commands
which support progress reporting are <command>ANALYZE</command>,
<command>CLUSTER</command>,
<command>CREATE INDEX</command>, <command>VACUUM</command>,
<command>COPY</command>,
<command>COPY</command>, <command>CREATE INDEX</command>,
<command>REPACK</command> (and its obsolete spelling <command>CLUSTER</command>),
<command>VACUUM</command>,
and <xref linkend="protocol-replication-base-backup"/> (i.e., replication
command that <xref linkend="app-pgbasebackup"/> issues to take
a base backup).
@ -6731,6 +6739,218 @@ FROM pg_stat_get_backend_idset() AS backendid;
</sect2>
<sect2 id="repack-progress-reporting">
<title>REPACK Progress Reporting</title>
<indexterm>
<primary>pg_stat_progress_repack</primary>
</indexterm>
<para>
Whenever <command>REPACK</command> is running,
the <structname>pg_stat_progress_repack</structname> view will contain a
row for each backend that is currently running the command. The tables
below describe the information that will be reported and provide
information about how to interpret it.
</para>
<table id="pg-stat-progress-repack-view" xreflabel="pg_stat_progress_repack">
<title><structname>pg_stat_progress_repack</structname> View</title>
<tgroup cols="1">
<thead>
<row>
<entry role="catalog_table_entry"><para role="column_definition">
Column Type
</para>
<para>
Description
</para></entry>
</row>
</thead>
<tbody>
<row>
<entry role="catalog_table_entry"><para role="column_definition">
<structfield>pid</structfield> <type>integer</type>
</para>
<para>
Process ID of backend.
</para></entry>
</row>
<row>
<entry role="catalog_table_entry"><para role="column_definition">
<structfield>datid</structfield> <type>oid</type>
</para>
<para>
OID of the database to which this backend is connected.
</para></entry>
</row>
<row>
<entry role="catalog_table_entry"><para role="column_definition">
<structfield>datname</structfield> <type>name</type>
</para>
<para>
Name of the database to which this backend is connected.
</para></entry>
</row>
<row>
<entry role="catalog_table_entry"><para role="column_definition">
<structfield>relid</structfield> <type>oid</type>
</para>
<para>
OID of the table being repacked.
</para></entry>
</row>
<row>
<entry role="catalog_table_entry"><para role="column_definition">
<structfield>phase</structfield> <type>text</type>
</para>
<para>
Current processing phase. See <xref linkend="repack-phases"/>.
</para></entry>
</row>
<row>
<entry role="catalog_table_entry"><para role="column_definition">
<structfield>repack_index_relid</structfield> <type>oid</type>
</para>
<para>
If the table is being scanned using an index, this is the OID of the
index being used; otherwise, it is zero.
</para></entry>
</row>
<row>
<entry role="catalog_table_entry"><para role="column_definition">
<structfield>heap_tuples_scanned</structfield> <type>bigint</type>
</para>
<para>
Number of heap tuples scanned.
This counter only advances when the phase is
<literal>seq scanning heap</literal>,
<literal>index scanning heap</literal>
or <literal>writing new heap</literal>.
</para></entry>
</row>
<row>
<entry role="catalog_table_entry"><para role="column_definition">
<structfield>heap_tuples_written</structfield> <type>bigint</type>
</para>
<para>
Number of heap tuples written.
This counter only advances when the phase is
<literal>seq scanning heap</literal>,
<literal>index scanning heap</literal>
or <literal>writing new heap</literal>.
</para></entry>
</row>
<row>
<entry role="catalog_table_entry"><para role="column_definition">
<structfield>heap_blks_total</structfield> <type>bigint</type>
</para>
<para>
Total number of heap blocks in the table. This number is reported
as of the beginning of <literal>seq scanning heap</literal>.
</para></entry>
</row>
<row>
<entry role="catalog_table_entry"><para role="column_definition">
<structfield>heap_blks_scanned</structfield> <type>bigint</type>
</para>
<para>
Number of heap blocks scanned. This counter only advances when the
phase is <literal>seq scanning heap</literal>.
</para></entry>
</row>
<row>
<entry role="catalog_table_entry"><para role="column_definition">
<structfield>index_rebuild_count</structfield> <type>bigint</type>
</para>
<para>
Number of indexes rebuilt. This counter only advances when the phase
is <literal>rebuilding index</literal>.
</para></entry>
</row>
</tbody>
</tgroup>
</table>
<table id="repack-phases">
<title>REPACK Phases</title>
<tgroup cols="2">
<colspec colname="col1" colwidth="1*"/>
<colspec colname="col2" colwidth="2*"/>
<thead>
<row>
<entry>Phase</entry>
<entry>Description</entry>
</row>
</thead>
<tbody>
<row>
<entry><literal>initializing</literal></entry>
<entry>
The command is preparing to begin scanning the heap. This phase is
expected to be very brief.
</entry>
</row>
<row>
<entry><literal>seq scanning heap</literal></entry>
<entry>
The command is currently scanning the table using a sequential scan.
</entry>
</row>
<row>
<entry><literal>index scanning heap</literal></entry>
<entry>
<command>REPACK</command> is currently scanning the table using an index scan.
</entry>
</row>
<row>
<entry><literal>sorting tuples</literal></entry>
<entry>
<command>REPACK</command> is currently sorting tuples.
</entry>
</row>
<row>
<entry><literal>writing new heap</literal></entry>
<entry>
<command>REPACK</command> is currently writing the new heap.
</entry>
</row>
<row>
<entry><literal>swapping relation files</literal></entry>
<entry>
The command is currently swapping newly-built files into place.
</entry>
</row>
<row>
<entry><literal>rebuilding index</literal></entry>
<entry>
The command is currently rebuilding an index.
</entry>
</row>
<row>
<entry><literal>performing final cleanup</literal></entry>
<entry>
The command is performing final cleanup. When this phase is
completed, <command>REPACK</command> will end.
</entry>
</row>
</tbody>
</tgroup>
</table>
</sect2>
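As a reading aid for the two block counters described above (not part of the commit), here is a minimal C sketch of how a monitoring client might turn <structfield>heap_blks_scanned</structfield> and <structfield>heap_blks_total</structfield> into a percentage. The function name `repack_scan_percent` and the -1.0 "unknown" convention are assumptions made for this illustration. Since both counters are documented as tied to the `seq scanning heap` phase, the result is probably only meaningful during that phase.

```c
#include <stdint.h>

/*
 * Hypothetical client-side helper: percentage of heap blocks scanned,
 * derived from two columns of pg_stat_progress_repack.
 *
 * Returns -1.0 when no meaningful percentage can be computed, for example
 * before heap_blks_total has been reported (it is still zero then).
 */
double
repack_scan_percent(int64_t heap_blks_scanned, int64_t heap_blks_total)
{
    if (heap_blks_total <= 0 || heap_blks_scanned < 0)
        return -1.0;

    /* Clamp defensively so the result never exceeds 100 percent. */
    if (heap_blks_scanned > heap_blks_total)
        heap_blks_scanned = heap_blks_total;

    return 100.0 * (double) heap_blks_scanned / (double) heap_blks_total;
}
```

A caller would read both columns from one row of the view and pass them in unchanged; a negative return value can be shown as "n/a".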
<sect2 id="vacuum-progress-reporting">
<title>VACUUM Progress Reporting</title>

@ -167,6 +167,7 @@ Complete list of usable sgml source files in this directory.
<!ENTITY refreshMaterializedView SYSTEM "refresh_materialized_view.sgml">
<!ENTITY reindex SYSTEM "reindex.sgml">
<!ENTITY releaseSavepoint SYSTEM "release_savepoint.sgml">
<!ENTITY repack SYSTEM "repack.sgml">
<!ENTITY reset SYSTEM "reset.sgml">
<!ENTITY revoke SYSTEM "revoke.sgml">
<!ENTITY rollback SYSTEM "rollback.sgml">

@ -33,50 +33,9 @@ CLUSTER [ ( <replaceable class="parameter">option</replaceable> [, ...] ) ] [ <r
<title>Description</title>
<para>
<command>CLUSTER</command> instructs <productname>PostgreSQL</productname>
to cluster the table specified
by <replaceable class="parameter">table_name</replaceable>
based on the index specified by
<replaceable class="parameter">index_name</replaceable>. The index must
already have been defined on
<replaceable class="parameter">table_name</replaceable>.
</para>
<para>
When a table is clustered, it is physically reordered
based on the index information. Clustering is a one-time operation:
when the table is subsequently updated, the changes are
not clustered. That is, no attempt is made to store new or
updated rows according to their index order. (If one wishes, one can
periodically recluster by issuing the command again. Also, setting
the table's <literal>fillfactor</literal> storage parameter to less than
100% can aid in preserving cluster ordering during updates, since updated
rows are kept on the same page if enough space is available there.)
</para>
<para>
When a table is clustered, <productname>PostgreSQL</productname>
remembers which index it was clustered by. The form
<command>CLUSTER <replaceable class="parameter">table_name</replaceable></command>
reclusters the table using the same index as before. You can also
use the <literal>CLUSTER</literal> or <literal>SET WITHOUT CLUSTER</literal>
forms of <link linkend="sql-altertable"><command>ALTER TABLE</command></link> to set the index to be used for
future cluster operations, or to clear any previous setting.
</para>
<para>
<command>CLUSTER</command> without a
<replaceable class="parameter">table_name</replaceable> reclusters all the
previously-clustered tables in the current database that the calling user
has privileges for. This form of <command>CLUSTER</command> cannot be
executed inside a transaction block.
</para>
<para>
When a table is being clustered, an <literal>ACCESS
EXCLUSIVE</literal> lock is acquired on it. This prevents any other
database operations (both reads and writes) from operating on the
table until the <command>CLUSTER</command> is finished.
The <command>CLUSTER</command> command is equivalent to
<xref linkend="sql-repack"/> with a <literal>USING INDEX</literal>
clause. See there for more details.
</para>
</refsect1>
@ -136,63 +95,12 @@ CLUSTER [ ( <replaceable class="parameter">option</replaceable> [, ...] ) ] [ <r
on the table.
</para>
<para>
In cases where you are accessing single rows randomly
within a table, the actual order of the data in the
table is unimportant. However, if you tend to access some
data more than others, and there is an index that groups
them together, you will benefit from using <command>CLUSTER</command>.
If you are requesting a range of indexed values from a table, or a
single indexed value that has multiple rows that match,
<command>CLUSTER</command> will help because once the index identifies the
table page for the first row that matches, all other rows
that match are probably already on the same table page,
and so you save disk accesses and speed up the query.
</para>
<para>
<command>CLUSTER</command> can re-sort the table using either an index scan
on the specified index, or (if the index is a b-tree) a sequential
scan followed by sorting. It will attempt to choose the method that
will be faster, based on planner cost parameters and available statistical
information.
</para>
<para>
While <command>CLUSTER</command> is running, the <xref
linkend="guc-search-path"/> is temporarily changed to <literal>pg_catalog,
pg_temp</literal>.
</para>
<para>
When an index scan is used, a temporary copy of the table is created that
contains the table data in the index order. Temporary copies of each
index on the table are created as well. Therefore, you need free space on
disk at least equal to the sum of the table size and the index sizes.
</para>
<para>
When a sequential scan and sort is used, a temporary sort file is
also created, so that the peak temporary space requirement is as much
as double the table size, plus the index sizes. This method is often
faster than the index scan method, but if the disk space requirement is
intolerable, you can disable this choice by temporarily setting <xref
linkend="guc-enable-sort"/> to <literal>off</literal>.
</para>
<para>
It is advisable to set <xref linkend="guc-maintenance-work-mem"/> to
a reasonably large value (but not more than the amount of RAM you can
dedicate to the <command>CLUSTER</command> operation) before clustering.
</para>
<para>
Because the planner records statistics about the ordering of
tables, it is advisable to run <link linkend="sql-analyze"><command>ANALYZE</command></link>
on the newly clustered table.
Otherwise, the planner might make poor choices of query plans.
</para>
<para>
Because <command>CLUSTER</command> remembers which indexes are clustered,
one can cluster the tables one wants clustered manually the first time,
@ -270,6 +178,7 @@ CLUSTER <replaceable class="parameter">index_name</replaceable> ON <replaceable
<title>See Also</title>
<simplelist type="inline">
<member><xref linkend="sql-repack"/></member>
<member><xref linkend="app-clusterdb"/></member>
<member><xref linkend="cluster-progress-reporting"/></member>
</simplelist>

@ -0,0 +1,330 @@
<!--
doc/src/sgml/ref/repack.sgml
PostgreSQL documentation
-->
<refentry id="sql-repack">
<indexterm zone="sql-repack">
<primary>REPACK</primary>
</indexterm>
<refmeta>
<refentrytitle>REPACK</refentrytitle>
<manvolnum>7</manvolnum>
<refmiscinfo>SQL - Language Statements</refmiscinfo>
</refmeta>
<refnamediv>
<refname>REPACK</refname>
<refpurpose>rewrite a table to reclaim disk space</refpurpose>
</refnamediv>
<refsynopsisdiv>
<synopsis>
REPACK [ ( <replaceable class="parameter">option</replaceable> [, ...] ) ] [ <replaceable class="parameter">table_and_columns</replaceable> [ USING INDEX [ <replaceable class="parameter">index_name</replaceable> ] ] ]
REPACK [ ( <replaceable class="parameter">option</replaceable> [, ...] ) ] USING INDEX
<phrase>where <replaceable class="parameter">option</replaceable> can be one of:</phrase>
VERBOSE [ <replaceable class="parameter">boolean</replaceable> ]
ANALYZE [ <replaceable class="parameter">boolean</replaceable> ]
<phrase>and <replaceable class="parameter">table_and_columns</replaceable> is:</phrase>
<replaceable class="parameter">table_name</replaceable> [ ( <replaceable class="parameter">column_name</replaceable> [, ...] ) ]
</synopsis>
</refsynopsisdiv>
<refsect1>
<title>Description</title>
<para>
<command>REPACK</command> reclaims storage occupied by dead
tuples. Unlike <command>VACUUM</command>, it does so by rewriting the
entire contents of the table specified
by <replaceable class="parameter">table_name</replaceable> into a new disk
file with no extra space (except for the space guaranteed by
the <literal>fillfactor</literal> storage parameter), allowing unused space
to be returned to the operating system.
</para>
<para>
Without
a <replaceable class="parameter">table_name</replaceable>, <command>REPACK</command>
processes every table and materialized view in the current database that
the current user has the <literal>MAINTAIN</literal> privilege on. This
form of <command>REPACK</command> cannot be executed inside a transaction
block.
</para>
<para>
If a <literal>USING INDEX</literal> clause is specified, the rows are
physically reordered based on information from an index. Please see the
notes on clustering below.
</para>
<para>
When a table is being repacked, an <literal>ACCESS EXCLUSIVE</literal> lock
is acquired on it. This prevents any other database operations (both reads
and writes) from operating on the table until the <command>REPACK</command>
is finished.
</para>
<refsect2 id="sql-repack-notes-on-clustering" xreflabel="Notes on Clustering">
<title>Notes on Clustering</title>
<para>
If the <literal>USING INDEX</literal> clause is specified, the rows in
the table are stored in the order that the index specifies. This is
called <firstterm>clustering</firstterm>, because the rows are physically
clustered afterwards.
If an index name is specified in the command, the order implied by that
index is used, and that index is configured as the index to cluster on.
(This also applies to an index given to the <command>CLUSTER</command>
command.)
If no index name is specified, then the index that has
been configured as the index to cluster on is used; an
error is thrown if none has.
An index can be set manually using <command>ALTER TABLE ... CLUSTER ON</command>,
and reset with <command>ALTER TABLE ... SET WITHOUT CLUSTER</command>.
</para>
<para>
If no table name is specified in <command>REPACK USING INDEX</command>,
all tables which have a clustering index defined and which the calling
user has privileges for are processed.
</para>
<para>
Clustering is a one-time operation: when the table is
subsequently updated, the changes are not clustered. That is, no attempt
is made to store new or updated rows according to their index order. (If
one wishes, one can periodically recluster by issuing the command again.
Also, setting the table's <literal>fillfactor</literal> storage parameter
to less than 100% can aid in preserving cluster ordering during updates,
since updated rows are kept on the same page if enough space is available
there.)
</para>
<para>
In cases where you are accessing single rows randomly within a table, the
actual order of the data in the table is unimportant. However, if you tend
to access some data more than others, and there is an index that groups
them together, you will benefit from using clustering. If
you are requesting a range of indexed values from a table, or a single
indexed value that has multiple rows that match,
clustering will help because once the index identifies the
table page for the first row that matches, all other rows that match are
probably already on the same table page, and so you save disk accesses and
speed up the query.
</para>
<para>
<command>REPACK</command> can re-sort the table using either an index scan
on the specified index, or (if the index is a b-tree) a sequential scan
followed by sorting. It will attempt to choose the method that will be
faster, based on planner cost parameters and available statistical
information.
</para>
<para>
Because the planner records statistics about the ordering of tables, it is
advisable to
run <link linkend="sql-analyze"><command>ANALYZE</command></link> on the
newly repacked table. Otherwise, the planner might make poor choices of
query plans.
</para>
</refsect2>
<refsect2 id="sql-repack-notes-on-resources" xreflabel="Notes on Resources">
<title>Notes on Resources</title>
<para>
When an index scan or a sequential scan without sort is used, a temporary
copy of the table is created that contains the rewritten table data (in
index order, if an index scan is used). Temporary copies of each index on
the table are created as well.
Therefore, you need free space on disk at least equal to the sum of the
table size and the index sizes.
</para>
<para>
When a sequential scan and sort is used, a temporary sort file is also
created, so that the peak temporary space requirement is as much as double
the table size, plus the index sizes. This method is often faster than
the index scan method, but if the disk space requirement is intolerable,
you can disable this choice by temporarily setting
<xref linkend="guc-enable-sort"/> to <literal>off</literal>.
</para>
<para>
It is advisable to set <xref linkend="guc-maintenance-work-mem"/> to a
reasonably large value (but not more than the amount of RAM you can
dedicate to the <command>REPACK</command> operation) before repacking.
</para>
</refsect2>
</refsect1>
<refsect1>
<title>Parameters</title>
<variablelist>
<varlistentry>
<term><replaceable class="parameter">table_name</replaceable></term>
<listitem>
<para>
The name (possibly schema-qualified) of a table.
</para>
</listitem>
</varlistentry>
<varlistentry>
<term><replaceable class="parameter">column_name</replaceable></term>
<listitem>
<para>
The name of a specific column to analyze. Defaults to all columns.
If a column list is specified, <literal>ANALYZE</literal> must also
be specified.
</para>
</listitem>
</varlistentry>
<varlistentry>
<term><replaceable class="parameter">index_name</replaceable></term>
<listitem>
<para>
The name of an index.
</para>
</listitem>
</varlistentry>
<varlistentry>
<term><literal>VERBOSE</literal></term>
<listitem>
<para>
Prints a progress report at <literal>INFO</literal> level as each
table is repacked.
</para>
</listitem>
</varlistentry>
<varlistentry>
<term><literal>ANALYZE</literal></term>
<term><literal>ANALYSE</literal></term>
<listitem>
<para>
Applies <xref linkend="sql-analyze"/> on the table after repacking. This is
currently only supported when a single (non-partitioned) table is specified.
</para>
</listitem>
</varlistentry>
<varlistentry>
<term><replaceable class="parameter">boolean</replaceable></term>
<listitem>
<para>
Specifies whether the selected option should be turned on or off.
You can write <literal>TRUE</literal>, <literal>ON</literal>, or
<literal>1</literal> to enable the option, and <literal>FALSE</literal>,
<literal>OFF</literal>, or <literal>0</literal> to disable it. The
<replaceable class="parameter">boolean</replaceable> value can also
be omitted, in which case <literal>TRUE</literal> is assumed.
</para>
</listitem>
</varlistentry>
</variablelist>
</refsect1>
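To make the <replaceable class="parameter">boolean</replaceable> rule above concrete (this sketch is not part of the commit), the following C function accepts the spellings listed in the parameter description and treats an omitted value as <literal>TRUE</literal>. The names `parse_repack_boolean` and `equals_ignore_case` are hypothetical, and case-insensitive matching is an assumption; the server's own option parser may accept more spellings than this sketch does.

```c
#include <ctype.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Compare two strings, ignoring ASCII case. */
static bool
equals_ignore_case(const char *a, const char *b)
{
    while (*a != '\0' && *b != '\0')
    {
        if (tolower((unsigned char) *a) != tolower((unsigned char) *b))
            return false;
        a++;
        b++;
    }
    return *a == '\0' && *b == '\0';
}

/*
 * Hypothetical helper mirroring the documented rule for option values:
 * TRUE, ON or 1 enable the option; FALSE, OFF or 0 disable it; an omitted
 * value (NULL here) means TRUE.
 *
 * Returns true and stores the value in *result on success; returns false
 * for any other spelling and leaves *result untouched.
 */
bool
parse_repack_boolean(const char *text, bool *result)
{
    if (text == NULL)
    {
        *result = true;         /* value omitted: TRUE is assumed */
        return true;
    }
    if (equals_ignore_case(text, "true") ||
        equals_ignore_case(text, "on") ||
        strcmp(text, "1") == 0)
    {
        *result = true;
        return true;
    }
    if (equals_ignore_case(text, "false") ||
        equals_ignore_case(text, "off") ||
        strcmp(text, "0") == 0)
    {
        *result = false;
        return true;
    }
    return false;
}
```

Under this rule, <literal>REPACK (VERBOSE)</literal> and <literal>REPACK (VERBOSE ON)</literal> would request the same behavior, while <literal>REPACK (VERBOSE 0)</literal> would switch the option off.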
<refsect1>
<title>Notes</title>
<para>
To repack a table, one must have the <literal>MAINTAIN</literal> privilege
on the table.
</para>
<para>
While <command>REPACK</command> is running, the <xref
linkend="guc-search-path"/> is temporarily changed to <literal>pg_catalog,
pg_temp</literal>.
</para>
<para>
Each backend running <command>REPACK</command> will report its progress
in the <structname>pg_stat_progress_repack</structname> view. See
<xref linkend="repack-progress-reporting"/> for details.
</para>
<para>
Repacking a partitioned table repacks each of its partitions. If an index
is specified, each partition is repacked using the partition of that
index. <command>REPACK</command> on a partitioned table cannot be executed
inside a transaction block.
</para>
</refsect1>
<refsect1>
<title>Examples</title>
<para>
Repack the table <literal>employees</literal>:
<programlisting>
REPACK employees;
</programlisting>
</para>
<para>
Repack the table <literal>employees</literal> using its
index <literal>employees_ind</literal> (since an index is used here, this
is effectively clustering):
<programlisting>
REPACK employees USING INDEX employees_ind;
</programlisting>
</para>
<para>
Repack the table <literal>cases</literal> in its existing physical order,
running an <command>ANALYZE</command> on the given columns once
repacking is done, showing informational messages:
<programlisting>
REPACK (ANALYZE, VERBOSE) cases (district, case_nr);
</programlisting>
</para>
<para>
Repack all tables in the database on which you have
the <literal>MAINTAIN</literal> privilege:
<programlisting>
REPACK;
</programlisting>
</para>
<para>
Repack all tables for which a clustering index has previously been
configured and on which you have the <literal>MAINTAIN</literal> privilege,
showing informational messages:
<programlisting>
REPACK (VERBOSE) USING INDEX;
</programlisting>
</para>
</refsect1>
<refsect1>
<title>Compatibility</title>
<para>
There is no <command>REPACK</command> statement in the SQL standard.
</para>
</refsect1>
<refsect1>
<title>See Also</title>
<simplelist type="inline">
<member><xref linkend="repack-progress-reporting"/></member>
</simplelist>
</refsect1>
</refentry>

@ -25,7 +25,6 @@ VACUUM [ ( <replaceable class="parameter">option</replaceable> [, ...] ) ] [ <re
<phrase>where <replaceable class="parameter">option</replaceable> can be one of:</phrase>
FULL [ <replaceable class="parameter">boolean</replaceable> ]
FREEZE [ <replaceable class="parameter">boolean</replaceable> ]
VERBOSE [ <replaceable class="parameter">boolean</replaceable> ]
ANALYZE [ <replaceable class="parameter">boolean</replaceable> ]
@ -39,6 +38,7 @@ VACUUM [ ( <replaceable class="parameter">option</replaceable> [, ...] ) ] [ <re
SKIP_DATABASE_STATS [ <replaceable class="parameter">boolean</replaceable> ]
ONLY_DATABASE_STATS [ <replaceable class="parameter">boolean</replaceable> ]
BUFFER_USAGE_LIMIT <replaceable class="parameter">size</replaceable>
FULL [ <replaceable class="parameter">boolean</replaceable> ]
<phrase>and <replaceable class="parameter">table_and_columns</replaceable> is:</phrase>
@ -95,20 +95,6 @@ VACUUM [ ( <replaceable class="parameter">option</replaceable> [, ...] ) ] [ <re
<title>Parameters</title>
<variablelist>
<varlistentry>
<term><literal>FULL</literal></term>
<listitem>
<para>
Selects <quote>full</quote> vacuum, which can reclaim more
space, but takes much longer and exclusively locks the table.
This method also requires extra disk space, since it writes a
new copy of the table and doesn't release the old copy until
the operation is complete. Usually this should only be used when a
significant amount of space needs to be reclaimed from within the table.
</para>
</listitem>
</varlistentry>
<varlistentry>
<term><literal>FREEZE</literal></term>
<listitem>
@ -362,6 +348,23 @@ VACUUM [ ( <replaceable class="parameter">option</replaceable> [, ...] ) ] [ <re
</listitem>
</varlistentry>
<varlistentry>
<term><literal>FULL</literal></term>
<listitem>
<para>
This option, which is deprecated, makes <command>VACUUM</command>
behave like <command>REPACK</command> without a
<literal>USING INDEX</literal> clause.
This method of compacting the table takes much longer than
<command>VACUUM</command> and exclusively locks the table.
This method also requires extra disk space, since it writes a
new copy of the table and doesn't release the old copy until
the operation is complete. Usually this should only be used when a
significant amount of space needs to be reclaimed from within the table.
</para>
</listitem>
</varlistentry>
<varlistentry>
<term><replaceable class="parameter">boolean</replaceable></term>
<listitem>

@ -195,6 +195,7 @@
&refreshMaterializedView;
&reindex;
&releaseSavepoint;
&repack;
&reset;
&revoke;
&rollback;

@ -741,13 +741,13 @@ heapam_relation_copy_for_cluster(Relation OldHeap, Relation NewHeap,
if (OldIndex != NULL && !use_sort)
{
const int ci_index[] = {
PROGRESS_CLUSTER_PHASE,
PROGRESS_CLUSTER_INDEX_RELID
PROGRESS_REPACK_PHASE,
PROGRESS_REPACK_INDEX_RELID
};
int64 ci_val[2];
/* Set phase and OIDOldIndex to columns */
ci_val[0] = PROGRESS_CLUSTER_PHASE_INDEX_SCAN_HEAP;
ci_val[0] = PROGRESS_REPACK_PHASE_INDEX_SCAN_HEAP;
ci_val[1] = RelationGetRelid(OldIndex);
pgstat_progress_update_multi_param(2, ci_index, ci_val);
@ -759,15 +759,15 @@ heapam_relation_copy_for_cluster(Relation OldHeap, Relation NewHeap,
else
{
/* In scan-and-sort mode and also VACUUM FULL, set phase */
pgstat_progress_update_param(PROGRESS_CLUSTER_PHASE,
PROGRESS_CLUSTER_PHASE_SEQ_SCAN_HEAP);
pgstat_progress_update_param(PROGRESS_REPACK_PHASE,
PROGRESS_REPACK_PHASE_SEQ_SCAN_HEAP);
tableScan = table_beginscan(OldHeap, SnapshotAny, 0, (ScanKey) NULL);
heapScan = (HeapScanDesc) tableScan;
indexScan = NULL;
/* Set total heap blocks */
pgstat_progress_update_param(PROGRESS_CLUSTER_TOTAL_HEAP_BLKS,
pgstat_progress_update_param(PROGRESS_REPACK_TOTAL_HEAP_BLKS,
heapScan->rs_nblocks);
}
@ -809,7 +809,7 @@ heapam_relation_copy_for_cluster(Relation OldHeap, Relation NewHeap,
* is manually updated to the correct value when the table
* scan finishes.
*/
pgstat_progress_update_param(PROGRESS_CLUSTER_HEAP_BLKS_SCANNED,
pgstat_progress_update_param(PROGRESS_REPACK_HEAP_BLKS_SCANNED,
heapScan->rs_nblocks);
break;
}
@ -825,7 +825,7 @@ heapam_relation_copy_for_cluster(Relation OldHeap, Relation NewHeap,
*/
if (prev_cblock != heapScan->rs_cblock)
{
pgstat_progress_update_param(PROGRESS_CLUSTER_HEAP_BLKS_SCANNED,
pgstat_progress_update_param(PROGRESS_REPACK_HEAP_BLKS_SCANNED,
(heapScan->rs_cblock +
heapScan->rs_nblocks -
heapScan->rs_startblock
@ -926,14 +926,14 @@ heapam_relation_copy_for_cluster(Relation OldHeap, Relation NewHeap,
* In scan-and-sort mode, report increase in number of tuples
* scanned
*/
pgstat_progress_update_param(PROGRESS_CLUSTER_HEAP_TUPLES_SCANNED,
pgstat_progress_update_param(PROGRESS_REPACK_HEAP_TUPLES_SCANNED,
*num_tuples);
}
else
{
const int ct_index[] = {
PROGRESS_CLUSTER_HEAP_TUPLES_SCANNED,
PROGRESS_CLUSTER_HEAP_TUPLES_WRITTEN
PROGRESS_REPACK_HEAP_TUPLES_SCANNED,
PROGRESS_REPACK_HEAP_TUPLES_WRITTEN
};
int64 ct_val[2];
@ -966,14 +966,14 @@ heapam_relation_copy_for_cluster(Relation OldHeap, Relation NewHeap,
double n_tuples = 0;
/* Report that we are now sorting tuples */
pgstat_progress_update_param(PROGRESS_CLUSTER_PHASE,
PROGRESS_CLUSTER_PHASE_SORT_TUPLES);
pgstat_progress_update_param(PROGRESS_REPACK_PHASE,
PROGRESS_REPACK_PHASE_SORT_TUPLES);
tuplesort_performsort(tuplesort);
/* Report that we are now writing new heap */
pgstat_progress_update_param(PROGRESS_CLUSTER_PHASE,
PROGRESS_CLUSTER_PHASE_WRITE_NEW_HEAP);
pgstat_progress_update_param(PROGRESS_REPACK_PHASE,
PROGRESS_REPACK_PHASE_WRITE_NEW_HEAP);
for (;;)
{
@ -991,7 +991,7 @@ heapam_relation_copy_for_cluster(Relation OldHeap, Relation NewHeap,
values, isnull,
rwstate);
/* Report n_tuples */
pgstat_progress_update_param(PROGRESS_CLUSTER_HEAP_TUPLES_WRITTEN,
pgstat_progress_update_param(PROGRESS_REPACK_HEAP_TUPLES_WRITTEN,
n_tuples);
}

@ -4077,7 +4077,7 @@ reindex_relation(const ReindexStmt *stmt, Oid relid, int flags,
Assert(!ReindexIsProcessingIndex(indexOid));
/* Set index rebuild count */
pgstat_progress_update_param(PROGRESS_CLUSTER_INDEX_REBUILD_COUNT,
pgstat_progress_update_param(PROGRESS_REPACK_INDEX_REBUILD_COUNT,
i);
i++;
}

@ -1311,14 +1311,15 @@ CREATE VIEW pg_stat_progress_vacuum AS
FROM pg_stat_get_progress_info('VACUUM') AS S
LEFT JOIN pg_database D ON S.datid = D.oid;
CREATE VIEW pg_stat_progress_cluster AS
CREATE VIEW pg_stat_progress_repack AS
SELECT
S.pid AS pid,
S.datid AS datid,
D.datname AS datname,
S.relid AS relid,
CASE S.param1 WHEN 1 THEN 'CLUSTER'
WHEN 2 THEN 'VACUUM FULL'
WHEN 2 THEN 'REPACK'
WHEN 3 THEN 'VACUUM FULL'
END AS command,
CASE S.param2 WHEN 0 THEN 'initializing'
WHEN 1 THEN 'seq scanning heap'
@ -1329,15 +1330,35 @@ CREATE VIEW pg_stat_progress_cluster AS
WHEN 6 THEN 'rebuilding index'
WHEN 7 THEN 'performing final cleanup'
END AS phase,
CAST(S.param3 AS oid) AS cluster_index_relid,
CAST(S.param3 AS oid) AS repack_index_relid,
S.param4 AS heap_tuples_scanned,
S.param5 AS heap_tuples_written,
S.param6 AS heap_blks_total,
S.param7 AS heap_blks_scanned,
S.param8 AS index_rebuild_count
FROM pg_stat_get_progress_info('CLUSTER') AS S
FROM pg_stat_get_progress_info('REPACK') AS S
LEFT JOIN pg_database D ON S.datid = D.oid;
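For readers following the phase decoding in this view (the block below is an illustration, not part of the commit), here is a small C sketch of the same lookup. Only codes 0, 1, 6 and 7 are visible in this hunk. Codes 2 through 5 are an assumption, filled in from the order of the REPACK Phases table in the documentation. The function name `repack_phase_name` is hypothetical.

```c
#include <stddef.h>

/*
 * Phase names indexed by the numeric code stored in param2.
 * Entries 0, 1, 6 and 7 match the CASE expression in the view;
 * entries 2..5 are ASSUMED from the order of the documented phase table.
 */
static const char *const repack_phase_names[] = {
    "initializing",             /* 0 */
    "seq scanning heap",        /* 1 */
    "index scanning heap",      /* 2 (assumed) */
    "sorting tuples",           /* 3 (assumed) */
    "writing new heap",         /* 4 (assumed) */
    "swapping relation files",  /* 5 (assumed) */
    "rebuilding index",         /* 6 */
    "performing final cleanup"  /* 7 */
};

/*
 * Hypothetical helper: return the phase name for a code, or NULL for an
 * unknown code (the SQL CASE expression yields NULL in that situation too).
 */
const char *
repack_phase_name(long phase)
{
    const long  count = (long) (sizeof(repack_phase_names) /
                                sizeof(repack_phase_names[0]));

    if (phase < 0 || phase >= count)
        return NULL;
    return repack_phase_names[phase];
}
```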
-- This view is the same as the one above, except that it renames a column
-- and avoids reporting 'REPACK' as a command name.
CREATE VIEW pg_stat_progress_cluster AS
SELECT
pid,
datid,
datname,
relid,
CASE WHEN command IN ('CLUSTER', 'VACUUM FULL') THEN command
WHEN repack_index_relid = 0 THEN 'VACUUM FULL'
ELSE 'CLUSTER' END AS command,
phase,
repack_index_relid AS cluster_index_relid,
heap_tuples_scanned,
heap_tuples_written,
heap_blks_total,
heap_blks_scanned,
index_rebuild_count
FROM pg_stat_progress_repack;
CREATE VIEW pg_stat_progress_create_index AS
SELECT
S.pid AS pid, S.datid AS datid, D.datname AS datname,
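
Reviewer note (illustrative, not part of the commit): the compatibility view pg_stat_progress_cluster above never reports 'REPACK' as a command name. The sketch below restates its CASE expression in plain C so the fallback rule is easier to check by hand. The helper names repack_command_name and legacy_cluster_command_name are hypothetical; only the numeric values 1, 2, 3 and the reported strings come from the diff.

```c
#include <string.h>

/* Same numeric values as the RepackCommand enum added by this commit. */
typedef enum RepackCommand
{
    REPACK_COMMAND_CLUSTER = 1,
    REPACK_COMMAND_REPACK,
    REPACK_COMMAND_VACUUMFULL
} RepackCommand;

/*
 * Hypothetical helper: the text pg_stat_progress_repack reports for
 * param1, or NULL when the value is not recognized (SQL NULL in the view).
 */
static const char *
repack_command_name(int param1)
{
    switch (param1)
    {
        case REPACK_COMMAND_CLUSTER:
            return "CLUSTER";
        case REPACK_COMMAND_REPACK:
            return "REPACK";
        case REPACK_COMMAND_VACUUMFULL:
            return "VACUUM FULL";
        default:
            return NULL;
    }
}

/*
 * Hypothetical helper: the text pg_stat_progress_cluster reports.
 * CLUSTER and VACUUM FULL pass through unchanged; anything else
 * (notably REPACK) is classified by whether an index OID is set.
 */
static const char *
legacy_cluster_command_name(int param1, unsigned int index_relid)
{
    const char *name = repack_command_name(param1);

    if (name != NULL &&
        (strcmp(name, "CLUSTER") == 0 || strcmp(name, "VACUUM FULL") == 0))
        return name;

    return (index_relid == 0) ? "VACUUM FULL" : "CLUSTER";
}
```

Under these assumptions, a REPACK without an index shows up in the legacy view as VACUUM FULL, and a REPACK with an index shows up as CLUSTER, so tools that only know the old view keep seeing familiar values.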

File diff suppressed because it is too large

@ -352,7 +352,6 @@ ExecVacuum(ParseState *pstate, VacuumStmt *vacstmt, bool isTopLevel)
}
}
/*
* Sanity check DISABLE_PAGE_SKIPPING option.
*/
@ -2294,8 +2293,9 @@ vacuum_rel(Oid relid, RangeVar *relation, VacuumParams params,
if ((params.options & VACOPT_VERBOSE) != 0)
cluster_params.options |= CLUOPT_VERBOSE;
/* VACUUM FULL is now a variant of CLUSTER; see cluster.c */
cluster_rel(rel, InvalidOid, &cluster_params);
/* VACUUM FULL is a variant of REPACK; see cluster.c */
cluster_rel(REPACK_COMMAND_VACUUMFULL, rel, InvalidOid,
&cluster_params);
/* cluster_rel closes the relation, but keeps lock */
rel = NULL;

@ -288,7 +288,7 @@ static Node *makeRecursiveViewSelect(char *relname, List *aliases, Node *query);
AlterCompositeTypeStmt AlterUserMappingStmt
AlterRoleStmt AlterRoleSetStmt AlterPolicyStmt AlterStatsStmt
AlterDefaultPrivilegesStmt DefACLAction
AnalyzeStmt CallStmt ClosePortalStmt ClusterStmt CommentStmt
AnalyzeStmt CallStmt ClosePortalStmt CommentStmt
ConstraintsSetStmt CopyStmt CreateAsStmt CreateCastStmt
CreateDomainStmt CreateExtensionStmt CreateGroupStmt CreateOpClassStmt
CreateOpFamilyStmt AlterOpFamilyStmt CreatePLangStmt
@ -305,7 +305,7 @@ static Node *makeRecursiveViewSelect(char *relname, List *aliases, Node *query);
GrantStmt GrantRoleStmt ImportForeignSchemaStmt IndexStmt InsertStmt
ListenStmt LoadStmt LockStmt MergeStmt NotifyStmt ExplainableStmt PreparableStmt
CreateFunctionStmt AlterFunctionStmt ReindexStmt RemoveAggrStmt
RemoveFuncStmt RemoveOperStmt RenameStmt ReturnStmt RevokeStmt RevokeRoleStmt
RemoveFuncStmt RemoveOperStmt RenameStmt RepackStmt ReturnStmt RevokeStmt RevokeRoleStmt
RuleActionStmt RuleActionStmtOrEmpty RuleStmt
SecLabelStmt SelectStmt TransactionStmt TransactionStmtLegacy TruncateStmt
UnlistenStmt UpdateStmt VacuumStmt
@ -324,7 +324,7 @@ static Node *makeRecursiveViewSelect(char *relname, List *aliases, Node *query);
%type <str> opt_single_name
%type <list> opt_qualified_name
%type <boolean> opt_concurrently
%type <boolean> opt_concurrently opt_usingindex
%type <dbehavior> opt_drop_behavior
%type <list> opt_utility_option_list
%type <list> opt_wait_with_clause
@ -776,7 +776,7 @@ static Node *makeRecursiveViewSelect(char *relname, List *aliases, Node *query);
QUOTE QUOTES
RANGE READ REAL REASSIGN RECURSIVE REF_P REFERENCES REFERENCING
REFRESH REINDEX RELATIVE_P RELEASE RENAME REPEATABLE REPLACE REPLICA
REFRESH REINDEX RELATIVE_P RELEASE RENAME REPACK REPEATABLE REPLACE REPLICA
RESET RESPECT_P RESTART RESTRICT RETURN RETURNING RETURNS REVOKE RIGHT ROLE ROLLBACK ROLLUP
ROUTINE ROUTINES ROW ROWS RULE
@ -1038,7 +1038,6 @@ stmt:
| CallStmt
| CheckPointStmt
| ClosePortalStmt
| ClusterStmt
| CommentStmt
| ConstraintsSetStmt
| CopyStmt
@ -1112,6 +1111,7 @@ stmt:
| RemoveFuncStmt
| RemoveOperStmt
| RenameStmt
| RepackStmt
| RevokeStmt
| RevokeRoleStmt
| RuleStmt
@ -1149,6 +1149,11 @@ opt_concurrently:
| /*EMPTY*/ { $$ = false; }
;
opt_usingindex:
USING INDEX { $$ = true; }
| /* EMPTY */ { $$ = false; }
;
opt_drop_behavior:
CASCADE { $$ = DROP_CASCADE; }
| RESTRICT { $$ = DROP_RESTRICT; }
@ -12085,38 +12090,82 @@ CreateConversionStmt:
/*****************************************************************************
*
* QUERY:
* REPACK [ (options) ] [ <qualified_name> [ <name_list> ] [ USING INDEX <index_name> ] ]
*
* obsolete variants:
* CLUSTER (options) [ <qualified_name> [ USING <index_name> ] ]
* CLUSTER [VERBOSE] [ <qualified_name> [ USING <index_name> ] ]
* CLUSTER [VERBOSE] <index_name> ON <qualified_name> (for pre-8.3)
*
*****************************************************************************/
ClusterStmt:
CLUSTER '(' utility_option_list ')' qualified_name cluster_index_specification
RepackStmt:
REPACK opt_utility_option_list vacuum_relation USING INDEX name
{
ClusterStmt *n = makeNode(ClusterStmt);
RepackStmt *n = makeNode(RepackStmt);
n->relation = $5;
n->command = REPACK_COMMAND_REPACK;
n->relation = (VacuumRelation *) $3;
n->indexname = $6;
n->usingindex = true;
n->params = $2;
$$ = (Node *) n;
}
| REPACK opt_utility_option_list vacuum_relation opt_usingindex
{
RepackStmt *n = makeNode(RepackStmt);
n->command = REPACK_COMMAND_REPACK;
n->relation = (VacuumRelation *) $3;
n->indexname = NULL;
n->usingindex = $4;
n->params = $2;
$$ = (Node *) n;
}
| REPACK opt_utility_option_list opt_usingindex
{
RepackStmt *n = makeNode(RepackStmt);
n->command = REPACK_COMMAND_REPACK;
n->relation = NULL;
n->indexname = NULL;
n->usingindex = $3;
n->params = $2;
$$ = (Node *) n;
}
| CLUSTER '(' utility_option_list ')' qualified_name cluster_index_specification
{
RepackStmt *n = makeNode(RepackStmt);
n->command = REPACK_COMMAND_CLUSTER;
n->relation = makeNode(VacuumRelation);
n->relation->relation = $5;
n->indexname = $6;
n->usingindex = true;
n->params = $3;
$$ = (Node *) n;
}
| CLUSTER opt_utility_option_list
{
ClusterStmt *n = makeNode(ClusterStmt);
RepackStmt *n = makeNode(RepackStmt);
n->command = REPACK_COMMAND_CLUSTER;
n->relation = NULL;
n->indexname = NULL;
n->usingindex = true;
n->params = $2;
$$ = (Node *) n;
}
/* unparenthesized VERBOSE kept for pre-14 compatibility */
| CLUSTER opt_verbose qualified_name cluster_index_specification
{
ClusterStmt *n = makeNode(ClusterStmt);
RepackStmt *n = makeNode(RepackStmt);
n->relation = $3;
n->command = REPACK_COMMAND_CLUSTER;
n->relation = makeNode(VacuumRelation);
n->relation->relation = $3;
n->indexname = $4;
n->usingindex = true;
if ($2)
n->params = list_make1(makeDefElem("verbose", NULL, @2));
$$ = (Node *) n;
@ -12124,20 +12173,25 @@ ClusterStmt:
/* unparenthesized VERBOSE kept for pre-17 compatibility */
| CLUSTER VERBOSE
{
ClusterStmt *n = makeNode(ClusterStmt);
RepackStmt *n = makeNode(RepackStmt);
n->command = REPACK_COMMAND_CLUSTER;
n->relation = NULL;
n->indexname = NULL;
n->usingindex = true;
n->params = list_make1(makeDefElem("verbose", NULL, @2));
$$ = (Node *) n;
}
/* kept for pre-8.3 compatibility */
| CLUSTER opt_verbose name ON qualified_name
{
ClusterStmt *n = makeNode(ClusterStmt);
RepackStmt *n = makeNode(RepackStmt);
n->relation = $5;
n->command = REPACK_COMMAND_CLUSTER;
n->relation = makeNode(VacuumRelation);
n->relation->relation = $5;
n->indexname = $3;
n->usingindex = true;
if ($2)
n->params = list_make1(makeDefElem("verbose", NULL, @2));
$$ = (Node *) n;
@ -18194,6 +18248,7 @@ unreserved_keyword:
| RELATIVE_P
| RELEASE
| RENAME
| REPACK
| REPEATABLE
| REPLACE
| REPLICA
@ -18831,6 +18886,7 @@ bare_label_keyword:
| RELATIVE_P
| RELEASE
| RENAME
| REPACK
| REPEATABLE
| REPLACE
| REPLICA
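
Reviewer note (illustrative, not part of the commit): the grammar above fills two fields, usingindex and indexname, and their combination selects the ordering mode. The executor code lives in the file whose diff is suppressed, so the interpretation below is an assumption drawn from the grammar actions and the regression tests (REPACK without USING INDEX performs no ordering; USING INDEX without a name is rejected for partitioned tables). The enum RepackOrdering and the function repack_ordering are hypothetical names.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical classification of the ordering a RepackStmt asks for. */
typedef enum RepackOrdering
{
    REPACK_ORDER_NONE,              /* plain rewrite, as VACUUM FULL does */
    REPACK_ORDER_NAMED_INDEX,       /* USING INDEX <name>, or CLUSTER ... USING <name> */
    REPACK_ORDER_CLUSTERED_INDEX    /* USING INDEX with no name: assumed to mean
                                     * "use the index already marked as clustered" */
} RepackOrdering;

/*
 * Classify a statement from the two fields the grammar sets.
 * All CLUSTER productions set usingindex = true, so a bare CLUSTER
 * lands in REPACK_ORDER_CLUSTERED_INDEX under this reading.
 */
static RepackOrdering
repack_ordering(bool usingindex, const char *indexname)
{
    if (indexname != NULL)
        return REPACK_ORDER_NAMED_INDEX;
    if (usingindex)
        return REPACK_ORDER_CLUSTERED_INDEX;
    return REPACK_ORDER_NONE;
}
```

For example, `REPACK tab` would map to REPACK_ORDER_NONE, `REPACK tab USING INDEX` to REPACK_ORDER_CLUSTERED_INDEX, and `REPACK tab USING INDEX idx` to REPACK_ORDER_NAMED_INDEX.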

@ -279,9 +279,9 @@ ClassifyUtilityCommandAsReadOnly(Node *parsetree)
return COMMAND_OK_IN_RECOVERY | COMMAND_OK_IN_READ_ONLY_TXN;
}
case T_ClusterStmt:
case T_ReindexStmt:
case T_VacuumStmt:
case T_RepackStmt:
{
/*
* These commands write WAL, so they're not strictly
@ -290,9 +290,9 @@ ClassifyUtilityCommandAsReadOnly(Node *parsetree)
*
* However, they don't change the database state in a way that
* would affect pg_dump output, so it's fine to run them in a
* read-only transaction. (CLUSTER might change the order of
* rows on disk, which could affect the ordering of pg_dump
* output, but that's not semantically significant.)
* read-only transaction. (REPACK/CLUSTER might change the
* order of rows on disk, which could affect the ordering of
* pg_dump output, but that's not semantically significant.)
*/
return COMMAND_OK_IN_READ_ONLY_TXN;
}
@ -856,14 +856,14 @@ standard_ProcessUtility(PlannedStmt *pstmt,
ExecuteCallStmt(castNode(CallStmt, parsetree), params, isAtomicContext, dest);
break;
case T_ClusterStmt:
cluster(pstate, (ClusterStmt *) parsetree, isTopLevel);
break;
case T_VacuumStmt:
ExecVacuum(pstate, (VacuumStmt *) parsetree, isTopLevel);
break;
case T_RepackStmt:
ExecRepack(pstate, (RepackStmt *) parsetree, isTopLevel);
break;
case T_ExplainStmt:
ExplainQuery(pstate, (ExplainStmt *) parsetree, params, dest);
break;
@ -2865,10 +2865,6 @@ CreateCommandTag(Node *parsetree)
tag = CMDTAG_CALL;
break;
case T_ClusterStmt:
tag = CMDTAG_CLUSTER;
break;
case T_VacuumStmt:
if (((VacuumStmt *) parsetree)->is_vacuumcmd)
tag = CMDTAG_VACUUM;
@ -2876,6 +2872,13 @@ CreateCommandTag(Node *parsetree)
tag = CMDTAG_ANALYZE;
break;
case T_RepackStmt:
if (((RepackStmt *) parsetree)->command == REPACK_COMMAND_CLUSTER)
tag = CMDTAG_CLUSTER;
else
tag = CMDTAG_REPACK;
break;
case T_ExplainStmt:
tag = CMDTAG_EXPLAIN;
break;
@ -3517,7 +3520,7 @@ GetCommandLogLevel(Node *parsetree)
lev = LOGSTMT_ALL;
break;
case T_ClusterStmt:
case T_RepackStmt:
lev = LOGSTMT_DDL;
break;

@ -288,8 +288,8 @@ pg_stat_get_progress_info(PG_FUNCTION_ARGS)
cmdtype = PROGRESS_COMMAND_VACUUM;
else if (pg_strcasecmp(cmd, "ANALYZE") == 0)
cmdtype = PROGRESS_COMMAND_ANALYZE;
else if (pg_strcasecmp(cmd, "CLUSTER") == 0)
cmdtype = PROGRESS_COMMAND_CLUSTER;
else if (pg_strcasecmp(cmd, "REPACK") == 0)
cmdtype = PROGRESS_COMMAND_REPACK;
else if (pg_strcasecmp(cmd, "CREATE INDEX") == 0)
cmdtype = PROGRESS_COMMAND_CREATE_INDEX;
else if (pg_strcasecmp(cmd, "BASEBACKUP") == 0)
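
Reviewer note (illustrative, not part of the commit): after this hunk the progress function accepts "REPACK" and no longer accepts "CLUSTER" as a command name. A table-driven version of the same case-insensitive lookup is sketched below. It is self-contained, so it uses a local comparison helper (ci_equal, hypothetical) instead of pg_strcasecmp, and it returns PROGRESS_COMMAND_INVALID for unknown names where the real function would presumably report an error. The enum order follows the ProgressCommandType change in this commit.

```c
#include <ctype.h>
#include <stddef.h>

/* Enum order as it stands after this commit. */
typedef enum ProgressCommandType
{
    PROGRESS_COMMAND_INVALID,
    PROGRESS_COMMAND_VACUUM,
    PROGRESS_COMMAND_ANALYZE,
    PROGRESS_COMMAND_CREATE_INDEX,
    PROGRESS_COMMAND_BASEBACKUP,
    PROGRESS_COMMAND_COPY,
    PROGRESS_COMMAND_REPACK
} ProgressCommandType;

/* Hypothetical stand-in for pg_strcasecmp(a, b) == 0 (ASCII only). */
static int
ci_equal(const char *a, const char *b)
{
    while (*a != '\0' && *b != '\0')
    {
        if (tolower((unsigned char) *a) != tolower((unsigned char) *b))
            return 0;
        a++;
        b++;
    }
    /* Equal only if both strings ended at the same position. */
    return *a == *b;
}

/* Map a command name to its progress command type. */
static ProgressCommandType
progress_command_from_name(const char *cmd)
{
    static const struct
    {
        const char *name;
        ProgressCommandType type;
    } table[] = {
        {"VACUUM", PROGRESS_COMMAND_VACUUM},
        {"ANALYZE", PROGRESS_COMMAND_ANALYZE},
        {"REPACK", PROGRESS_COMMAND_REPACK},
        {"CREATE INDEX", PROGRESS_COMMAND_CREATE_INDEX},
        {"BASEBACKUP", PROGRESS_COMMAND_BASEBACKUP},
        {"COPY", PROGRESS_COMMAND_COPY},
    };
    size_t i;

    for (i = 0; i < sizeof(table) / sizeof(table[0]); i++)
    {
        if (ci_equal(cmd, table[i].name))
            return table[i].type;
    }
    return PROGRESS_COMMAND_INVALID;
}
```

One consequence worth noting for external callers: anything that called pg_stat_get_progress_info('CLUSTER') directly, rather than going through the pg_stat_progress_cluster view, would need to switch to 'REPACK'.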

@ -1267,7 +1267,7 @@ static const char *const sql_commands[] = {
"DELETE FROM", "DISCARD", "DO", "DROP", "END", "EXECUTE", "EXPLAIN",
"FETCH", "GRANT", "IMPORT FOREIGN SCHEMA", "INSERT INTO", "LISTEN", "LOAD", "LOCK",
"MERGE INTO", "MOVE", "NOTIFY", "PREPARE",
"REASSIGN", "REFRESH MATERIALIZED VIEW", "REINDEX", "RELEASE",
"REASSIGN", "REFRESH MATERIALIZED VIEW", "REINDEX", "RELEASE", "REPACK",
"RESET", "REVOKE", "ROLLBACK",
"SAVEPOINT", "SECURITY LABEL", "SELECT", "SET", "SHOW", "START",
"TABLE", "TRUNCATE", "UNLISTEN", "UPDATE", "VACUUM", "VALUES",
@ -5117,6 +5117,47 @@ match_previous_words(int pattern_id,
COMPLETE_WITH_QUERY(Query_for_list_of_tablespaces);
}
/* REPACK */
else if (Matches("REPACK"))
COMPLETE_WITH_SCHEMA_QUERY_PLUS(Query_for_list_of_clusterables,
"(", "USING INDEX");
else if (Matches("REPACK", "(*)"))
COMPLETE_WITH_SCHEMA_QUERY_PLUS(Query_for_list_of_clusterables,
"USING INDEX");
else if (Matches("REPACK", MatchAnyExcept("(")))
COMPLETE_WITH("USING INDEX");
else if (Matches("REPACK", "(*)", MatchAnyExcept("(")))
COMPLETE_WITH("USING INDEX");
else if (Matches("REPACK", MatchAny, "USING", "INDEX") ||
Matches("REPACK", "(*)", MatchAny, "USING", "INDEX"))
{
set_completion_reference(prev3_wd);
COMPLETE_WITH_SCHEMA_QUERY(Query_for_index_of_table);
}
/*
* Complete ... [ (*) ] <sth> USING INDEX, with a list of indexes for
* <sth>.
*/
else if (TailMatches(MatchAny, "USING", "INDEX"))
{
set_completion_reference(prev3_wd);
COMPLETE_WITH_SCHEMA_QUERY(Query_for_index_of_table);
}
else if (HeadMatches("REPACK", "(*") &&
!HeadMatches("REPACK", "(*)"))
{
/*
* This fires if we're in an unfinished parenthesized option list.
* get_previous_words treats a completed parenthesized option list as
* one word, so the above test is correct.
*/
if (ends_with(prev_wd, '(') || ends_with(prev_wd, ','))
COMPLETE_WITH("ANALYZE", "VERBOSE");
else if (TailMatches("ANALYZE", "VERBOSE"))
COMPLETE_WITH("ON", "OFF");
}
/* SECURITY LABEL */
else if (Matches("SECURITY"))
COMPLETE_WITH("LABEL");

@ -57,6 +57,6 @@
*/
/* yyyymmddN */
#define CATALOG_VERSION_NO 202603062
#define CATALOG_VERSION_NO 202603101
#endif

@ -24,6 +24,7 @@
#define CLUOPT_RECHECK 0x02 /* recheck relation state */
#define CLUOPT_RECHECK_ISCLUSTERED 0x04 /* recheck relation state for
* indisclustered */
#define CLUOPT_ANALYZE 0x08 /* do an ANALYZE */
/* options for CLUSTER */
typedef struct ClusterParams
@ -31,8 +32,11 @@ typedef struct ClusterParams
bits32 options; /* bitmask of CLUOPT_* */
} ClusterParams;
extern void cluster(ParseState *pstate, ClusterStmt *stmt, bool isTopLevel);
extern void cluster_rel(Relation OldHeap, Oid indexOid, ClusterParams *params);
extern void ExecRepack(ParseState *pstate, RepackStmt *stmt, bool isTopLevel);
extern void cluster_rel(RepackCommand command, Relation OldHeap, Oid indexOid,
ClusterParams *params);
extern void check_index_is_clusterable(Relation OldHeap, Oid indexOid,
LOCKMODE lockmode);
extern void mark_index_clustered(Relation rel, Oid indexOid, bool is_internal);
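
Reviewer note (illustrative, not part of the commit): the new CLUOPT_ANALYZE bit joins the existing option bitmask, and the regression tests in this commit show that a column list is rejected unless ANALYZE is given ("ANALYZE option must be specified when a column list is provided"). The sketch below shows one way such a check could be expressed. The RepackRequest struct and build_repack_options are hypothetical, and the value 0x01 for CLUOPT_VERBOSE is an assumption because that line is not part of the hunk.

```c
#include <stdbool.h>
#include <stdint.h>

#define CLUOPT_VERBOSE 0x01     /* assumed value; not shown in the hunk */
#define CLUOPT_ANALYZE 0x08     /* value taken from the hunk above */

/* Hypothetical, already-parsed form of a REPACK invocation. */
typedef struct RepackRequest
{
    bool        verbose;        /* VERBOSE option given */
    bool        analyze;        /* ANALYZE option given */
    int         ncolumns;       /* length of the column list, 0 if none */
} RepackRequest;

/*
 * Build the option bitmask. Returns false, leaving *options untouched,
 * when a column list is given without ANALYZE.
 */
static bool
build_repack_options(const RepackRequest *req, uint32_t *options)
{
    uint32_t    opts = 0;

    if (req->verbose)
        opts |= CLUOPT_VERBOSE;
    if (req->analyze)
        opts |= CLUOPT_ANALYZE;

    /* A column list only makes sense for the ANALYZE step. */
    if (req->ncolumns > 0 && (opts & CLUOPT_ANALYZE) == 0)
        return false;

    *options = opts;
    return true;
}
```

With this sketch, the equivalent of `REPACK (ANALYZE) clstr_tst (a)` succeeds with only the ANALYZE bit set, while the equivalent of `REPACK (VERBOSE) clstr_tst (a)` is rejected.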

@ -73,28 +73,34 @@
#define PROGRESS_ANALYZE_STARTED_BY_MANUAL 1
#define PROGRESS_ANALYZE_STARTED_BY_AUTOVACUUM 2
/* Progress parameters for cluster */
#define PROGRESS_CLUSTER_COMMAND 0
#define PROGRESS_CLUSTER_PHASE 1
#define PROGRESS_CLUSTER_INDEX_RELID 2
#define PROGRESS_CLUSTER_HEAP_TUPLES_SCANNED 3
#define PROGRESS_CLUSTER_HEAP_TUPLES_WRITTEN 4
#define PROGRESS_CLUSTER_TOTAL_HEAP_BLKS 5
#define PROGRESS_CLUSTER_HEAP_BLKS_SCANNED 6
#define PROGRESS_CLUSTER_INDEX_REBUILD_COUNT 7
/*
* Progress parameters for REPACK.
*
* Values for PROGRESS_REPACK_COMMAND are as in RepackCommand.
*
* Note: Since REPACK shares code with CLUSTER, these values are also
* used by CLUSTER. (CLUSTER being now deprecated, it makes little sense to
* maintain a separate set of constants.)
*/
#define PROGRESS_REPACK_COMMAND 0
#define PROGRESS_REPACK_PHASE 1
#define PROGRESS_REPACK_INDEX_RELID 2
#define PROGRESS_REPACK_HEAP_TUPLES_SCANNED 3
#define PROGRESS_REPACK_HEAP_TUPLES_WRITTEN 4
#define PROGRESS_REPACK_TOTAL_HEAP_BLKS 5
#define PROGRESS_REPACK_HEAP_BLKS_SCANNED 6
#define PROGRESS_REPACK_INDEX_REBUILD_COUNT 7
/* Phases of cluster (as advertised via PROGRESS_CLUSTER_PHASE) */
#define PROGRESS_CLUSTER_PHASE_SEQ_SCAN_HEAP 1
#define PROGRESS_CLUSTER_PHASE_INDEX_SCAN_HEAP 2
#define PROGRESS_CLUSTER_PHASE_SORT_TUPLES 3
#define PROGRESS_CLUSTER_PHASE_WRITE_NEW_HEAP 4
#define PROGRESS_CLUSTER_PHASE_SWAP_REL_FILES 5
#define PROGRESS_CLUSTER_PHASE_REBUILD_INDEX 6
#define PROGRESS_CLUSTER_PHASE_FINAL_CLEANUP 7
/* Commands of PROGRESS_CLUSTER */
#define PROGRESS_CLUSTER_COMMAND_CLUSTER 1
#define PROGRESS_CLUSTER_COMMAND_VACUUM_FULL 2
/*
* Phases of repack (as advertised via PROGRESS_REPACK_PHASE).
*/
#define PROGRESS_REPACK_PHASE_SEQ_SCAN_HEAP 1
#define PROGRESS_REPACK_PHASE_INDEX_SCAN_HEAP 2
#define PROGRESS_REPACK_PHASE_SORT_TUPLES 3
#define PROGRESS_REPACK_PHASE_WRITE_NEW_HEAP 4
#define PROGRESS_REPACK_PHASE_SWAP_REL_FILES 5
#define PROGRESS_REPACK_PHASE_REBUILD_INDEX 6
#define PROGRESS_REPACK_PHASE_FINAL_CLEANUP 7
/* Progress parameters for CREATE INDEX */
/* 3, 4 and 5 reserved for "waitfor" metrics */

@ -3982,18 +3982,6 @@ typedef struct AlterSystemStmt
VariableSetStmt *setstmt; /* SET subcommand */
} AlterSystemStmt;
/* ----------------------
* Cluster Statement (support pbrown's cluster index implementation)
* ----------------------
*/
typedef struct ClusterStmt
{
NodeTag type;
RangeVar *relation; /* relation being indexed, or NULL if all */
char *indexname; /* original index defined */
List *params; /* list of DefElem nodes */
} ClusterStmt;
/* ----------------------
* Vacuum and Analyze Statements
*
@ -4006,7 +3994,7 @@ typedef struct VacuumStmt
NodeTag type;
List *options; /* list of DefElem nodes */
List *rels; /* list of VacuumRelation, or NIL for all */
bool is_vacuumcmd; /* true for VACUUM, false for ANALYZE */
bool is_vacuumcmd; /* true for VACUUM, false otherwise */
} VacuumStmt;
/*
@ -4024,6 +4012,27 @@ typedef struct VacuumRelation
List *va_cols; /* list of column names, or NIL for all */
} VacuumRelation;
/* ----------------------
* Repack Statement
* ----------------------
*/
typedef enum RepackCommand
{
REPACK_COMMAND_CLUSTER = 1,
REPACK_COMMAND_REPACK,
REPACK_COMMAND_VACUUMFULL,
} RepackCommand;
typedef struct RepackStmt
{
NodeTag type;
RepackCommand command; /* type of command being run */
VacuumRelation *relation; /* relation being repacked */
char *indexname; /* order tuples by this index */
bool usingindex; /* whether USING INDEX is specified */
List *params; /* list of DefElem nodes */
} RepackStmt;
/* ----------------------
* Explain Statement
*

@ -377,6 +377,7 @@ PG_KEYWORD("reindex", REINDEX, UNRESERVED_KEYWORD, BARE_LABEL)
PG_KEYWORD("relative", RELATIVE_P, UNRESERVED_KEYWORD, BARE_LABEL)
PG_KEYWORD("release", RELEASE, UNRESERVED_KEYWORD, BARE_LABEL)
PG_KEYWORD("rename", RENAME, UNRESERVED_KEYWORD, BARE_LABEL)
PG_KEYWORD("repack", REPACK, UNRESERVED_KEYWORD, BARE_LABEL)
PG_KEYWORD("repeatable", REPEATABLE, UNRESERVED_KEYWORD, BARE_LABEL)
PG_KEYWORD("replace", REPLACE, UNRESERVED_KEYWORD, BARE_LABEL)
PG_KEYWORD("replica", REPLICA, UNRESERVED_KEYWORD, BARE_LABEL)

@ -196,6 +196,7 @@ PG_CMDTAG(CMDTAG_REASSIGN_OWNED, "REASSIGN OWNED", false, false, false)
PG_CMDTAG(CMDTAG_REFRESH_MATERIALIZED_VIEW, "REFRESH MATERIALIZED VIEW", true, false, false)
PG_CMDTAG(CMDTAG_REINDEX, "REINDEX", true, false, false)
PG_CMDTAG(CMDTAG_RELEASE, "RELEASE", false, false, false)
PG_CMDTAG(CMDTAG_REPACK, "REPACK", false, false, false)
PG_CMDTAG(CMDTAG_RESET, "RESET", false, false, false)
PG_CMDTAG(CMDTAG_REVOKE, "REVOKE", true, false, false)
PG_CMDTAG(CMDTAG_REVOKE_ROLE, "REVOKE ROLE", false, false, false)

@ -24,10 +24,10 @@ typedef enum ProgressCommandType
PROGRESS_COMMAND_INVALID,
PROGRESS_COMMAND_VACUUM,
PROGRESS_COMMAND_ANALYZE,
PROGRESS_COMMAND_CLUSTER,
PROGRESS_COMMAND_CREATE_INDEX,
PROGRESS_COMMAND_BASEBACKUP,
PROGRESS_COMMAND_COPY,
PROGRESS_COMMAND_REPACK,
} ProgressCommandType;
#define PGSTAT_NUM_PROGRESS_PARAM 20

@ -495,6 +495,46 @@ ALTER TABLE clstrpart SET WITHOUT CLUSTER;
ERROR: cannot mark index clustered in partitioned table
ALTER TABLE clstrpart CLUSTER ON clstrpart_idx;
ERROR: cannot mark index clustered in partitioned table
-- and they cannot get an index-ordered REPACK without an explicit index name
REPACK clstrpart USING INDEX;
ERROR: cannot execute REPACK on partitioned table "clstrpart" USING INDEX with no index name
-- Check that REPACK sets new relfilenodes: it should process exactly the same
-- tables as CLUSTER did.
DROP TABLE old_cluster_info;
DROP TABLE new_cluster_info;
CREATE TEMP TABLE old_cluster_info AS SELECT relname, level, relfilenode, relkind FROM pg_partition_tree('clstrpart'::regclass) AS tree JOIN pg_class c ON c.oid=tree.relid ;
REPACK clstrpart USING INDEX clstrpart_idx;
CREATE TEMP TABLE new_cluster_info AS SELECT relname, level, relfilenode, relkind FROM pg_partition_tree('clstrpart'::regclass) AS tree JOIN pg_class c ON c.oid=tree.relid ;
SELECT relname, old.level, old.relkind, old.relfilenode = new.relfilenode FROM old_cluster_info AS old JOIN new_cluster_info AS new USING (relname) ORDER BY relname COLLATE "C";
relname | level | relkind | ?column?
-------------+-------+---------+----------
clstrpart | 0 | p | t
clstrpart1 | 1 | p | t
clstrpart11 | 2 | r | f
clstrpart12 | 2 | p | t
clstrpart2 | 1 | r | f
clstrpart3 | 1 | p | t
clstrpart33 | 2 | r | f
(7 rows)
-- And finally the same for REPACK w/o index.
DROP TABLE old_cluster_info;
DROP TABLE new_cluster_info;
CREATE TEMP TABLE old_cluster_info AS SELECT relname, level, relfilenode, relkind FROM pg_partition_tree('clstrpart'::regclass) AS tree JOIN pg_class c ON c.oid=tree.relid ;
REPACK clstrpart;
CREATE TEMP TABLE new_cluster_info AS SELECT relname, level, relfilenode, relkind FROM pg_partition_tree('clstrpart'::regclass) AS tree JOIN pg_class c ON c.oid=tree.relid ;
SELECT relname, old.level, old.relkind, old.relfilenode = new.relfilenode FROM old_cluster_info AS old JOIN new_cluster_info AS new USING (relname) ORDER BY relname COLLATE "C";
relname | level | relkind | ?column?
-------------+-------+---------+----------
clstrpart | 0 | p | t
clstrpart1 | 1 | p | t
clstrpart11 | 2 | r | f
clstrpart12 | 2 | p | t
clstrpart2 | 1 | r | f
clstrpart3 | 1 | p | t
clstrpart33 | 2 | r | f
(7 rows)
DROP TABLE clstrpart;
-- Ownership of partitions is checked
CREATE TABLE ptnowner(i int unique) PARTITION BY LIST (i);
@ -513,7 +553,7 @@ CREATE TEMP TABLE ptnowner_oldnodes AS
JOIN pg_class AS c ON c.oid=tree.relid;
SET SESSION AUTHORIZATION regress_ptnowner;
CLUSTER ptnowner USING ptnowner_i_idx;
WARNING: permission denied to cluster "ptnowner2", skipping it
WARNING: permission denied to execute CLUSTER on "ptnowner2", skipping it
RESET SESSION AUTHORIZATION;
SELECT a.relname, a.relfilenode=b.relfilenode FROM pg_class a
JOIN ptnowner_oldnodes b USING (oid) ORDER BY a.relname COLLATE "C";
@ -665,6 +705,101 @@ SELECT * FROM clstr_expression WHERE -a = -3 ORDER BY -a, b;
(4 rows)
COMMIT;
----------------------------------------------------------------------
--
-- REPACK
--
----------------------------------------------------------------------
-- REPACK handles individual tables identically to CLUSTER, but it's worth
-- checking if it handles table hierarchies identically as well.
REPACK clstr_tst USING INDEX clstr_tst_c;
-- Verify that inheritance link still works
INSERT INTO clstr_tst_inh VALUES (0, 100, 'in child table 2');
SELECT a,b,c,substring(d for 30), length(d) from clstr_tst;
a | b | c | substring | length
----+-----+------------------+--------------------------------+--------
10 | 14 | catorce | |
18 | 5 | cinco | |
9 | 4 | cuatro | |
26 | 19 | diecinueve | |
12 | 18 | dieciocho | |
30 | 16 | dieciseis | |
24 | 17 | diecisiete | |
2 | 10 | diez | |
23 | 12 | doce | |
11 | 2 | dos | |
25 | 9 | nueve | |
31 | 8 | ocho | |
1 | 11 | once | |
28 | 15 | quince | |
32 | 6 | seis | xyzzyxyzzyxyzzyxyzzyxyzzyxyzzy | 500000
29 | 7 | siete | |
15 | 13 | trece | |
22 | 30 | treinta | |
17 | 32 | treinta y dos | |
3 | 31 | treinta y uno | |
5 | 3 | tres | |
20 | 1 | uno | |
6 | 20 | veinte | |
14 | 25 | veinticinco | |
21 | 24 | veinticuatro | |
4 | 22 | veintidos | |
19 | 29 | veintinueve | |
16 | 28 | veintiocho | |
27 | 26 | veintiseis | |
13 | 27 | veintisiete | |
7 | 23 | veintitres | |
8 | 21 | veintiuno | |
0 | 100 | in child table | |
0 | 100 | in child table 2 | |
(34 rows)
-- Verify that foreign key link still works
INSERT INTO clstr_tst (b, c) VALUES (1111, 'this should fail');
ERROR: insert or update on table "clstr_tst" violates foreign key constraint "clstr_tst_con"
DETAIL: Key (b)=(1111) is not present in table "clstr_tst_s".
SELECT conname FROM pg_constraint WHERE conrelid = 'clstr_tst'::regclass
ORDER BY 1;
conname
----------------------
clstr_tst_a_not_null
clstr_tst_con
clstr_tst_pkey
(3 rows)
-- Verify partial analyze works
REPACK (ANALYZE) clstr_tst (a);
REPACK (ANALYZE) clstr_tst;
REPACK (VERBOSE) clstr_tst (a);
ERROR: ANALYZE option must be specified when a column list is provided
-- REPACK w/o argument performs no ordering, so we can only check which tables
-- have the relfilenode changed.
RESET SESSION AUTHORIZATION;
CREATE TEMP TABLE relnodes_old AS
(SELECT relname, relfilenode
FROM pg_class
WHERE relname IN ('clstr_1', 'clstr_2', 'clstr_3'));
SET SESSION AUTHORIZATION regress_clstr_user;
SET client_min_messages = ERROR; -- order of "skipping" warnings may vary
REPACK;
RESET client_min_messages;
RESET SESSION AUTHORIZATION;
CREATE TEMP TABLE relnodes_new AS
(SELECT relname, relfilenode
FROM pg_class
WHERE relname IN ('clstr_1', 'clstr_2', 'clstr_3'));
-- Do the actual comparison. Unlike CLUSTER, clstr_3 should have been
-- processed because there is nothing like clustering index here.
SELECT o.relname FROM relnodes_old o
JOIN relnodes_new n ON o.relname = n.relname
WHERE o.relfilenode <> n.relfilenode
ORDER BY o.relname;
relname
---------
clstr_1
clstr_3
(2 rows)
-- clean up
DROP TABLE clustertest;
DROP TABLE clstr_1;

@ -2002,34 +2002,23 @@ pg_stat_progress_basebackup| SELECT pid,
ELSE NULL::text
END AS backup_type
FROM pg_stat_get_progress_info('BASEBACKUP'::text) s(pid, datid, relid, param1, param2, param3, param4, param5, param6, param7, param8, param9, param10, param11, param12, param13, param14, param15, param16, param17, param18, param19, param20);
pg_stat_progress_cluster| SELECT s.pid,
s.datid,
d.datname,
s.relid,
CASE s.param1
WHEN 1 THEN 'CLUSTER'::text
WHEN 2 THEN 'VACUUM FULL'::text
ELSE NULL::text
pg_stat_progress_cluster| SELECT pid,
datid,
datname,
relid,
CASE
WHEN (command = ANY (ARRAY['CLUSTER'::text, 'VACUUM FULL'::text])) THEN command
WHEN (repack_index_relid = (0)::oid) THEN 'VACUUM FULL'::text
ELSE 'CLUSTER'::text
END AS command,
CASE s.param2
WHEN 0 THEN 'initializing'::text
WHEN 1 THEN 'seq scanning heap'::text
WHEN 2 THEN 'index scanning heap'::text
WHEN 3 THEN 'sorting tuples'::text
WHEN 4 THEN 'writing new heap'::text
WHEN 5 THEN 'swapping relation files'::text
WHEN 6 THEN 'rebuilding index'::text
WHEN 7 THEN 'performing final cleanup'::text
ELSE NULL::text
END AS phase,
(s.param3)::oid AS cluster_index_relid,
s.param4 AS heap_tuples_scanned,
s.param5 AS heap_tuples_written,
s.param6 AS heap_blks_total,
s.param7 AS heap_blks_scanned,
s.param8 AS index_rebuild_count
FROM (pg_stat_get_progress_info('CLUSTER'::text) s(pid, datid, relid, param1, param2, param3, param4, param5, param6, param7, param8, param9, param10, param11, param12, param13, param14, param15, param16, param17, param18, param19, param20)
LEFT JOIN pg_database d ON ((s.datid = d.oid)));
phase,
repack_index_relid AS cluster_index_relid,
heap_tuples_scanned,
heap_tuples_written,
heap_blks_total,
heap_blks_scanned,
index_rebuild_count
FROM pg_stat_progress_repack;
pg_stat_progress_copy| SELECT s.pid,
s.datid,
d.datname,
@ -2089,6 +2078,35 @@ pg_stat_progress_create_index| SELECT s.pid,
s.param15 AS partitions_done
FROM (pg_stat_get_progress_info('CREATE INDEX'::text) s(pid, datid, relid, param1, param2, param3, param4, param5, param6, param7, param8, param9, param10, param11, param12, param13, param14, param15, param16, param17, param18, param19, param20)
LEFT JOIN pg_database d ON ((s.datid = d.oid)));
pg_stat_progress_repack| SELECT s.pid,
s.datid,
d.datname,
s.relid,
CASE s.param1
WHEN 1 THEN 'CLUSTER'::text
WHEN 2 THEN 'REPACK'::text
WHEN 3 THEN 'VACUUM FULL'::text
ELSE NULL::text
END AS command,
CASE s.param2
WHEN 0 THEN 'initializing'::text
WHEN 1 THEN 'seq scanning heap'::text
WHEN 2 THEN 'index scanning heap'::text
WHEN 3 THEN 'sorting tuples'::text
WHEN 4 THEN 'writing new heap'::text
WHEN 5 THEN 'swapping relation files'::text
WHEN 6 THEN 'rebuilding index'::text
WHEN 7 THEN 'performing final cleanup'::text
ELSE NULL::text
END AS phase,
(s.param3)::oid AS repack_index_relid,
s.param4 AS heap_tuples_scanned,
s.param5 AS heap_tuples_written,
s.param6 AS heap_blks_total,
s.param7 AS heap_blks_scanned,
s.param8 AS index_rebuild_count
FROM (pg_stat_get_progress_info('REPACK'::text) s(pid, datid, relid, param1, param2, param3, param4, param5, param6, param7, param8, param9, param10, param11, param12, param13, param14, param15, param16, param17, param18, param19, param20)
LEFT JOIN pg_database d ON ((s.datid = d.oid)));
pg_stat_progress_vacuum| SELECT s.pid,
s.datid,
d.datname,

@ -76,7 +76,6 @@ INSERT INTO clstr_tst (b, c) VALUES (1111, 'this should fail');
SELECT conname FROM pg_constraint WHERE conrelid = 'clstr_tst'::regclass
ORDER BY 1;
SELECT relname, relkind,
EXISTS(SELECT 1 FROM pg_class WHERE oid = c.reltoastrelid) AS hastoast
FROM pg_class c WHERE relname LIKE 'clstr_tst%' ORDER BY relname;
@ -229,6 +228,26 @@ SELECT relname, old.level, old.relkind, old.relfilenode = new.relfilenode FROM o
CLUSTER clstrpart;
ALTER TABLE clstrpart SET WITHOUT CLUSTER;
ALTER TABLE clstrpart CLUSTER ON clstrpart_idx;
-- and they cannot get an index-ordered REPACK without an explicit index name
REPACK clstrpart USING INDEX;
-- Check that REPACK sets new relfilenodes: it should process exactly the same
-- tables as CLUSTER did.
DROP TABLE old_cluster_info;
DROP TABLE new_cluster_info;
CREATE TEMP TABLE old_cluster_info AS SELECT relname, level, relfilenode, relkind FROM pg_partition_tree('clstrpart'::regclass) AS tree JOIN pg_class c ON c.oid=tree.relid ;
REPACK clstrpart USING INDEX clstrpart_idx;
CREATE TEMP TABLE new_cluster_info AS SELECT relname, level, relfilenode, relkind FROM pg_partition_tree('clstrpart'::regclass) AS tree JOIN pg_class c ON c.oid=tree.relid ;
SELECT relname, old.level, old.relkind, old.relfilenode = new.relfilenode FROM old_cluster_info AS old JOIN new_cluster_info AS new USING (relname) ORDER BY relname COLLATE "C";
-- And finally the same for REPACK w/o index.
DROP TABLE old_cluster_info;
DROP TABLE new_cluster_info;
CREATE TEMP TABLE old_cluster_info AS SELECT relname, level, relfilenode, relkind FROM pg_partition_tree('clstrpart'::regclass) AS tree JOIN pg_class c ON c.oid=tree.relid ;
REPACK clstrpart;
CREATE TEMP TABLE new_cluster_info AS SELECT relname, level, relfilenode, relkind FROM pg_partition_tree('clstrpart'::regclass) AS tree JOIN pg_class c ON c.oid=tree.relid ;
SELECT relname, old.level, old.relkind, old.relfilenode = new.relfilenode FROM old_cluster_info AS old JOIN new_cluster_info AS new USING (relname) ORDER BY relname COLLATE "C";
DROP TABLE clstrpart;
-- Ownership of partitions is checked
@ -313,6 +332,57 @@ EXPLAIN (COSTS OFF) SELECT * FROM clstr_expression WHERE -a = -3 ORDER BY -a, b;
SELECT * FROM clstr_expression WHERE -a = -3 ORDER BY -a, b;
COMMIT;
----------------------------------------------------------------------
--
-- REPACK
--
----------------------------------------------------------------------
-- REPACK handles individual tables identically to CLUSTER, but it's worth
-- checking if it handles table hierarchies identically as well.
REPACK clstr_tst USING INDEX clstr_tst_c;
-- Verify that inheritance link still works
INSERT INTO clstr_tst_inh VALUES (0, 100, 'in child table 2');
SELECT a,b,c,substring(d for 30), length(d) from clstr_tst;
-- Verify that foreign key link still works
INSERT INTO clstr_tst (b, c) VALUES (1111, 'this should fail');
SELECT conname FROM pg_constraint WHERE conrelid = 'clstr_tst'::regclass
ORDER BY 1;
-- Verify partial analyze works
REPACK (ANALYZE) clstr_tst (a);
REPACK (ANALYZE) clstr_tst;
REPACK (VERBOSE) clstr_tst (a);
-- REPACK w/o argument performs no ordering, so we can only check which tables
-- have the relfilenode changed.
RESET SESSION AUTHORIZATION;
CREATE TEMP TABLE relnodes_old AS
(SELECT relname, relfilenode
FROM pg_class
WHERE relname IN ('clstr_1', 'clstr_2', 'clstr_3'));
SET SESSION AUTHORIZATION regress_clstr_user;
SET client_min_messages = ERROR; -- order of "skipping" warnings may vary
REPACK;
RESET client_min_messages;
RESET SESSION AUTHORIZATION;
CREATE TEMP TABLE relnodes_new AS
(SELECT relname, relfilenode
FROM pg_class
WHERE relname IN ('clstr_1', 'clstr_2', 'clstr_3'));
-- Do the actual comparison. Unlike CLUSTER, clstr_3 should have been
-- processed because there is nothing like clustering index here.
SELECT o.relname FROM relnodes_old o
JOIN relnodes_new n ON o.relname = n.relname
WHERE o.relfilenode <> n.relfilenode
ORDER BY o.relname;
-- clean up
DROP TABLE clustertest;
DROP TABLE clstr_1;

@ -2581,6 +2581,8 @@ ReorderBufferTupleCidEnt
ReorderBufferTupleCidKey
ReorderBufferUpdateProgressTxnCB
ReorderTuple
RepackCommand
RepackStmt
ReparameterizeForeignPathByChild_function
ReplOriginId
ReplOriginXactState