6.1. Outline of Concurrent VACUUM

Vacuum processing performs the following tasks for specified tables or all tables in the database:

  1. Removing dead tuples
    • Remove dead tuples and defragment live tuples for each page.
    • Remove index tuples that point to dead tuples.
  2. Freezing old txids
    • Freeze old txids of tuples if necessary.
    • Update frozen txid related system catalogs (pg_database and pg_class).
    • Remove unnecessary parts of the clog if possible.
  3. Others
    • Update the FSM and VM of processed tables.
    • Update several statistics (pg_stat_all_tables, etc).

It is assumed that readers are familiar with the following terms: dead tuples, freezing txids, the FSM, and the clog. If you are not, refer to Chapter 5. The VM is introduced in Section 6.2.

The following pseudocode describes vacuum processing.

Pseudocode: Concurrent VACUUM
(1)    FOR each table
(2)      Acquire a ShareUpdateExclusiveLock for the target table

         /* The first block */
(3)      Scan all pages to get all dead tuples, and freeze old tuples if necessary
(4)      Remove the index tuples that point to the respective dead tuples, if any exist

         /* The second block */
(5)      FOR each page of the table
(6)         Remove the dead tuples, and reallocate the live tuples in the page
(7)         Update the FSM and VM
         END FOR

         /* The third block */
(8)      Clean up indexes
(9)      Truncate the last page if possible
(10)     Update both the statistics and system catalogs of the target table
         Release the ShareUpdateExclusiveLock
       END FOR

       /* Post-processing */
(11)   Update statistics and system catalogs
(12)   Remove both unnecessary files and pages of the clog if possible

  1. Get each table from the specified tables.
  2. Acquire a ShareUpdateExclusiveLock for the table. This lock allows other transactions to read from and write to the table while it is being vacuumed.
  3. Scan all pages to get all dead tuples, and freeze old tuples if necessary.
  4. Remove the index tuples that point to the respective dead tuples, if any exist.
  5. Perform steps (6) and (7) for each page of the table.
  6. Remove the dead tuples and reallocate the live tuples in the page.
  7. Update both the respective FSM and VM of the target table.
  8. Clean up the indexes by the index_vacuum_cleanup()@indexam.c function.
  9. Truncate the last page if it does not contain any tuples.
  10. Update both the statistics and the system catalogs related to vacuum processing for the target table, then release the lock.
  11. Update both the statistics and the system catalogs related to vacuum processing.
  12. Remove both unnecessary files and pages of the clog if possible.

This pseudocode has two sections: a loop over each table and post-processing. The body of the per-table loop can be divided into three blocks, each with its own tasks.

The three blocks and the post-processing are outlined below.

PARALLEL option

The VACUUM command has supported the PARALLEL option since version 13. If this option is set and the table has multiple indexes, the index vacuuming and index cleanup phases are processed in parallel.

Note that this feature is only valid for the VACUUM command and is not supported by autovacuum.

6.1.1. First Block

This block performs freeze processing and removes index tuples that point to dead tuples.

First, PostgreSQL scans the target table to build a list of dead tuples and freezes old tuples if possible. The list is stored in local memory, whose size is limited by maintenance_work_mem. Freeze processing is described in Section 6.3.

After scanning, PostgreSQL removes index tuples by referring to the dead tuple list. This process is internally called the “cleanup stage”. Because it is costly, version 11 made it conditional for B-tree indexes.

In versions 10 or earlier, the cleanup stage is always executed. In versions 11 or later, if the target index is a B-tree, the configuration parameter vacuum_cleanup_index_scale_factor decides whether the cleanup stage is executed.

See the description of this parameter for details.

If the dead tuple list fills maintenance_work_mem before scanning is complete, PostgreSQL proceeds to the next tasks, i.e., steps (4) to (7). It then returns to step (3) and resumes scanning the remaining pages.
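The interplay between the bounded dead tuple list and the repeated passes can be illustrated with a small simulation. This is only a sketch under simplifying assumptions: the function name vacuum_passes and the capacity parameter (a count of dead tuple entries standing in for the maintenance_work_mem limit) are hypothetical, and the exact point at which PostgreSQL decides the list is full is not modeled.

```python
def vacuum_passes(dead_per_page, capacity):
    """Count how many times steps (4) to (7) run for one table.

    dead_per_page: number of dead tuples found on each heap page,
                   in scan order.
    capacity:      how many dead tuple entries fit in memory
                   (hypothetical stand-in for maintenance_work_mem).
    """
    cycles = 0
    collected = 0  # entries currently held in the dead tuple list
    for dead in dead_per_page:
        # If this page's dead tuples would not fit, flush the list first:
        # steps (4) to (7) run, then scanning resumes at step (3).
        if collected > 0 and collected + dead > capacity:
            cycles += 1
            collected = 0
        collected += dead
    if collected > 0:
        cycles += 1  # final flush after the last page has been scanned
    return cycles


# Four pages with three dead tuples each.
print(vacuum_passes([3, 3, 3, 3], capacity=6))    # list fills up midway
print(vacuum_passes([3, 3, 3, 3], capacity=100))  # everything fits at once
```

In this model a smaller capacity means more passes over the indexes, which is why a larger maintenance_work_mem can make vacuuming tables with many dead tuples cheaper.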

6.1.2. Second Block

This block removes dead tuples and updates both the FSM and VM on a page-by-page basis. Figure 6.1 shows an example:

Figure 6.1. Removing a dead tuple.

Assume that the table contains three pages. We focus on the 0th page (i.e., the first page). This page has three tuples, and Tuple_2 is a dead tuple (Figure 6.1(1)). In this case, PostgreSQL removes Tuple_2 and reorders the remaining tuples to repair fragmentation. Then, it updates both the FSM and VM of this page (Figure 6.1(2)). PostgreSQL continues this process until the last page.

Note that unnecessary line pointers are not removed; they are kept for future reuse. This is because removing line pointers would require updating all index tuples of the associated indexes.
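The page-level operation can be sketched as follows. This is a simplified model, not PostgreSQL's page code: the function vacuum_page and its data layout are hypothetical, tuple sizes are made up, and details such as alignment and the page header are ignored. It only shows that live tuples are repacked while the slot of a dead tuple is kept and marked unused.

```python
PAGE_SIZE = 8192  # default PostgreSQL block size in bytes


def vacuum_page(tuples, dead):
    """Remove dead tuples from one page and repack the live ones.

    tuples: list of (name, size_in_bytes), one entry per line pointer,
            in line pointer order.
    dead:   set of names of dead tuples.
    Returns a list with one entry per original line pointer: the
    (name, new_offset) of a live tuple, or None for an unused pointer.
    """
    pointers = []
    end = PAGE_SIZE  # in this model, tuples are stacked from the page end
    for name, size in tuples:
        if name in dead:
            pointers.append(None)  # the slot is kept, only marked unused
        else:
            end -= size  # place the tuple directly below the previous one
            pointers.append((name, end))
    return pointers


page = [("Tuple_1", 100), ("Tuple_2", 50), ("Tuple_3", 200)]
print(vacuum_page(page, dead={"Tuple_2"}))
```

Because the positions in the returned list do not shift, an index entry that refers to the third line pointer still finds Tuple_3 after the page has been compacted.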

6.1.3. Third Block

The third block performs the cleanup after the deletion of index tuples, and also updates the statistics and system catalogs related to vacuum processing for each target table.

Moreover, if the last page has no tuples, it is truncated from the table file.
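The truncation rule can be illustrated with a short sketch. The function truncate_empty_tail and the representation of a table as a list of per-page tuple counts are hypothetical. As an assumption beyond the single-page wording above, the check is repeated so that a run of empty pages at the end of the file is dropped; empty pages in the middle of the table are left in place.

```python
def truncate_empty_tail(pages):
    """Drop empty pages from the end of a table.

    pages: list holding the number of tuples on each page,
           in physical order.
    Returns a new list without the trailing empty pages.
    """
    end = len(pages)
    # Walk backwards while the current last page holds no tuples.
    while end > 0 and pages[end - 1] == 0:
        end -= 1
    return pages[:end]


# Page 1 is empty but not at the end, so it stays.
print(truncate_empty_tail([2, 0, 1, 0, 0]))
```

Only the tail can be given back to the operating system this way, since removing a page from the middle of the file would change the block numbers of every page after it.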

6.1.4. Post-processing

When vacuum processing is complete, PostgreSQL updates all the statistics and system catalogs related to vacuum processing. It also removes unnecessary parts of the clog if possible (Section 6.4).

Ring Buffer

Vacuum processing uses a ring buffer, described in Section 8.5. Therefore, processed pages are not cached in the shared buffers.