10.5. Incremental Backup
In normal operation, backups need to be taken regularly. However, multiple full backups consume huge amounts of storage space.
To address this issue, PostgreSQL introduced incremental backups in version 17. An incremental backup saves only the portions of files that have changed since the preceding backup.
The following subsections give an overview of incremental backups, describe how to take them with pg_basebackup, and explain the data format of incremental backup files.
10.5.1. Incremental Backup Overview
10.5.1.1. Taking Incremental Backup
An incremental backup is based on a full backup and on the WAL summary files collected by the WAL Summarizer, described in Section 9.6.2. We therefore begin the explanation with taking a full backup.
- Taking the full backup:
  After issuing do_pg_backup_start(), a full backup is taken, containing all relation files. We assume the REDO point of the full backup is $REDO_{full}$.
- From full backup to incremental backup:
  The WAL Summarizer process tracks changes to all database blocks and writes these modifications to WAL summary files.
- Taking the incremental backup:
  We assume the REDO point of this incremental backup is $REDO_{inc01}$. Using the summary files generated between $REDO_{full}$ and $REDO_{inc01}$, the incremental backup files, which contain only the changed blocks, are backed up instead of the entire relation files.
Two important points to remember:
- In incremental backups, relation and visibility map files are backed up as incremental backup files, while free-space map files are always entirely backed up. This is because, as mentioned in Section 9.6.2, free-space map forks are not properly tracked by the WAL Summarizer.
- If relations are created after taking the preceding backup, the entire relation files are backed up.
Incremental backup files are named using the following convention:
INCREMENTAL.{oid}
INCREMENTAL.{oid}_vm
For instance, the incremental backup file for Table t1 (OID = 16551) is named ‘INCREMENTAL.16551’.
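As a minimal sketch, the two rules and the naming convention above can be expressed as a small Python function. The function name, the fork labels, and the `existed_in_prior_backup` flag are hypothetical and chosen only for illustration; the `_fsm` suffix follows PostgreSQL's usual fork naming, and the sketch ignores any other conditions the server may apply.

```python
def backup_file_name(oid: int, fork: str, existed_in_prior_backup: bool) -> str:
    """Return the file name expected in an incremental backup (illustrative only).

    fork is one of "main", "vm", or "fsm" (hypothetical labels).
    """
    suffix = {"main": "", "vm": "_vm", "fsm": "_fsm"}[fork]
    plain_name = f"{oid}{suffix}"

    # Free-space map files are always backed up in their entirety.
    if fork == "fsm":
        return plain_name

    # Relations created after the preceding backup are backed up in full.
    if not existed_in_prior_backup:
        return plain_name

    # Otherwise only the changed blocks are stored, in an INCREMENTAL file.
    return f"INCREMENTAL.{plain_name}"
```

For Table t1, `backup_file_name(16551, "main", True)` yields `INCREMENTAL.16551`, while `backup_file_name(16551, "fsm", True)` yields `16551_fsm`.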
Here’s an example of the full and incremental backup files for t1:
$ ls -la -h backup/full/base/16425/ | grep "16551$"
-rw------- 1 postgres postgres 32K Oct 17 11:11 16551
$ ls -la -h backup/inc01/base/16425/ | grep "16551$"
-rw------- 1 postgres postgres 24K Oct 17 11:20 INCREMENTAL.16551
The INCREMENTAL file format is described in Section 10.5.3.
10.5.1.2. Reconstructing Backup
To reconstruct a base backup from incremental backups, PostgreSQL provides the pg_combinebackup utility.
Assuming that a full backup is located at ‘/usr/local/pgsql/backup/full’ and an incremental backup at ‘/usr/local/pgsql/backup/inc01’, we can reconstruct the base backup under ‘/usr/local/pgsql/reconstructed’ by issuing the following command:
$ pg_combinebackup -d -o /usr/local/pgsql/reconstructed \
> /usr/local/pgsql/backup/full/ \
> /usr/local/pgsql/backup/inc01/
Figure 10.8 illustrates how the pg_combinebackup utility reconstructs a base backup from a full backup and an incremental backup. In essence, pg_combinebackup applies the file changes stored in the incremental backup file to the original relation file.
For instance, assuming the relation file for Table t1 (OID=16551) is stored in the full backup and the incremental backup file named ‘INCREMENTAL.16551’ is in the incremental backup, pg_combinebackup overwrites the changed blocks (e.g., the 0th and 3rd blocks) stored in ‘INCREMENTAL.16551’ onto the relation file named ‘16551’.
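The overwrite step can be sketched in a few lines of Python. This is a simplified illustration and not how pg_combinebackup is implemented: the function name and the `changed` mapping (block number to 8 KB page) are hypothetical, and truncation, chains of several incremental backups, and manifest handling are all ignored.

```python
BLCKSZ = 8192  # block size in bytes, as used throughout this section


def apply_incremental(full: bytes, changed: dict) -> bytes:
    """Overwrite the changed blocks onto a copy of the full relation file.

    full    -- contents of the relation file from the full backup
    changed -- hypothetical mapping {block number: 8 KB page} taken from
               the INCREMENTAL file
    """
    # The result must be long enough to hold the highest changed block.
    nblocks = max(len(full) // BLCKSZ, max(changed, default=-1) + 1)
    out = bytearray(full.ljust(nblocks * BLCKSZ, b"\x00"))

    for blkno, page in changed.items():
        if len(page) != BLCKSZ:
            raise ValueError(f"block {blkno} is not {BLCKSZ} bytes long")
        out[blkno * BLCKSZ:(blkno + 1) * BLCKSZ] = page

    return bytes(out)
```

With a four-block relation file and `changed` holding blocks 0 and 3, the result keeps blocks 1 and 2 from the full backup and takes blocks 0 and 3 from the incremental backup.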
10.5.2. How to Take Incremental Backups
To take an incremental backup, use the ‘--incremental’ option with the path to the preceding backup’s manifest file.
For example, to take an incremental backup from host 192.168.1.10 to the local server at ‘/usr/local/pgsql/backup/inc01’, based on the full backup located at ‘/usr/local/pgsql/backup/full’, we can execute the following command:
$ pg_basebackup -h 192.168.1.10 -p 5432 \
> --incremental /usr/local/pgsql/backup/full/backup_manifest \
> -D /usr/local/pgsql/backup/inc01 -X stream -P -v
Fig. 10.9 shows the sequence in which pg_basebackup takes an incremental backup:
1. Connection request:
2. Create walsender process:
3. Send backup manifest file:
   pg_basebackup sends the backup_manifest of the preceding backup.
4. Invoke FinalizeIncrementalManifest():
   The walsender process retrieves only two values, TimeLine and Start-LSN, from the backup manifest file.
5. Base backup request:
6. Do do_pg_backup_start():
7. Invoke PrepareForIncrementalBackup():
   Using the summary files from the Start-LSN retrieved in step 4 to the latest REDO point generated in step 6, the walsender process creates a list of changes to all relation blocks.
8. Send INCREMENTAL files:
   Using the change list created in step 7, the walsender process sends the incremental backup files, which are composed of a header block and the changed blocks, instead of the entire relation and visibility-map files.
9. Do do_pg_backup_stop():
10. Send WAL files:
11. Send backup_manifest file:
Note that, in step 8, the walsender sends the entire relation file if the relation is created after taking the preceding backup.
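The idea behind step 7 can be sketched as a union over the summaries that cover the LSN range of interest. The representation below is hypothetical: each summary is modeled as a tuple `(start_lsn, end_lsn, {relation: set of modified blocks})`, which is not the on-disk format of WAL summary files, and the function name is chosen for illustration.

```python
from collections import defaultdict


def build_change_list(summaries, start_lsn, redo_lsn):
    """Merge the modified-block sets of all summaries within [start_lsn, redo_lsn].

    summaries -- iterable of (s_start, s_end, {relation: set of block numbers}),
                 a hypothetical in-memory model of WAL summary files
    """
    changes = defaultdict(set)
    for s_start, s_end, modified in summaries:
        # Use only summaries that lie entirely inside the requested range.
        if s_start >= start_lsn and s_end <= redo_lsn:
            for relation, blocks in modified.items():
                changes[relation] |= set(blocks)
    # Return sorted block numbers per relation.
    return {relation: sorted(blocks) for relation, blocks in changes.items()}
```

A relation that does not appear in the result has no tracked changes in that range.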
When taking the next incremental backup, we can execute the following command, setting the ‘--incremental’ option to the previous incremental backup manifest ‘/usr/local/pgsql/backup/inc01/backup_manifest’:
$ pg_basebackup -h 192.168.1.10 -p 5432 \
> --incremental /usr/local/pgsql/backup/inc01/backup_manifest \
> -D /usr/local/pgsql/backup/inc02 -X stream -P -v
10.5.3. Format of INCREMENTAL File
An INCREMENTAL file is typically composed of a header block and modified blocks, each block being 8 KB in size.
A header block contains four types of information:
- Magic Number: The header begins with “0xd3ae1f0d” to identify it as an incremental backup file.
- num_incremental_block: The number of modified blocks.
- truncation_block_length: In most cases, this is the total number of blocks in the table and VM that correspond to this file. However, depending on the conditions, it may be larger than this value due to internal processing. See the source code for details.
- List of modified block numbers: A list of modified block numbers. For instance, if the 0th and 3rd blocks are modified, “0” and “3” are stored.
A header block is typically 8 KB in size, or a multiple of 8 KB. Normally, a header block is padded with zeros to reach an 8 KB alignment after the list of modified block numbers. If the list of modified block numbers exceeds 8 KB, additional header blocks are added.
If a table is truncated, dropped, or unchanged, its header block becomes a 12-byte block containing three fields: a magic number, a number of incremental blocks set to 0, and a truncation block length of 1. The INCREMENTAL file itself also becomes a 12-byte file containing only the header.
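The layout described above can be read with a short Python sketch. It assumes that each header field and each block number is a 4-byte unsigned integer stored in little-endian order; the byte order is an assumption (it may depend on the server platform), and the function name is hypothetical. This is an illustration of the format, not a replacement for pg_combinebackup.

```python
import struct

BLCKSZ = 8192                   # block size in bytes
INCREMENTAL_MAGIC = 0xD3AE1F0D  # magic number at the start of the header


def parse_incremental(data: bytes):
    """Return (truncation_block_length, {block number: page}) for an INCREMENTAL file."""
    # Three 4-byte fields: magic number, number of modified blocks,
    # truncation block length (little-endian is assumed here).
    magic, num_blocks, truncation_block_length = struct.unpack_from("<III", data, 0)
    if magic != INCREMENTAL_MAGIC:
        raise ValueError("not an INCREMENTAL file")

    # The list of modified block numbers follows the three fields.
    block_numbers = []
    if num_blocks > 0:
        block_numbers = list(struct.unpack_from(f"<{num_blocks}I", data, 12))

    # The header is padded to a multiple of 8 KB unless it has no blocks,
    # in which case it stays 12 bytes long.
    header_size = 12 + 4 * num_blocks
    if num_blocks > 0 and header_size % BLCKSZ != 0:
        header_size += BLCKSZ - header_size % BLCKSZ

    # The modified blocks follow the header in the same order as the list.
    blocks = {}
    for i, blkno in enumerate(block_numbers):
        start = header_size + i * BLCKSZ
        blocks[blkno] = data[start:start + BLCKSZ]
    return truncation_block_length, blocks
```

For a 12-byte file with zero modified blocks, the function returns the truncation block length and an empty mapping.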
Fig. 10.11 shows four examples of the INCREMENTAL files that are explained in Section 9.6.2.2: