6.6. Full VACUUM
Although Concurrent VACUUM is essential for operation, it is not sufficient. For example, it cannot reduce the size of a table even if many dead tuples are removed.
Figure 6.8 shows an extreme example. Suppose that a table consists of three pages, and each page contains six tuples. The following DELETE command is executed to remove tuples, and the VACUUM command is executed to remove dead tuples:
testdb=# DELETE FROM tbl WHERE id % 6 != 0;
testdb=# VACUUM tbl;
The dead tuples are removed; but the table size is not reduced. This is both a waste of disk space and has a negative impact on database performance. For instance, in the above example, when three tuples in the table are read, three pages must be loaded from disk.
To deal with this situation, PostgreSQL provides the Full VACUUM mode. Figure 6.9 shows an outline of this mode.
-
[1] Create new table file: Figure 6.9(1)
When the VACUUM FULL command is executed for a table, PostgreSQL first acquires the AccessExclusiveLock lock for the table and creates a new table file whose size is 8 KB. The AccessExclusiveLock lock prevents other users from accessing the table. -
[2] Copy live tuples to the new table: Figure 6.9(2)
PostgreSQL copies only live tuples within the old table file to the new table. -
[3] Remove the old file, rebuild indexes, and update the statistics, FSM, and VM: Figure 6.9(3)
After copying all live tuples, PostgreSQL removes the old file, rebuilds all associated table indexes, updates both the FSM and VM of this table, and updates associated statistics and system catalogs.
The pseudocode of the Full VACUUM is shown in below:
(1) FOR each table
(2) Acquire AccessExclusiveLock lock for the table
(3) Create a new table file
(4) FOR each live tuple in the old table
(5) Copy the live tuple to the new table file
(6) Freeze the tuple IF necessary
END FOR
(7) Remove the old table file
(8) Rebuild all indexes
(9) Update FSM and VM
(10) Update statistics
Release AccessExclusiveLock lock
END FOR
(11) Remove unnecessary clog files and pages if possible
Two points should be considered when using the VACUUM FULL command:
-
Nobody can access(read/write) the table when Full VACUUM is processing.
-
At most twice the disk space of the table is used temporarily; therefore, it is necessary to check the remaining disk capacity when a huge table is processed.
There is no universal answer to the question of when to execute VACUUM FULL. However, the pg_freespacemap extension can provide useful insights.
The following query calculates the average free space ratio for a specific table:
testdb=# CREATE EXTENSION pg_freespacemap;
CREATE EXTENSION
testdb=# SELECT count(*) as "number of pages",
pg_size_pretty(cast(avg(avail) as bigint)) as "Av. freespace size",
round(100 * avg(avail)/8192 ,2) as "Av. freespace ratio"
FROM pg_freespace('accounts');
number of pages | Av. freespace size | Av. freespace ratio
-----------------+--------------------+---------------------
1640 | 99 bytes | 1.21
(1 row)
The result indicates minimal free space (1.21%).
If most tuples are deleted and the VACUUM command is executed, many pages may become empty (e.g., 86.97% free space), but the total number of pages remains unchanged, indicating that the table file has not been compacted:
testdb=# DELETE FROM accounts WHERE aid %10 != 0 OR aid < 100;
DELETE 90009
testdb=# VACUUM accounts;
VACUUM
testdb=# SELECT count(*) as "number of pages",
pg_size_pretty(cast(avg(avail) as bigint)) as "Av. freespace size",
round(100 * avg(avail)/8192 ,2) as "Av. freespace ratio"
FROM pg_freespace('accounts');
number of pages | Av. freespace size | Av. freespace ratio
-----------------+--------------------+---------------------
1640 | 7124 bytes | 86.97
(1 row)
To inspect the free space ratio for each page of a table, the following query can be used:
testdb=# SELECT *, round(100 * avail/8192 ,2) as "freespace ratio"
FROM pg_freespace('accounts');
blkno | avail | freespace ratio
-------+-------+-----------------
0 | 7904 | 96.00
1 | 7520 | 91.00
2 | 7136 | 87.00
3 | 7136 | 87.00
4 | 7136 | 87.00
5 | 7136 | 87.00
....
In such a situation, running VACUUM FULL compacts the table file:
testdb=# VACUUM FULL accounts;
VACUUM
testdb=# SELECT count(*) as "number of blocks",
pg_size_pretty(cast(avg(avail) as bigint)) as "Av. freespace size",
round(100 * avg(avail)/8192 ,2) as "Av. freespace ratio"
FROM pg_freespace('accounts');
number of pages | Av. freespace size | Av. freespace ratio
-----------------+--------------------+---------------------
164 | 0 bytes | 0.00
(1 row)
By compacting the table, VACUUM FULL reduces the physical size of the table file, making it more efficient.