As a PostgreSQL database administrator who manages backups on a daily basis, I find that one of the best new features of Barman 2.10 is the transparent management of .partial WAL files during recovery. Most likely, you feel the same if you have been using Barman with WAL streaming and have been asked to perform a full recovery of an instance.
The context: RPO=0 backups and recovery
The concept of RPO (Recovery Point Objective) is crucial in the process of Disaster Recovery. It represents the maximum amount of data you can afford to lose.
When applied to a scenario with Barman and PostgreSQL databases, RPO essentially depends on the way we ship WAL files from the database to Barman: via archive_command or via streaming (asynchronous or synchronous).
The archive_command is the traditional method to archive WAL files. It sends a WAL file to the Barman server once that WAL segment is completed or when the archive_timeout is reached (remember to set archive_timeout to cap the RPO of your database instance!).
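As a minimal sketch of that reminder, the following shell helper sets archive_timeout on the primary with ALTER SYSTEM and then reloads the configuration. It assumes psql is on the PATH and connects as a superuser by default; the function name set_archive_timeout and the 60-second value mentioned below are illustrative, not a recommendation.

```shell
# Hypothetical helper: cap the time a WAL segment can stay unarchived.
# Assumption: psql connects to the primary as a superuser with default settings.
set_archive_timeout() {
  seconds=$1
  # ALTER SYSTEM writes the setting to postgresql.auto.conf;
  # pg_reload_conf() makes the running server pick it up.
  psql -X \
    -c "ALTER SYSTEM SET archive_timeout = '${seconds}s'" \
    -c "SELECT pg_reload_conf()"
}
```

Calling set_archive_timeout 60 would ask PostgreSQL to switch to a new WAL segment at least once a minute, so that archive_command gets a chance to ship it.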
The streaming method relies on pg_receivewal, which exploits the native streaming replication protocol and continuously receives transaction logs from the PostgreSQL server.
pg_receivewal writes transactional information to a file with the .partial suffix after the WAL segment name and places it inside the streaming directory of the server in Barman. When the segment is complete, it removes the .partial suffix (handing the file over to the Barman archiver process) and continues with the next WAL file.
Ideally, the goal of each company is to keep the RPO as close to zero as possible. For this reason, in this article, we focus on this use case: async/sync WAL streaming from the PostgreSQL primary.
While Barman has been able to fulfill the RPO=0 requirement in terms of backup since version 2.0, the recovery of the .partial file was not transparently managed by the recover command and had to be performed manually by users: the latest partial file from the server’s streaming directory had to be copied by hand to the destination for recovery, making sure that the .partial suffix was removed.
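To make that manual step concrete, here is a rough shell sketch of what it amounted to. The function name copy_latest_partial and both directory arguments are hypothetical, and the sketch assumes the destination WAL directory is reachable as a local path (in practice you might have needed scp or rsync to reach the recovery host).

```shell
# Sketch of the manual, pre-2.10 procedure (hypothetical helper).
# Copies the newest *.partial file from the streaming directory into the
# destination WAL directory, dropping the .partial suffix on the way.
copy_latest_partial() {
  streaming_dir=$1   # the server's streaming directory on the Barman host
  dest_wal_dir=$2    # WAL directory of the instance being recovered (assumed local)

  # Globs expand in sorted order and WAL names are fixed-width hex,
  # so the last match is the most recent segment.
  latest=
  for f in "$streaming_dir"/*.partial; do
    if [ -e "$f" ]; then
      latest=$f
    fi
  done

  if [ -z "$latest" ]; then
    echo "no .partial file found in $streaming_dir" >&2
    return 1
  fi

  # basename with a suffix argument strips ".partial" from the file name.
  segment=$(basename "$latest" .partial)
  cp "$latest" "$dest_wal_dir/$segment" && echo "$segment"
}
```

On success the function prints the name of the segment it wrote, which makes it easy to double-check that the suffix is really gone.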
Recovery of partial WAL files in Barman 2.10
The new Barman 2.10 enhances the user’s experience and fixes the issue in the recovery process by automating the management of the partial file.
This is possible through a new feature that automatically recovers the .partial WAL file, thanks to the addition of the --partial/-P option to the get-wal command.
The get-wal command is responsible for retrieving WAL files from Barman, and it has allowed the Barman server to act as a WAL hub for your servers since version 1.5.0. With the --partial/-P option, get-wal can also retrieve partial WAL files by searching the streaming directory, in case a WAL file has already been shipped to Barman but not yet archived.
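For illustration, this is roughly what a direct invocation on the Barman host could look like. The wrapper name fetch_wal_with_partial and the output path are hypothetical; the server name pg01 and the segment name used in the usage note come from the example later in this article.

```shell
# Hypothetical wrapper around "barman get-wal", to be run on the Barman host.
fetch_wal_with_partial() {
  server=$1    # Barman server name, e.g. pg01
  segment=$2   # WAL segment name, without the .partial suffix
  out=$3       # where to store the retrieved content (illustrative)

  # get-wal writes the WAL content to standard output. With -P it can also
  # return the .partial file from the streaming directory when the segment
  # has not been archived yet.
  barman get-wal -P "$server" "$segment" > "$out"
}
```

For example, fetch_wal_with_partial pg01 000000010000000000000014 /tmp/000000010000000000000014 would store whatever Barman holds for that segment, whether archived or partial.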
This is particularly useful in case of a sudden and unrecoverable failure of the master PostgreSQL server. The .partial file that has been streamed to Barman contains very important information that the standard archiver (through PostgreSQL’s archive_command) has not delivered to Barman. Now, when performing a recovery, you can run the recover command with get-wal enabled and without --standby-mode. In this case, Barman will automatically add the -P option to barman-wal-restore (which will then relay it to the remote get-wal command) in the restore_command recovery option.
NOTE: barman-wal-restore is part of the barman-cli package.
Let’s execute a recovery with Barman 2.10:
barman recover \
--remote-ssh-command 'ssh postgres@pg02' \
pg01 latest /opt/postgres/data/
The recovery produces the following recovery.conf file (PostgreSQL 11 or older) or postgresql.auto.conf file (PostgreSQL 12 and later):
# The barman-wal-restore command is provided in the barman-cli package
restore_command = 'barman-wal-restore -P -U barman backup pg01 %f %p'
As you can see, the barman-wal-restore command contains the -P option, which will propagate the request to Barman’s get-wal command, returning also the content of the latest partial file:
2019-11-17 15:30:56,826 [19952] barman.server INFO: Sending WAL '000000010000000000000014.partial' for server 'pg01' to standard output (SSH host: XXX.XXX.XX.X)
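If you want to double-check the generated configuration before starting PostgreSQL, a quick grep is enough. The helper below is only a sketch: the function name restore_command_has_partial is hypothetical, and the file to pass depends on your PostgreSQL version as described above.

```shell
# Hypothetical check: succeeds when the restore_command in the given file
# passes -P to barman-wal-restore.
restore_command_has_partial() {
  conf_file=$1
  # Look for a restore_command line that calls barman-wal-restore with a
  # standalone -P flag (followed by a space or the closing quote).
  grep -Eq "^[[:space:]]*restore_command[[:space:]]*=.*barman-wal-restore.* -P( |')" "$conf_file"
}
```

The function returns a zero exit status when the flag is present, so it can be used directly in an if statement or a pre-start script.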
Conclusions
As usual with any new release of Barman, we recommend that everyone update their systems to the latest version. Barman 2.10 contains several bug fixes and enhancements, including:
- verification of the PostgreSQL instance’s system identifier in the check command
- a new server/global option called create_slot, which controls automated creation of the replication slot
- the barman-cloud-wal-archive script to directly ship WAL files to AWS S3 for permanent storage in the cloud
- the barman-cloud-backup script to perform full base backups and to ship them directly to AWS S3 for permanent storage in the cloud