Barman 2.10 - Recovery of partial WAL files

December 11, 2019

As a PostgreSQL database administrator who manages backups on a daily basis, I find that one of the new features of Barman 2.10 I like the most is the transparent management of .partial WAL files during recovery. You most likely feel the same if you have been using Barman with WAL streaming and have been asked to perform a full recovery of an instance.

The context: RPO=0 backups and recovery

The concept of RPO (Recovery Point Objective) is crucial in the process of Disaster Recovery. It represents the maximum amount of data you can afford to lose.

When applied to a scenario with Barman and PostgreSQL, RPO essentially depends on the way we ship WAL files from the database to Barman: via archive_command or via streaming (asynchronous or synchronous).

The archive_command is the traditional method to archive WAL files.

It sends a WAL file to the Barman server once that WAL segment is completed or when the archive_timeout is reached (remember to set archive_timeout to cap the RPO of your database instance!).

The streaming method relies on pg_receivewal, which exploits the native streaming replication protocol and continuously receives transaction logs from the PostgreSQL server.

pg_receivewal writes transactional information to a file named after the WAL segment with a .partial suffix, and places it inside the streaming directory of the server in Barman. When the segment is complete, it removes the .partial suffix (handing the file over to Barman's archiver process) and continues with the next WAL file.

Ideally, the goal of each company is to keep the RPO as close to zero as possible. For this reason, in this article, we focus on this use case: async/sync WAL streaming from the PostgreSQL primary.

Barman has been able to fulfill the RPO=0 requirement in terms of backup since version 2.0. However, the recover command did not transparently manage the .partial file, so users had to handle it manually: the latest partial file in the server’s streaming directory had to be copied to the recovery destination, making sure that the .partial suffix was removed.
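The manual step described above can be sketched roughly as follows. This is only an illustration under stated assumptions: the function name restore_latest_partial is hypothetical, and both directories (the streaming directory on the Barman side and the destination for recovery) are passed in as arguments because their actual locations depend on your setup.

```shell
# Sketch of the manual step needed before Barman 2.10.
# restore_latest_partial is a hypothetical helper, not a Barman command.
# $1: directory holding the streamed WAL files (contains the *.partial file)
# $2: destination directory for the recovery
restore_latest_partial() {
    streaming_dir="$1"
    dest_dir="$2"

    # WAL file names are fixed-width hexadecimal, so sorting them
    # lexically also sorts them in WAL order; keep the newest one.
    latest="$(find "$streaming_dir" -maxdepth 1 -type f -name '*.partial' \
        | LC_ALL=C sort | tail -n 1)"
    if [ -z "$latest" ]; then
        echo "no .partial file found in $streaming_dir" >&2
        return 1
    fi

    # Copy the file to the destination, dropping the .partial suffix.
    name="$(basename "$latest" .partial)"
    cp -- "$latest" "$dest_dir/$name"
    echo "$dest_dir/$name"
}
```

Called as restore_latest_partial STREAMING_DIR DEST_DIR, the function prints the path of the copied file, which makes it easy to verify that the suffix was stripped.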

Recovery of partial WAL files in Barman 2.10

Barman 2.10 improves the user experience and fixes this issue in the recovery process by automating the management of the partial file.

This is possible thanks to the new --partial/-P option of the get-wal command, which allows the .partial WAL file to be recovered automatically.

The get-wal command is responsible for retrieving WAL files from Barman, and has turned the Barman server into a WAL hub for your servers since version 1.5.0. With the --partial/-P option, get-wal can also retrieve partial WAL files by searching the streaming directory, in case a WAL file has already been shipped to Barman but not yet archived.
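As a rough illustration of the option described above, the following sketch wraps a get-wal call that includes -P. The wrapper name fetch_wal and the DRY_RUN switch are hypothetical conveniences added here; the server name and WAL segment are the ones used in this article, and the output is redirected because get-wal writes the WAL content to standard output.

```shell
# fetch_wal is a hypothetical wrapper around `barman get-wal -P`.
# $1: Barman server name, $2: WAL segment name, $3: output file
# With DRY_RUN=1 the command is only printed, not executed.
fetch_wal() {
    server="$1"
    wal="$2"
    dest="$3"
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "barman get-wal -P $server $wal > $dest"
    else
        # -P lets get-wal fall back to the .partial file in the
        # streaming directory when the segment is not archived yet.
        barman get-wal -P "$server" "$wal" > "$dest"
    fi
}
```

For instance, running DRY_RUN=1 fetch_wal pg01 000000010000000000000014 /tmp/000000010000000000000014 on the Barman server shows the command that would be executed without touching any file.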

This is particularly useful after a sudden and unrecoverable failure of the primary PostgreSQL server. The .partial file that has been streamed to Barman contains the most recent WAL records, which the standard archiver (through PostgreSQL’s archive_command) has not delivered to Barman. When performing a recovery, you can now run the recover command with get-wal enabled and without --standby-mode. In this case, Barman automatically adds the -P option to barman-wal-restore (which then relays it to the remote get-wal command) in the restore_command recovery option.

NOTE: barman-wal-restore is part of the barman-cli package.

Let’s execute a recovery with Barman 2.10:

barman recover \
--remote-ssh-command 'ssh postgres@pg02' \
pg01 latest /opt/postgres/data/

The recovery writes the following restore_command to recovery.conf (PostgreSQL 11 or older) or postgresql.auto.conf (PostgreSQL 12 and later):

# The barman-wal-restore command is provided in the barman-cli package

restore_command = 'barman-wal-restore -P -U barman backup pg01 %f %p'

As you can see, the barman-wal-restore command includes the -P option, which propagates the request to Barman’s get-wal command. As a result, Barman also returns the content of the latest partial file, as its log shows:

2019-11-17 15:30:56,826 [19952] barman.server INFO: Sending WAL '000000010000000000000014.partial' for server 'pg01' to standard output (SSH host: XXX.XXX.XX.X)
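If you want to double-check a restored instance, a small helper along these lines can confirm that the generated restore_command asks for partial WAL files. It is only a sketch: the function name has_partial_restore is hypothetical, and it simply looks for barman-wal-restore followed by -P in the two files mentioned above.

```shell
# has_partial_restore is a hypothetical helper, not part of Barman.
# $1: the data directory that was passed to `barman recover`
# Returns 0 if restore_command calls barman-wal-restore with -P.
has_partial_restore() {
    data_dir="$1"
    for conf in "$data_dir/recovery.conf" "$data_dir/postgresql.auto.conf"; do
        # Only one of the two files is expected, depending on the
        # PostgreSQL version; skip the one that does not exist.
        [ -f "$conf" ] || continue
        if grep -E '^[[:space:]]*restore_command[[:space:]]*=.*barman-wal-restore[^#]*[[:space:]]-P[[:space:]]' "$conf" >/dev/null; then
            return 0
        fi
    done
    return 1
}
```

With the recovery shown earlier, has_partial_restore /opt/postgres/data/ would be run on the host where the instance was restored.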

Conclusions

As usual with any new release of Barman, we recommend that everyone updates their systems to the latest version. Version 2.10 contains several bug fixes and enhancements, including:

  • verification of the PostgreSQL instance’s system identifier in the check command
  • a new server/global option called create_slot which controls automated creation of the replication slot
  • barman-cloud-wal-archive script to directly ship WAL files to AWS S3 for permanent storage in the cloud
  • barman-cloud-backup script to perform full base backups and to ship them directly to AWS S3 for permanent storage in the cloud.

 
