The EDB Postgres Backup and Recovery Tool development team is super busy developing features for the next major release and I wanted to give everyone a sneak peek into the exciting new capabilities to be delivered as part of BART 2.2.
The last major release of BART was published in June 2017 and we are already well underway in developing exciting new features for BART 2.2. The beta is expected by March 2018.
The highlight of BART 2.0 was block level incremental backup, a groundbreaking feature because no other Postgres-based backup and recovery solution provides it. There are solutions that provide file level incremental backup, but they come nowhere near the performance and reliability of block level incremental backup.
The features under development for BART 2.2 are also groundbreaking in their own right and I will provide a short summary of each feature in this blog. More details will follow once the release is out as, after all, this is only a sneak peek. ;-)
1. Parallel full backup
All previous releases of BART used pg_basebackup as the underlying tool for performing a full backup of the database cluster. pg_basebackup is a great tool, and it has helped immensely in delivering stable releases of BART without many invasive changes to support full backup functionality. However, it also has some limitations, the largest being that it doesn't offer any parallelism while taking a backup, which makes it slower than other Postgres-based backup solutions that offer parallel backups.
BART 2.2 will no longer use pg_basebackup for taking a full backup. Instead, it will use its own custom approach to perform full backup. The approach that BART 2.2 uses for implementing full backup is analogous to the approach used by BART for implementing block level incremental backup. The BART incremental backup approach uses a WAL scanner to identify the changed blocks and then fetches those changed blocks from the database server to form the incremental backup. In a nutshell, the full backup approach implemented by BART 2.2 will use the incremental backup approach but instead of getting the changed blocks, it will copy all the data files from the database server.
BART 2.2 will allow the user to specify "thread_count" in the bart.cfg file. BART will launch that number of worker threads to copy the files for a full backup. The approach is looking very promising so far, and benchmarks run in the lab are showing very positive results.
"In the best case scenario, BART 2.2 with 12 parallel threads took around 35 minutes to back up a 1TB database. That's more than 220% faster than pg_basebackup, which took just over 114 minutes in the same environment."
The above benchmark was performed on a dedicated AWS instance, with the BARTHOST and the database server running on different machines. Both the BART 2.2 and pg_basebackup tests were run in the same environment to get an apples-to-apples comparison. In the case of BART 2.2, parallel processing with 12 threads produced a significant performance gain.
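To make the thread_count idea concrete, here is a minimal Python sketch of copying a data directory with a fixed pool of worker threads. It is not BART's implementation: the function name parallel_copy and the use of a local directory copy in place of fetching files from the database server are assumptions made purely for illustration.

```python
import shutil
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path


def parallel_copy(src_dir: Path, dst_dir: Path, thread_count: int) -> int:
    """Copy every file under src_dir into dst_dir using thread_count workers.

    Hypothetical illustration only: a real backup tool would fetch files
    from the database server rather than from a local directory.
    """
    files = [p for p in src_dir.rglob("*") if p.is_file()]

    def copy_one(src: Path) -> None:
        dst = dst_dir / src.relative_to(src_dir)
        dst.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(src, dst)

    with ThreadPoolExecutor(max_workers=thread_count) as pool:
        # list() drains the iterator so an exception in any worker surfaces here.
        list(pool.map(copy_one, files))
    return len(files)
```

Calling parallel_copy(Path("/path/to/data"), Path("/path/to/backup"), thread_count=12) would mirror the 12-thread configuration mentioned in the benchmark, although the numbers above say nothing about how this toy version would perform.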
2. Parallel restore for Incremental backup
A lingering area for improvement in the BART 2.0 release was that restoring a BART incremental backup performed pretty much the same as restoring a full backup and replaying the WAL files. The BART 2.2 release will add parallelism to incremental backup restore. This feature requires BART to be running on the restore host, although that requirement was already introduced in the BART 2.0 release for restoring incremental backups. The parallel incremental backup restore feature will allow users to specify the number of workers in the BART restore command by passing either the "--workers value" or the "-w value" switch. BART will spawn the specified number of worker threads to perform the incremental backup restore.
Prior to this feature, we restored an incremental backup by copying all of the .blk and .cbm files to the restore host, then we ran "bart APPLY-INCREMENTAL" on the restore host to splice the .blk files into the relfiles.
With this feature, we don't copy the .blk files to the restore host; instead, we stream them from the BART host to the restore host. For example, if you specify that you want 4 worker processes, we fire up 4 "receiver" processes on the restore host and 4 "streamer" processes on the BART host. The stdout of each streamer is connected to the stdin of a receiver process. The streamer and the receiver both work from the same .cbm file. When the receiver gets to the point where it needs a .blk file, it reads those modified blocks (not the bitmaps, but the actual 8K blocks themselves) from stdin. Since the streamer process is reading the same .cbm file (in the same order), it sends those blocks to stdout in the same sequence.
The advantage to this mechanism is that we never write the .blk files to disk (on the restore host).
So we cut the disk I/O in half (on the restore host). And, we require much less disk space (on the restore host) since we don't have to write the .blk files to disk before we splice the blocks into the relfiles.
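The streamer/receiver pairing described above can be sketched in Python as follows. This is only an illustration of the idea that both sides walk the same change map in the same order, so the stream needs no framing. The ChangeMap type is a hypothetical stand-in for a .cbm file (the real format is not described here), in-memory bytearrays stand in for relfiles, and any binary stream stands in for the stdout-to-stdin connection.

```python
import io
from typing import BinaryIO, Dict, List, Tuple

BLOCK_SIZE = 8192  # the 8K block size mentioned in the text

# Hypothetical stand-in for a .cbm file: an ordered list of
# (relfile name, modified block numbers). Not the real on-disk format.
ChangeMap = List[Tuple[str, List[int]]]


def streamer(change_map: ChangeMap,
             backup_blocks: Dict[Tuple[str, int], bytes],
             out: BinaryIO) -> None:
    """BART-host side: write the modified blocks in change-map order."""
    for relfile, block_numbers in change_map:
        for block_no in block_numbers:
            out.write(backup_blocks[(relfile, block_no)])


def receiver(change_map: ChangeMap,
             relfiles: Dict[str, bytearray],
             inp: BinaryIO) -> None:
    """Restore-host side: read blocks in the same order and splice them in."""
    for relfile, block_numbers in change_map:
        data = relfiles[relfile]
        for block_no in block_numbers:
            block = inp.read(BLOCK_SIZE)
            if len(block) != BLOCK_SIZE:
                raise EOFError("stream ended before all blocks arrived")
            start = block_no * BLOCK_SIZE
            # Grow the relfile if the modified block lies past its current end.
            if len(data) < start + BLOCK_SIZE:
                data.extend(b"\0" * (start + BLOCK_SIZE - len(data)))
            data[start:start + BLOCK_SIZE] = block
```

To try it, write into an io.BytesIO with streamer, rewind it with seek(0), and hand it to receiver. Because the blocks go straight from the stream into the relfile, nothing resembling a .blk file is ever written on the receiving side, which is the point made above.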
3. Parallel Compression
The backup compression provided by BART 2.0 and prior releases was client level compression. The user enabled compression with the -z switch on the backup command and set a compression level from 1 to 9 with the -c switch. BART 2.2 will keep the same interface as BART 2.0 but make the compression operations parallel in order to reduce the overall time for a compressed backup. The thread_count parameter specified in bart.cfg spawns the specified number of processes to run the backup in parallel. The same set of processes will be used for the compression operation in order to provide parallel compressed backups.
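As a rough illustration of reusing one pool of workers for compression, the Python sketch below compresses a list of files with gzip at a level from 1 to 9. The names compress_file and parallel_compress are hypothetical, the choice of gzip and of threads rather than processes is an assumption, and none of it reflects how BART itself implements the feature.

```python
import gzip
import shutil
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path
from typing import List, Sequence


def compress_file(path: Path, level: int) -> Path:
    """Write a gzip-compressed copy of path next to it and return the new path."""
    out = path.with_name(path.name + ".gz")
    with open(path, "rb") as src, gzip.open(out, "wb", compresslevel=level) as dst:
        shutil.copyfileobj(src, dst)
    return out


def parallel_compress(paths: Sequence[Path], level: int, thread_count: int) -> List[Path]:
    """Compress all paths using thread_count workers; level mirrors the 1..9 range."""
    if not 1 <= level <= 9:
        raise ValueError("compression level must be between 1 and 9")
    with ThreadPoolExecutor(max_workers=thread_count) as pool:
        # pool.map preserves input order, so results line up with paths.
        return list(pool.map(lambda p: compress_file(p, level), paths))
```

The trade-off is the usual one: a higher level costs more CPU per file, so spreading files across workers matters more as the level goes up.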
So this is pretty much the sneak peek for BART 2.2 at this point. More on this later, stay tuned....
4. BART 2.3 Sneak Peek
While the features for BART 2.3 are still in the planning phase, I will mention one feature already scheduled for BART 2.3. We are calling it flexible location for now.
The bart.cfg file contains a parameter named "backup_path" which specifies the path where backups are stored. This is a global parameter in the default bart.cfg file. The user can only give a path on the local machine that is running the BARTHOST. The backup_path parameter doesn't support a remote location, which is the reason for the flexible location feature. The goal of the feature is to give users the ability to store their backups in a location other than the BARTHOST.
The flexible location feature will allow the user to specify the remote host and the path on the remote host where backups should be kept. Right now the backup location is global, which means backups of all the servers are stored in the same location. With the flexible location feature in BART 2.3, the user can continue to use the global backup location or specify a backup location for each backup server individually.
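Since the feature is still being planned, the following Python sketch only illustrates the fallback rule described above: a per-server location wins when one is given, otherwise the global backup_path applies. The function name, the dictionary layout, and the "host:/path" notation for a remote location are all hypothetical and do not represent actual bart.cfg syntax.

```python
from typing import Dict, Mapping, Optional


def resolve_backup_locations(global_backup_path: str,
                             per_server: Mapping[str, Optional[str]]) -> Dict[str, str]:
    """Return the effective backup location for each server.

    per_server maps a server name to its own location, or to None when the
    server should fall back to the global backup_path. Hypothetical layout.
    """
    return {name: location or global_backup_path
            for name, location in per_server.items()}


locations = resolve_backup_locations(
    "/opt/backup",
    {"accounting": None, "sales": "backuphost:/mnt/fast/sales"},
)
print(locations["accounting"])  # /opt/backup
print(locations["sales"])       # backuphost:/mnt/fast/sales
```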
All the BART functionality (i.e. Manage, Delete, Restore, etc.) will continue to work with flexible backup locations.
The user will have the flexibility to attach a high speed drive to the remote backup location or have a high speed network connection between the database server and its backup location. Eventually, this feature also steers BART toward supporting other storage devices and other backup vendors that support the XBSA standard. The next major release after BART 2.2 will support other backup vendors and other storage devices.
Ahsan Hadi is Senior Director, Product Development, at EnterpriseDB.