Apr 23, 2020
Backups are often taken based on suggestions and recommendations of the person who we think knows the best. We do not try to understand what situations would have led to those recommendations and how they might be different from what we might face. We need to ensure that we list down the situations we might need to recover from and then discuss the possible solutions available. Every environment has its own challenge—and even if we have multiple database clusters in the same organizations, the backup requirements for those might also vary. So, planning becomes the key to backup and recovery.
Write Ahead Logs (WAL) & Delayed Replica of Replica Databases
Why We Need to Chalk Our Citations We Might Face
Here’s a typical situation: my-coworker accidentally dropped a table. We have a monthly backup policy, and I have the backup that was taken 29 days ago. However, I took the backup of the data folder immediately after my co-worker’s mistake, and I also have all the WAL files since the time he dropped it—can you help recover the table?
Let's pause here to take stock and understand what we have at hand to recover. We have two backups (the latest backup, and the 29 day-old previous backup), as well all the WAL files since the table was dropped. WAL files are Write Ahead Logs, which means we can move forward from the point when a backup was done. In the above scenario, the latest backup was taken after the occurrence of the issue—so while it’s current, we have to accept that the table is gone, as it is a backup for that point in time only. The month-old backup might be the best way to recover, albeit at a cost of a month’s data loss.
Another question that is often raised in these conversations: “We have multiple replicas, so do I still need to plan and take backups?” While it’s good that you can recover from a hardware failure, recovery from human error can be much more expensive, as detailed in the example above.
Circling back to the beginning of the example we started with, my-coworker has dropped the table, and it replicated to all my replicas instantaneously. Trust me, you will find out that it’s the fastest transaction that you will ever see, no matter whether you are on cloud or on-premises, and any number of replicas that you have are not enough to save you from this disaster (or any other human error) unless you have a delayed standby in your cluster.
Making use of a delayed replica allows you to have an insurance plan of sorts for your quorum by using recovery_min_apply_delay parameter in the postgresql.conf file of the “break glass in case of emergency” standby. You can set this rather easily:
recovery_min_apply_delay = '1h'
The value can be configured as hours, minutes or days as it shows in the documentation here.
PostgreSQL Configuration & WAL (Write Ahead Logs)
Understanding What Happens Behind the Scenes
PostgreSQL is a very unique database in its implementation of transaction logs and process architecture. WAL (Write Ahead Log) is one of the most important pieces of the PostgreSQL ecosystem. The files are written to pg_xlog / pg_wal directory. Every change made is first written to WAL, hence its name, and the WAL is considered a synonym for transaction logs. There are two minimum settings required: archive_mode & archive_command. In order to ensure consistency each WAL is written and checked for completion. This makes the base of the recovery process. Both these parameters can be set in the postgresql.conf. archive_mode only has three values, Off On or Always. There isn’t much difference in on or always, while it is on the “on” value, it only creates WAL archives when the server is up and running while it is on “always” value, it continues to generate the archives even when they are being restored. archive_command is the shell command which is used to archive the WAL files. An example below from the PostgreSQL documentation:
archive_command = 'test ! -f /mnt/server/archivedir/%f && cp %p /mnt/server/archivedir/%f'
Here %p in the string is automatically replaced by the path name of the file to archive, and any %f is replaced by only the file name.
Now, let’s understand what goes behind the scenes while we are recovering from a backup. When the PostgreSQL database starts, it starts from a consistent state—the state where the checkpoint has been completed at the time of stopping or taking the backup of the database. Then, it checks for the WAL files, and begins reading them until it finishes all of them. Here is just an idea about how it appears in the log files:
2019-10-26 14:06:16 UTC [16-3] LOG: starting archive recovery 2019-10-26 14:06:16 UTC [16-4] LOG: restored log file "000000010000000000000017" from archive 2019-10-26 14:06:16 UTC [16-5] LOG: redo starts at 0/17000060 2019-10-26 14:06:16 UTC [16-6] LOG: consistent recovery state reached at 0/17000160 cp: cannot stat '/var/lib/postgresql/backup/000000010000000000000018': No such file or directory 2019-10-26 14:06:16 UTC [16-7] LOG: redo done at 0/17000160 2019-10-26 14:06:16 UTC [16-8] LOG: restored log file "000000010000000000000017" from archive 2019-10-26 14:06:16 UTC [16-10] LOG: archive recovery complete
While trying to recover from a backup, we use the recovery.conf to guide the master to a consistent state. In versions below 12, the parameters are set in the recovery.conf file however moving forward, the parameters would be available in PostgreSQL.conf. The server uses recovery_target_time to understand what point it needs to recover and hence when it is set to latest as explained above, it reads all the WAL files.
Point in Time Recovery (PITR) for PostgreSQL
Another iteration of the above which can be used for Point in Time Recovery (PITR) is setting the parameter recovery_target_time to a time before the point where we think the issue occurred. Taking the example at the beginning of this article, if we had all the WAL archives since the base backup, we could have gone ahead and set the time to a known time before the table was dropped and we could have recovered most of the data in the table.
It is worth mentioning here now that we understand that WAL files are the brick and mortar of recovering from any situation, it always helps to keep the pg_xlog/pg_wal folder with all the WAL files and do not remove any WAL files manually to recover disk space. The server ensures that any WAL files that are not required are cleaned up at routine intervals.
Backups and the Buts That Can Help pPan
Now it comes down to how and what kind of backups are available and where they can help us. Our options with PostgreSQL are as below:
A Logical dump is nothing more than a SQL dump of the database and its objects.
— File system snapshots
— Manual base backup
Backup time will depend upon the method used to take backup and the database size. This also leads to consideration of your space requirement to keep backups. If you are taking physical backups, the space requirement may be much higher as compared to logical backup. However, one needs to plan their backup strategy based upon their RPO (Recovery Point Objective) and RTO (Recovery Time Objective). Let's see some recovery scenarios (Recovering from a table to complete cluster) and recovery options.
Let's have a look at individual recovery objets and understand what’s the fastest backup method we can use:
- Recovering a single table: The fastest way to recover a table is from the SQL dump created using pg_dump or pg_dumpall. Since it is a plain-text file (can also be extracted from a custom format created by pg_dump -Fc) you can search and restore, in two steps, the table definition and data of the table. This gives you flexibility to even go ahead and search for certain records and restore them.
- Recovering a single database: Again, ignoring the size factor for the moment, the fastest way would still be pg_dump and pg_dumpall.
The best part is since the restore involves individual SQL files running a COPY command at the backend, query tuning techniques can be applied to maximize the performance. The same can be tested and documented in a test environment to get to an optimum configuration.
- Recovering an entire cluster: A cluster might consist of multiple databases as well as multiple extensions, so it becomes very important that everything be restored to the closest state possible. Here the best way is to use the snapshots using either File System Level backups or pg_basebackup. This ensures that you do not have to do any manual work and you just need to point to the data directory (after correcting permissions) and you can start your cluster.
Managing PostgreSQL Backup & Recovery
It is very easy to say that one should keep all types of backups to be able to recover from all the scenarios above, but we do understand that managing backups and keeping them in all formats is not only time consuming but also requires hardware. Imagine me asking you to take a pg_dump of a 1 TB database and I am sure the reaction would be “Ya right!” So here are a few recommendations based on the size of the database (or a cluster) so as to plan the best possible recovery in most of the scenarios:
- DB Size less than 100 Gb : Since this would require much less space and time, the recommendation here is to take both kinds of backup (physical as well as logical). Doing so will give you some independence in order to recover quickly no matter what the situation is.
- DB Size between 100 Gb and 500 Gb : The recommendation would be to take both kinds of backups, but if you do have to choose between the two, the recommendation would be to use physical backups. It gives you some flexibility (with some time and effort) to restore the backup and extract any tables or database to recover from the failures at that level.
- DB size greater than 500 Gb : This is where using pg_dump becomes very challenging, and a physical backup is most likely the best path to go on. As suggested in point above, it can help recover individual objects and databases as well by doing some due diligence.
Frequency of PostgreSQL Database Backups
Time to List Your Ifs, Ands & Buts to Build the Best Strategy
The last, but not the least, consideration in this whole picture is the frequency of the backups. This is where generalized suggestions reach their end. In order to even make any comments on this, one needs to understand the mechanics of their environment: the frequency of the transactions, the rate of change in the database size, the criticality of the application, and its risk tolerance when it comes to availability versus data loss. You need to carefully study these factors, and then do some test runs with the backups; this helps you not only verify the recoverability, but also to understand the time taken to recover. Remember, even if you have a monthly frequency of the backup and all the WAL files in between, the time it takes to read all of these WAL files and replay the transactions may take longer than you can tolerate to get back up and running.
So, coming back to where I started: take suggestions from the experts, but make sure you test those suggestions in your environment and tailor the plan to fit the strategy and outcomes you need, so that you are ready with the right backup and methods when the need arises.
Note to everyone: There is no tool to verify that the backup of PostgreSQL which you have taken is valid until you are able to restore it and read the data. So make sure in your backup strategy you not only have a schedule for taking the backups, but also a window to restore and test the backups.
Join Postgres Pulse Live!
As always, this example is one we’ve seen customers face many times, and it is always a unique discussion to help understand both what the best solution is to the immediate problem and guidance in how to then set about an evaluation of their existing strategy. Different businesses have different needs. Have problems or questions of your own?
Join us on Monday, April 27th, for our next Pulse Live Session! We’ll continue the dialogue on backup woes, successes and strategies— and we encourage you to bring any questions you have to the conversation.
You can ask your questions via email at email@example.com, #PostgresPulse on Twitter, or live during the event!
Interested in previous recordings, blogs or anything related to Postgres Pulse? We’ve made it easy for you to find it all in one place here.