PostgreSQL Disaster Recovery with Kubernetes’ Volume Snapshots

August 21, 2024

See how CloudNativePG enhances disaster recovery efficiently with Kubernetes Volume Snapshots.

Transforming Cloud Native PostgreSQL Management with Volume Snapshots

Enhanced backup and recovery solutions for Postgres in Kubernetes

A new era for Postgres in Kubernetes has just begun. Version 1.21 of CloudNativePG introduces declarative support for Kubernetes’ standard API for Volume Snapshots, enabling, among other things, incremental and differential copy for both backup and recovery operations to improve Recovery Point Objectives (RPO) and Recovery Time Objectives (RTO).

The benchmarks on EKS that I present in this article highlight that backup and – most importantly – recovery times of Very Large Databases (VLDB) are now reduced by a few orders of magnitude compared to the existing object store counterpart. For example, I fully recovered a 4.4TB Postgres database from a volume snapshot in two minutes. This enhancement is just the beginning, as we are planning to natively support more features provided by Kubernetes on the storage level.

Understanding Kubernetes Volume Snapshots for Postgres in Kubernetes

The standardization of volume snapshotting in modern Kubernetes Clusters

Volume snapshots have been around for many years. When I was a maintainer and developer of Barman, a popular open source backup and recovery tool for Postgres, we regularly received requests from customers to integrate it with their storage solution supporting snapshots. The major blocker was the lack of a standard interface to control the storage snapshotting capabilities.

Kubernetes fixed this. In December 2020, Kubernetes 1.20 made volume snapshotting generally available, enriching the API with the VolumeSnapshot, VolumeSnapshotContent, and VolumeSnapshotClass custom resource definitions. Volume snapshotting is now in every supported Kubernetes version, providing a generic and standard interface for:

  • Creating a new volume snapshot from a PVC
  • Creating a new volume from a volume snapshot
  • Deleting an existing snapshot

The implementation is delegated to the underlying CSI drivers, and storage classes can offer a variety of capabilities based on storage: incremental block-level copy, differential block-level copy, replication on a secondary or n-ary location in another region, and so on.
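
As a rough illustration of the first two operations in the list above, the following manifests sketch a snapshot taken from a PVC and a new PVC provisioned from that snapshot. The resource names (pgdata-pvc, pgdata-snapshot, pgdata-restored) and the 10Gi size are hypothetical, and the values in angle brackets are placeholders for classes that exist in your cluster. Treat it as a minimal example of the generic Kubernetes interface, not as what any operator does internally.

```yaml
# Take a snapshot of an existing PVC (hypothetical name: pgdata-pvc)
apiVersion: snapshot.storage.k8s.io/v1
kind: VolumeSnapshot
metadata:
  name: pgdata-snapshot
spec:
  volumeSnapshotClassName: <MY_VOLUMESNAPSHOT_CLASS>
  source:
    persistentVolumeClaimName: pgdata-pvc
---
# Provision a new PVC pre-populated with the content of that snapshot
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: pgdata-restored
spec:
  storageClassName: <MY_STORAGE_CLASS>
  dataSource:
    name: pgdata-snapshot
    kind: VolumeSnapshot
    apiGroup: snapshot.storage.k8s.io
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      # Must be at least as large as the snapshotted volume
      storage: 10Gi
```

The third operation, deleting a snapshot, is simply the deletion of the VolumeSnapshot object; what happens to the underlying storage depends on the deletion policy of the volume snapshot class.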

The main advantage is that the interface abstracts the complexity and storage management from the application, in our case, a Postgres workload. From a database perspective, incremental and differential backup and recovery are the most desired features that volume snapshotting brings.

All major cloud providers have CSI drivers and storage classes supporting volume snapshots (for details, see GKE, EKS, or AKS). On-premise, you can use OpenShift Data Foundation (ODF) and LVM with Red Hat, Longhorn with Rancher, and Portworx by Pure Storage, to cite a few. You can find a detailed list of available drivers in the official documentation of the Kubernetes Container Storage Interface (CSI) project.

Backup and Recovery Strategies Before CloudNativePG 1.21

Evaluating backup options for your Kubernetes clusters

Before version 1.21, CloudNativePG supported backup and recovery only on object stores.

Object stores are convenient in many contexts, especially in Cloud environments and with small/medium-sized databases – I’d say below 500GB. Still, it depends on several factors, and there's no clear distinction. One of them is the time it takes to back up a database and store it in an object store. However, the most important one – at least for the scope of this article – is the time to restore from a backup safely secured in an object store: this metric represents the RTO in your Business Continuity Plan for that specific database.

I suggest measuring both times before deciding whether they are acceptable. Based on my tests and experience, for a 45GB database, backup time might be on the order of 60-100 minutes, while recovery time may be in the 30 to 60 minutes range (these might change for the better or worse depending on the actual object store technology underneath). Without incremental and/or differential copy, these times increase linearly with the database size, which makes object stores inadequate for VLDB use cases.

For this reason, following conversations with some members of the CNCF TAG Storage at KubeCon Europe in Amsterdam in April 2023, we decided to introduce imperative support for backup and recovery with Kubernetes volume snapshots through the cnpg plugin for kubectl (CloudNativePG 1.20). This allowed us to have a fast prototype of this feature and then enrich it with a declarative API.

Disclaimer: Other Postgres operators for Kubernetes provide instructions for using volume snapshots for backup and recovery. Given that these instructions are imperative and our operator is built with a fully declarative model, I don’t cover them in this article. Other operators rely on Postgres-level backup tools for incremental backup/recovery; having conceived Barman many years ago, we could have gone down that path, but our vision is to rely on the Kubernetes way of making incremental backup/recovery to facilitate the integration of Postgres in that ecosystem. Nonetheless, I advise you to evaluate those alternative solutions and compare CloudNativePG with all available Postgres operators before making your own decision.

Enhancements in Backup Capabilities with CloudNativePG 1.21

Achieving reliable restorations with volume snapshots

CloudNativePG lets you use object store and volume snapshot strategies with your Postgres clusters. While the WAL archive containing the transactional logs still needs to reside in an object store, physical base backups (copies of the PostgreSQL data files) can now be stored as a tarball in an object store, or as volume snapshots.

The WAL archive is required for online backup and Point-in-Time Recovery (PITR).

The first implementation of volume snapshots in CloudNativePG has a limitation worth mentioning: it only supports Cold (physical) Backup, which is a copy of the data files taken when the DBMS is shut down. As bad as this may sound, you shouldn’t worry: a production cluster typically has at least one replica, and the current Cold Backup implementation takes a full backup from a standby without impacting your primary operations. This limitation will be removed in version 1.22 with the support of PostgreSQL's low-level API for Hot Physical Base Backups.

In any case, Cold Backups are a statically consistent physical representation of the entire database cluster at a single point in time (a database snapshot, not to be confused with volume snapshots), and, as a result, they are sufficient to restore a Postgres cluster.

For example, if your RPO is 1 hour, you can fulfill it with an hourly volume snapshot backup, retaining the snapshots from the last seven days.

Volume Snapshot Backup for Cloud Native PostgreSQL

Ensuring reliable backups for volume snapshots

Before you proceed, make sure you have the name of the storage class and the related volume snapshot class. Given that they vary from environment to environment, I will be using a generic pattern in this article: <MY_STORAGE_CLASS> and <MY_VOLUMESNAPSHOT_CLASS>.

IMPORTANT: In this article I won’t be covering any specific storage class or environment. However, you can apply the examples in this article in every environment, just by making sure you use the correct storage class and volume snapshot class.

You can enable volume snapshotting for physical base backups just by adding the volumeSnapshot stanza in the backup section of a PostgreSQL Cluster resource.

Suppose you want to create a Postgres cluster called hendrix with two replicas, reserving a 10GB volume for PGDATA and a 10GB volume for WAL files. Suppose that you have already set up the backups on the object store so that you can archive the WAL files there (that’s the barmanObjectStore section which we leave empty in this article as it is not relevant).

apiVersion: postgresql.cnpg.io/v1
kind: Cluster
metadata:
  name: hendrix
spec:
  instances: 3
  storage:
    storageClass: <MY_STORAGE_CLASS>
    size: 10Gi
  walStorage:
    storageClass: <MY_STORAGE_CLASS>
    size: 10Gi
  backup:
    # Volume snapshot backups
    volumeSnapshot:
      className: <MY_VOLUMESNAPSHOT_CLASS>
    # For the WAL archive and object store backups
    barmanObjectStore:
      …

You can directly create a Backup resource, but my advice is to either:

  • Use the ScheduledBackup object to organize your volume snapshot backups for the Postgres cluster on a daily or hourly basis, or
  • Use the backup -m volumeSnapshot command of the cnpg plugin for kubectl to get the on-demand Backup resource created for you.

In this case, I used the plugin:

kubectl cnpg backup -m volumeSnapshot hendrix

The plugin will create a Backup resource following the hendrix-<YYYYMMDDHHMMSS> naming pattern, where YYYYMMDDHHMMSS is the time the backup was requested.
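
For comparison, creating the Backup resource directly, instead of going through the plugin, might look like the sketch below. The resource name hendrix-on-demand is a hypothetical choice; the method and cluster fields mirror the ones used by the ScheduledBackup resource in this article.

```yaml
apiVersion: postgresql.cnpg.io/v1
kind: Backup
metadata:
  # Hypothetical name; the plugin would generate hendrix-<YYYYMMDDHHMMSS>
  name: hendrix-on-demand
spec:
  # Request a volume snapshot backup rather than an object store one
  method: volumeSnapshot
  cluster:
    name: hendrix
```

Applying this manifest with kubectl apply should trigger the same procedure as the plugin command, with the advantage that the request can live in version control alongside the cluster definition.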

The operator then initiates the Cold Backup procedure, by:

  • Shutting down the Postgres server for the selected replica (fencing)
  • Creating a VolumeSnapshot resource for each volume defined for the cluster, in our case:
      • hendrix-YYYYMMDDHHMMSS for the PGDATA
      • hendrix-YYYYMMDDHHMMSS-wal for the WAL files
  • Waiting for the CSI external snapshotter to create, for each VolumeSnapshot, the related VolumeSnapshotContent resource
  • Removing, when completed, the fence on the selected replica for the Cold Backup operation

You can list the available backups and restrict them to the hendrix cluster by running:

kubectl get backup --selector=cnpg.io/cluster=hendrix

As you can see, the METHOD column reports whether a backup has been taken via volume snapshots or object stores:

NAME                     AGE    CLUSTER   METHOD              PHASE       ERROR
hendrix-20231017150434   81s    hendrix   volumeSnapshot      completed
hendrix-20231017125847   7m8s   hendrix   barmanObjectStore   completed

Similarly, you can list the current volume snapshots for the hendrix cluster with:

kubectl get volumesnapshot --selector=cnpg.io/cluster=hendrix

Both backups and volume snapshots contain important annotations and labels that allow you to browse these objects and remove them according to their month or date of execution, for example. Make sure you spend some time exploring them through the describe command for kubectl.

You can then schedule daily backups at 5AM by creating a ScheduledBackup as follows (note that the schedule is a six-field cron expression whose first field represents seconds):

apiVersion: postgresql.cnpg.io/v1
kind: ScheduledBackup
metadata:
  name: hendrix-vs
spec:
  schedule: '0 0 5 * * *'
  cluster:
    name: hendrix
  backupOwnerReference: self
  method: volumeSnapshot
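
If you are targeting the one-hour RPO mentioned earlier, an hourly variant could be sketched as follows. The name hendrix-vs-hourly is hypothetical, and setting immediate to true is an optional choice to request a first backup as soon as the resource is created instead of waiting for the next hour.

```yaml
apiVersion: postgresql.cnpg.io/v1
kind: ScheduledBackup
metadata:
  # Hypothetical name for the hourly schedule
  name: hendrix-vs-hourly
spec:
  # Six fields: second, minute, hour, day of month, month, day of week.
  # This fires at second 0 of minute 0 of every hour.
  schedule: '0 0 * * * *'
  # Optional: take a first backup right after creation
  immediate: true
  cluster:
    name: hendrix
  backupOwnerReference: self
  method: volumeSnapshot
```

Keep in mind that with Cold Backups each run briefly stops the selected replica, so it is worth checking that an hourly cadence is acceptable for your read workload.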

For more detailed information, please refer to the documentation on volume snapshot backup for CloudNativePG.

Volume Snapshot Recovery for Cloud Native PostgreSQL

Optimizing recovery procedures in Kubernetes

Recovery is what makes a backup useful. This is why it’s crucial to test the recovery procedure both before you adopt it in production and on a regular basis afterwards. The recovery process should be automated, opening up many interesting avenues in the areas of data warehousing and sandboxing for reporting and analysis.

Recovery from volume snapshots is achieved in the same way CloudNativePG recovers from object stores – by bootstrapping a new cluster. The only difference here is that instead of just pointing to an object store, you can request to create the new PVCs starting from a set of consistent and related volume snapshots (PGDATA, WALs, and soon, tablespaces).

All you need to do is create a new cluster resource (for example, hendrix-recovery) with settings identical to the hendrix one, except for the bootstrap section. Here is an excerpt:

apiVersion: postgresql.cnpg.io/v1
kind: Cluster
metadata:
  name: hendrix-recovery
spec:
  # <snip>
  bootstrap:
    recovery:
      volumeSnapshots:
        storage:
          name: hendrix-YYYYMMDDHHMMSS
          kind: VolumeSnapshot
          apiGroup: snapshot.storage.k8s.io
        walStorage:
          name: hendrix-YYYYMMDDHHMMSS-wal
          kind: VolumeSnapshot
          apiGroup: snapshot.storage.k8s.io

When you create this resource, the recovery job will provision the underlying PVCs starting from the snapshots specified in the .spec.bootstrap.recovery.volumeSnapshots stanza. Once completed, PostgreSQL will start.

I didn’t define any WAL archive in the example above, since the volume snapshots above were taken using a Cold Backup strategy, the only strategy available for now in CloudNativePG. As mentioned earlier, these are consistent database snapshots and are sufficient to restore to a specific point in time – the time of the backup.

However, suppose you want to take advantage of volume snapshots for lower RPO with PITR or for better global RTO through a replica cluster in a different region (provided the underlying storage class supports relaying volume snapshots across multiple Kubernetes clusters). In that case, you need to specify the location of the WAL archive by defining a source through an external cluster. For example, you can add the following to your hendrix-recovery cluster:

apiVersion: postgresql.cnpg.io/v1
kind: Cluster
metadata:
  name: hendrix-recovery
spec:
  # <snip>
  bootstrap:
    recovery:
      source: hendrix
      volumeSnapshots:
        # <snip>
  replica:
    enabled: true
    source: hendrix
  externalClusters:
    - name: hendrix
      barmanObjectStore:
        # <snip>

The above manifest requests a new Postgres Cluster called hendrix-recovery, which is bootstrapped using the given volume snapshots, then placed in continuous replication by fetching WAL files from the hendrix object store and started in read-only mode.
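
For the PITR case, a possible sketch is to drop the replica stanza and add a recovery target instead, so that Postgres replays WAL files from the archive up to a given moment and then promotes. The timestamp below is purely hypothetical; it only needs to be later than the end of the backup you restore from and covered by the WAL archive.

```yaml
apiVersion: postgresql.cnpg.io/v1
kind: Cluster
metadata:
  name: hendrix-recovery
spec:
  # <snip>
  bootstrap:
    recovery:
      # WAL files are fetched from the object store of this external cluster
      source: hendrix
      volumeSnapshots:
        # <snip>
      recoveryTarget:
        # Hypothetical target, later than the end of the restored backup
        targetTime: "2023-10-17 15:30:00.00000+00"
  externalClusters:
    - name: hendrix
      barmanObjectStore:
        # <snip>
```

The closer the chosen snapshot is to the target time, the fewer WAL files need to be replayed, which is where frequent snapshots help the RTO.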

These are just a few examples. Don't be overwhelmed by the flexibility, freedom, and creativity you can unleash with both Postgres and the operator regarding architecture. Read this article on recommended architectures for PostgreSQL in Kubernetes from the CNCF blog for more ideas on what you can achieve.

Benchmark Results for Effective Recovery Objectives

Assessing performance of volume snapshots in Postgres in Kubernetes

Let’s discuss some initial benchmarks I’ve gathered based on volume snapshots using r5.4xlarge nodes on AWS EKS with the gp3 storage class. In the benchmarks, I defined four different database size categories (tiny, small, medium, and large), with details as follows:

Cluster name   Database size   pgbench init scale   PGDATA volume size   WAL volume size   pgbench init duration
tiny           4.5 GB          300                  8 GB                 1 GB              67s
small          44 GB           3,000                80 GB                10 GB             10m 50s
medium         ≈440 GB         30,000               800 GB               100 GB            3h 15m 34s
large          4.4 TB          300,000              8,000 GB             200 GB            32h 47m 47s

These databases were created by running the pgbench initialization process with different scaling factors, ranging from 300 for the smallest (taking just over a minute) to 300,000 for the largest (taking approximately 33 hours to complete, producing a 4.4TB database). The table above also shows the size of the PGDATA and WAL volumes used in the tests.

The experiment consisted of taking a first backup on volume snapshots, and then a second one after running pgbench for an hour. It's important to note that the first backup needs to store the entire volume content, while subsequent ones only store the delta from the previous snapshot. Each cluster was destroyed and then recreated starting from the last snapshot.

I decided not to replay any WAL file to prevent tainting the test results and introducing variability, so I measured the bare duration of the restore operation from the recovery of the snapshot until Postgres starts accepting connections (or, in Kubernetes terms, until the readiness probe succeeds and the pod is ready).

The table below shows the backup and recovery results for each of them.

Cluster name   1st backup duration   2nd backup duration after 1hr of pgbench   Full recovery time
tiny           2m 43s                4m 16s                                     31s
small          20m 38s               16m 45s                                    27s
medium         2h 42m                2h 34m                                     48s
large          3h 54m 6s             2h 3s                                      2m 2s

All the databases restarted within about two minutes (yes, minutes), including the largest instance of roughly 4.4TB. These figures are definitely optimistic: the actual time depends on several factors related to how the CSI external snapshotter stores deltas, although in most cases it should be less than the time taken for the first full backup.

Our advice, as usual, is to test it yourself since every organizational environment (which includes not just technology, but people too!) is unique.

Future Enhancements for CloudNativePG 1.21 and Beyond

Advancing volume snapshots and cloning features

As I mentioned earlier in this blog post, this first implementation is just the beginning of volume snapshots support in CloudNativePG.

We are already working on adding Hot Backup support in version 1.22, by honoring the pg_start_backup() and pg_stop_backup() interfaces (renamed pg_backup_start() and pg_backup_stop() in PostgreSQL 15) to avoid shutting down an instance. This will require the presence of a WAL archive, which is currently available only in object stores.

The implementation of this feature will open up more exciting scenarios. Snapshotting is the foundation of PVC Cloning, which is the possibility of creating a new PVC using an existing one as a source. As a result, we can overcome an existing limitation in which scale-up and replica cloning are currently implemented with pg_basebackup only.

If you have a very large database, that could mean needing hours or days to complete the process (which is not always critical, but often undesirable). PVC Cloning will make this process faster.
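
To give an idea of the Kubernetes primitive involved, here is a minimal sketch of a cloned PVC. It is not how CloudNativePG exposes (or will expose) the feature; the names hendrix-1 and hendrix-1-clone are assumptions made for illustration, and cloning only works if the CSI driver behind the storage class supports it.

```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  # Hypothetical name for the clone
  name: hendrix-1-clone
spec:
  storageClassName: <MY_STORAGE_CLASS>
  # The source PVC (assumed name) must live in the same namespace
  dataSource:
    name: hendrix-1
    kind: PersistentVolumeClaim
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      # Must be at least as large as the source volume
      storage: 10Gi
```

The only difference from provisioning a volume out of a snapshot is the kind of the data source, which is why the two features are so closely related.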

Another area that PVC Cloning will enhance is in-place upgrades of PostgreSQL (let’s say, from version 13 to 16). We have yet to introduce pg_upgrade support in CloudNativePG for a simple reason – there’s no way to roll back automatically if there’s an issue with any of the 15+ steps required by this critical operation.

At the same time, rolling back using a backup in an object store might not always be a good strategy (definitely not in the case of a VLDB). Our idea is to use PVC Cloning to create new PVCs to run pg_upgrade and swap the old PVCs with the new ones if everything goes as planned.

In case of failures, the upgrade can be aborted to resume with the existing cluster (with the untouched PVCs). Major PostgreSQL upgrades are currently possible with CloudNativePG. Read “The Current State of Major PostgreSQL Upgrades with CloudNativePG” for more information.

Snapshotting will be even more interesting when we introduce support for another global object in PostgreSQL – Tablespaces, which is expected in 1.22 as well. Tablespaces enable you to place indexes or data partitions in separate I/O volumes, for better performance and vertical scalability. Among the benefits of tablespaces, spreading your data in multiple volumes might decrease the time for backup and recovery, as snapshots can be taken in parallel.

We are also following the progress of the Kubernetes VolumeGroupSnapshot feature, currently in alpha state, to achieve atomic snapshots among the different volumes (PGDATA, WALs, and tablespaces) of a Postgres database in Kubernetes.

Cloud Native PostgreSQL and Future Ready Strategies

Harnessing volume snapshots for enhanced recovery point objectives

Declarative support for Kubernetes’ Volume Snapshot API is another milestone in the evolution of CloudNativePG as an open and standard way to run PostgreSQL in a cloud native environment.

Although 1.22 and subsequent versions will make it even more evident, this version already takes the PostgreSQL VLDB experience in Kubernetes to another level, whether you are in the public cloud or on-premise, on VMs or bare metal.

In an AI-driven world, volume snapshots change how you approach data warehousing with Postgres and how you create sandbox environments for analysis and data exploration.

In terms of business continuity, support for volume snapshots will give you, among other benefits:

  • Better RTO through faster restores from volume snapshots following a disaster
  • More flexibility on the RPO by adopting Cold Backup-only solutions, or implementing hybrid backup strategies based on object stores (with different scheduling)
  • Finer control on where to relay your data once the volume snapshot is completed by relying on the storage class to clone your data in different Kubernetes clusters or regions

Join the CloudNativePG Community now if you want to help improve the project at any level.

If you are an organization that is interested in moving your PostgreSQL databases to Kubernetes, don’t hesitate to contact us for professional support, especially if you are at an early stage of the process.

EDB provides 24/7 support on CloudNativePG and PostgreSQL under the Community 360 plan (available for OpenShift soon). If you are looking for longer support periods, integration with Kasten K10, and the possibility to run Postgres Extended with TDE or Postgres Advanced to facilitate migrations from Oracle, you can also look into EDB Postgres for Kubernetes, which is EDB’s product based on CloudNativePG and is available under the EDB Standard and Enterprise plans.

What is disaster recovery in the context of PostgreSQL?

Disaster recovery in the context of PostgreSQL involves strategies and processes designed to protect and restore the database in case of data loss or corruption due to various incidents, such as hardware failures or human errors. This includes regular backups, which can be full or incremental, and the implementation of replication techniques to maintain up-to-date copies of the database on standby servers.

Effective disaster recovery plans also define RTO and RPO to minimize downtime and data loss during recovery operations. Additionally, tools and utilities like pg_dump for backups and configurations for continuous archiving play a crucial role in ensuring that a PostgreSQL database can be restored quickly and reliably after a disaster.
 

How does Kubernetes support PostgreSQL disaster recovery?

Kubernetes provides orchestration capabilities that can automate the deployment, scaling, and management of PostgreSQL instances.

Kubernetes, in conjunction with tools like CloudNativePG, enables features such as declarative volume snapshots, which significantly reduce recovery times and improve RPO by allowing quick restoration of databases from snapshots. This capability is especially beneficial for managing large databases, as it can streamline the backup and recovery process, ensuring business continuity in a cloud-native environment.

What are volume snapshots in Kubernetes?

Volume snapshots in Kubernetes are point-in-time copies of persistent volumes that provide a standardized way to back up and restore data. They are represented by the VolumeSnapshot API resource and can be used to quickly restore data in case of a failure, create clones for testing or development environments, or migrate data between clusters.

The Kubernetes volume snapshot feature is supported by Container Storage Interface (CSI) drivers and allows you to create snapshots, restore volumes from snapshots, and provision new volumes pre-populated with data from existing snapshots. This functionality is particularly useful for database administrators who need to back up databases before performing modifications or for achieving business-critical recovery point objectives in disaster recovery plans.

How can I create a volume snapshot for my PostgreSQL database?

To create a volume snapshot, you can use the VolumeSnapshot API resource provided by Kubernetes. This involves defining a VolumeSnapshot object referencing the persistent volume claim (PVC) backing your PostgreSQL database.

The snapshot is then handled by the snapshot controller and the CSI driver of your cluster, which are typically provided by your cloud provider or Kubernetes distribution. An operator like CloudNativePG simplifies the process by creating the VolumeSnapshot objects for you through its own Custom Resource Definitions (CRDs). Once the snapshot is created, it can be used to quickly restore the database in case of data loss or corruption.

What steps should I take to ensure the best disaster recovery practices for my PostgreSQL database?

Regularly back up your data, implement automated failover mechanisms, monitor your database health, and perform routine recovery drills to ensure that your recovery process is effective when needed.

How often should backups be taken for PostgreSQL?

Backup frequency for PostgreSQL should be tailored to your data update rate and business requirements. Many organizations implement daily full backups along with hourly incremental backups to minimize potential data loss. Additionally, consider more frequent backups for critical systems, such as every few minutes or hours, especially when using continuous archiving and PITR strategies to ensure data integrity and quick recovery options.

What are the benefits of using EDB's managed PostgreSQL service?

EDB's managed PostgreSQL service provides expert support, automated backups, high availability configurations, and seamless integration with disaster recovery solutions, ensuring your database is resilient and recoverable.

Additionally, EDB offers enhanced security features, Oracle compatibility for easier migration, and 24/7 global support from PostgreSQL experts, allowing organizations to focus on innovation while EDB manages their database needs efficiently.

Can EDB help configure disaster recovery for my PostgreSQL databases?

Yes, EDB offers professional services and consultancy to assist with configuring effective disaster recovery solutions tailored to your organization’s needs.

EDB’s expertise includes implementing best practices for backup and recovery, utilizing tools like EDB's Backup and Recovery Tool (BART), and ensuring high availability through advanced configurations. EDB’s comprehensive support helps organizations maintain business continuity and minimize downtime during unexpected incidents.

Are training resources available for understanding PostgreSQL disaster recovery?

Yes, EDB offers a variety of training resources, including specialized courses and comprehensive documentation, to help users understand PostgreSQL disaster recovery best practices. EDB’s "Disaster Recovery and High Availability" course provides practical insights into implementing resilient databases, including hands-on exercises and demos.

Additionally, EDB’s training resources are designed to enhance your skills in managing backups, recovery strategies, and ensuring business continuity effectively.

What support options does EDB provide for PostgreSQL users?

EDB offers a variety of support options, including standard and premium support packages, which provide users with 24/7 technical assistance from PostgreSQL experts.

These packages include proactive monitoring solutions, remote DBA services, and technical account management for personalized guidance. Additionally, EDB's support covers best practices for deployment, performance optimization, and disaster recovery, ensuring that users can effectively manage their PostgreSQL environments.
