Hint
For a deeper understanding, we recommend reading our article on the CNCF blog post titled "Recommended Architectures for PostgreSQL in Kubernetes", which provides valuable insights into best practices and design considerations for PostgreSQL deployments in Kubernetes.
This documentation page provides an overview of the key architectural considerations for implementing a robust business continuity strategy when deploying PostgreSQL in Kubernetes. These considerations include:
- Deployments in stretched vs. non-stretched clusters: Evaluating the differences between deploying in stretched clusters (across 3 or more availability zones) versus non-stretched clusters (within a single availability zone).
- Reservation of
postgresworker nodes: Isolating PostgreSQL workloads by dedicating specific worker nodes topostgrestasks, ensuring optimal performance and minimizing interference from other workloads. - PostgreSQL architectures within a single Kubernetes cluster: Designing effective PostgreSQL deployments within a single Kubernetes cluster to meet high availability and performance requirements.
- PostgreSQL architectures across Kubernetes clusters for disaster recovery: Planning and implementing PostgreSQL architectures that span multiple Kubernetes clusters to provide comprehensive disaster recovery capabilities.
Synchronizing the state
PostgreSQL is a database management system and, as such, it needs to be treated as a stateful workload in Kubernetes. While stateless applications mainly use traffic redirection to achieve High Availability (HA) and Disaster Recovery (DR), in the case of a database, state must be replicated in multiple locations, preferably in a continuous and instantaneous way, by adopting either of the following two strategies:
- storage-level replication, normally persistent volumes
- application-level replication, in this specific case PostgreSQL
EDB Postgres® AI for CloudNativePG™ Cluster relies on application-level replication, for a simple reason: the PostgreSQL database management system comes with robust and reliable built-in physical replication capabilities based on Write Ahead Log (WAL) shipping, which have been used in production by millions of users all over the world for over a decade.
PostgreSQL supports both asynchronous and synchronous streaming replication over the network, as well as asynchronous file-based log shipping (normally used as a fallback option, for example, to store WAL files in an object store). Replicas are usually called standby servers and can also be used for read-only workloads, thanks to the Hot Standby feature.
Important
We recommend against storage-level replication with PostgreSQL, although EDB Postgres® AI for CloudNativePG™ Cluster allows you to adopt that strategy. For more information, please refer to the talk given by Chris Milsted and Gabriele Bartolini at KubeCon NA 2022 entitled "Data On Kubernetes, Deploying And Running PostgreSQL And Patterns For Databases In a Kubernetes Cluster" where this topic was covered in detail.
Kubernetes architecture
Kubernetes natively provides the possibility to span separate physical locations - also known as data centers, failure zones, or more frequently availability zones - connected to each other via redundant, low-latency, private network connectivity.
Being a distributed system, the recommended minimum number of availability zones for a Kubernetes cluster is three (3), in order to make the control plane resilient to the failure of a single zone. For details, please refer to "Running in multiple zones". This means that each data center is active at any time and can run workloads simultaneously.
Note
Most of the public Cloud Providers' managed Kubernetes services already provide 3 or more availability zones in each region.
Multi-availability zone Kubernetes clusters
The multi-availability zone Kubernetes architecture with three (3) or more zones is the one that we recommend for PostgreSQL usage. This scenario is typical of Kubernetes services managed by Cloud Providers.
Such an architecture enables the EDB Postgres® AI for CloudNativePG™ Cluster operator to control the full
lifecycle of a Cluster resource across the zones within a single Kubernetes
cluster, by treating all the availability zones as active: this includes, among
other operations,
scheduling the workloads in a declarative manner (based on
affinity rules, tolerations and node selectors), automated failover,
self-healing, and updates. All will work seamlessly across the zones in a single
Kubernetes cluster.
Please refer to the "PostgreSQL architecture" section below for details on how you can design your PostgreSQL clusters within the same Kubernetes cluster through shared-nothing deployments at the storage, worker node, and availability zone levels.
Additionally, you can leverage Kubernetes clusters to deploy distributed PostgreSQL topologies hosting "passive" PostgreSQL replica clusters in different regions and managing them via declarative configuration. This setup is ideal for disaster recovery (DR), read-only operations, or cross-region availability.
Important
Each operator deployment can only manage operations within its local Kubernetes cluster. For operations across Kubernetes clusters, such as controlled switchover or unexpected failover, coordination must be handled manually (through GitOps, for example) or by using a higher-level cluster management tool.
Single availability zone Kubernetes clusters
If your Kubernetes cluster has only one availability zone, EDB Postgres® AI for CloudNativePG™ Cluster still provides you with a lot of features to improve HA and DR outcomes for your PostgreSQL databases, pushing the single point of failure (SPoF) to the level of the zone as much as possible - i.e. the zone must have an outage before your EDB Postgres® AI for CloudNativePG™ Cluster clusters suffer a failure.
This scenario is typical of self-managed on-premise Kubernetes clusters, where only one data center is available.
Single availability zone Kubernetes clusters are the only viable option when only two data centers are available within reach of a low-latency connection (typically in the same metropolitan area). Having only two zones prevents the creation of a multi-availability zone Kubernetes cluster, which requires a minimum of three zones. As a result, users must create two separate Kubernetes clusters in an active/passive configuration, with the second cluster primarily used for Disaster Recovery (see the replica cluster feature).
Hint
If you are at an early stage of your Kubernetes journey, please share this document with your infrastructure team. The two data centers setup might be simply the result of a "lift-and-shift" transition to Kubernetes from a traditional bare-metal or VM based infrastructure, and the benefits that Kubernetes offers in a 3+ zone scenario might not have been known, or addressed at the time the infrastructure architecture was designed. Ultimately, a third physical location connected to the other two might represent a valid option to consider for organization, as it reduces the overall costs of the infrastructure by moving the day-to-day complexity from the application level down to the physical infrastructure level.
Please refer to the "PostgreSQL architecture" section below for details on how you can design your PostgreSQL clusters within your single availability zone Kubernetes cluster through shared-nothing deployments at the storage and worker node levels only. For HA, in such a scenario it becomes even more important that the PostgreSQL instances be located on different worker nodes and do not share the same storage.
For DR, you can push the SPoF above the single zone, by using additional Kubernetes clusters to define a distributed topology hosting "passive" PostgreSQL replica clusters. As with other Kubernetes workloads in this scenario, promotion of a Kubernetes cluster as primary must be done manually.
Through the replica cluster feature, you can define a distributed PostgreSQL topology and coordinate a controlled switchover between data centers by first demoting the primary cluster and then promoting the replica cluster, without the need to re-clone the former primary. While failover is now fully declarative, automated failover across Kubernetes clusters is not within EDB Postgres® AI for CloudNativePG™ Cluster' scope, as the operator can only function within a single Kubernetes cluster.
Important
EDB Postgres® AI for CloudNativePG™ Cluster provides all the necessary primitives and probes to coordinate PostgreSQL active/passive topologies across different Kubernetes clusters through a higher-level operator or management tool.
Reserving nodes for PostgreSQL workloads
Whether you're operating in a multi-availability zone environment or, more
critically, within a single availability zone, we strongly recommend isolating
PostgreSQL workloads by dedicating specific worker nodes exclusively to
postgres in production. A Kubernetes worker node dedicated to running
PostgreSQL workloads is referred to as a Postgres node or postgres node.
This approach ensures optimal performance and resource allocation for your
database operations.
Hint
As a general rule of thumb, deploy Postgres nodes in multiples of three—ideally with one node per availability zone. Three nodes is an optimal number because it ensures that a PostgreSQL cluster with three instances (one primary and two standby replicas) is distributed across different nodes, enhancing fault tolerance and availability.
In Kubernetes, this can be achieved using node labels and taints in a
declarative manner, aligning with Infrastructure as Code (IaC) practices:
labels ensure that a node is capable of running postgres workloads, while
taints help prevent any non-postgres workloads from being scheduled on that
node.
Important
This methodology is the most straightforward way to ensure that PostgreSQL
workloads are isolated from other workloads in terms of both computing
resources and, when using locally attached disks, storage. While different
PostgreSQL clusters may share the same node, you can take this a step further
by using labels and taints to ensure that a node is dedicated to a single
instance of a specific Cluster.
Proposed node label
EDB Postgres® AI for CloudNativePG™ Cluster recommends using the node-role.kubernetes.io/postgres label.
Since this is a reserved label (*.kubernetes.io), it can only be applied
after a worker node is created.
To assign the postgres label to a node, use the following command: