PostgreSQL and EDB Postgres Advanced Server (EPAS) are incredibly flexible database servers that offer an almost infinite variety of options for deployment. This can be daunting for those that are just starting on their Postgres journey, so to help those users, we present a set of Reference Architectures to demonstrate a number of common options offering support for different use cases.
The options offered consist of a number of core architectures for deployment of the database server, along with a number of "add-on" architectures that illustrate how additional functionality may be added as needed.
The architectures are presented through a Github repository which can be found here. Pull requests with any suggested improvements or enhancements are welcome.
At present, the architectures are descriptive in nature. Work is underway to supplement the documentation with additional assets to aid in the deployment of the architectures using tools such as Ansible.
Core Architectures
Three core architectures are offered at present.
Single Node
The single node architecture is by far the most simple. Typical uses include development and test/QA servers, or servers where recovery time following a failure is not of concern. As the name implies, this is a simple installation of PostgreSQL or EPAS on a single node.
Multi Node with Asynchronous Replication
This architecture utilises two or more database server instances with asynchronous replication between them. This offers high availability, as the primary node can quickly be replaced with a standby in the event of a failure. At least three nodes in total are recommended to ensure that high availability can be maintained during maintenance of a node or following a failure.
This architecture can also provide read scaling through the use of automated or application based connection routing. Because asynchronous replication is used, the application should be tolerant of replication lag when writing to the primary node and reading from a replica.
Automatic failover is offered through the use of Failover Manager (EFM) which will monitor the status of the cluster and reconfigure it in the most optimal way following a failure.
Typical use cases include systems that must be highly available, but do not involve critical transactions that cannot be lost, such as financial transactions.
Recovery Time Objective: < 60 seconds, dependent on EFM tuning (more aggressive tuning may lead to unnecessary or false positive fault detection in the event of network latency or slow server response.
Recovery Point Objective: Typically a few seconds or less during failover (un-replicated transactions may be lost). Dependent on backup strategy for full recovery.
Geographic Redundancy Objective: Dependent on node placement. Typically multi-availability zone in cloud deployments.
Target Availability: 99.99% (99.999% may be possible with failover tuning).
Multi Node with Synchronous Replication
This architecture utilises three or more database server instances with synchronous replication between them. This offers high availability, as the primary node can quickly be replaced with a standby in the event of a failure. At least three nodes in total are required to ensure that high availability can be maintained during maintenance of a node or following a failure.
This architecture can also provide read scaling through the use of automated or application based connection routing.
Automatic failover is offered through the use of EFM which will monitor the status of the cluster and reconfigure it in the most optimal way following a failure.
Typical use cases include systems that must be highly available, and involve transactions that cannot be lost once committed.
Recovery Time Objective: < 60 seconds, dependent on EFM tuning (more aggressive tuning may lead to unnecessary or false positive fault detection in the event of network latency or slow server response.
Recovery Point Objective: Typically a few seconds or less during failover, with no loss of committed transactions. Dependent on backup strategy for full recovery.
Geographic Redundancy Objective: Dependent on node placement. Typically multi-availability zone in cloud deployments.
Target Availability: 99.99% (99.999% may be possible with failover tuning).
Add-on Architectures
Add-on architectures are provided for pooling and query routing, monitoring and management, and backup.
Load Balancing with pgBouncer
pgBouncer may be used to provide connection pooling in front of one or more Postgres servers. It can also provide simple round-robin based load balancing.
pgBouncer is not able to automatically route queries to different servers based on whether or not they are write or read-only in nature. This means that routing must be handled by the application; pgBouncer supports this by presenting different connection pools with different configurations as virtual databases that the application can connect to based on it's needs.
Lack of automatic query routing support does significantly simplify pgBouncer when compared to pgPool, so if this feature is not required, pgBouncer should typically be used.
Load Balancing and Query Routing with pgPool
Like pgBouncer, pgPool can be used to provide connection pooling and load balancing for one or more Postgres servers. However, pgPool also offers connection routing; it can parse and analyse the incoming queries from the client application and route them based on whether or not they are read or read/write queries automatically.
Though powerful and often useful, this functionality should be used with care; If a SELECT query is made which will cause database writes (perhaps through an auditing trigger), it may not be able properly recognise the query as a read-write one. Applications should be designed to ensure that database writes are not hidden behind triggers or function/procedure calls that appear to be read only in nature.
Monitoring with Postgres Enterprise Manager
Postgres Enterprise Manager (PEM) is a comprehensive monitoring and management solution for PostgreSQL and EPAS. Built on the popular pgAdmin 4 tool, it offers all the ah-hoc management functionality of pgAdmin, as well as 24x7 monitoring, alerting and en-masse management tools for mission critical and/or large Postgres estates.
Backup and Recovery with BART
Virtually all databases require backups. Backup and Recovery Tool (BART) is a convenient tool for creating and managing backups of one or more Postgres servers. It supports full and incremental backups, and offers Point In Time Recovery support to allow you to roll back changes to a database to a particular point in time following data loss events such as an accidental unconstrained delete.
Summary
EDB Reference Architectures are designed to help new and existing users alike to quickly design a deployment architecture that suits their needs. They can be used as either the blueprint for a deployment, or as the basis for a design that enhances and extends the functionality and features offered.
Add-on architectures allow users to easily extend their core database server deployment to add additional features and functionality "building block" style.
Visit the Github repository to start using the reference architectures today.