Beginner's Guide to High Availability for Postgres

Highly available databases are essential to organizations depending on mission-critical, 24/7 access to data. PostgreSQL is widely recognized as an excellent open-source database, with critical maturity and features that allow organizations to scale and achieve high availability.

This webinar explores:

  • High availability concepts and workings
  • RPO, RTO, and uptime in high availability
  • Postgres high availability using 
    • Streaming replication
    • Logical replication
  • Important high availability parameters in Postgres and options to monitor high availability.
  • EDB tools (EDB Postgres Failover Manager, BART etc) to create a highly available Postgres architecture

 

Webinar video:

 

 

Webinar slides:

 

 

Webinar Q&A:

Q: Is NetBackup compatible to use for Postgres backup?

A: Yes, Netbackup is compatible with to use for Postgres backup. Following is a blog post which can give more details

https://www.enterprisedb.com/blog/veritas-netbackup-and-edb-postgres

 

Q: With streaming replication, how does the standby sync up if the interconnect is disrupted?

A: If there is a network connection issue, then standby will re-try the connection with primary. We recommend using replication slots for standby connections to the primary. With the replication connection, the primary knows about the WAL files received by the standby and WAL files it needs to keep for standby. As soon as the connection between primary and standby establishes, standby will start streaming data with primary.

 

Q: Is there a limit to the number of streams?

A: There is no limit on the number of streams. Except you have to make sure you have proper settings on primary to support the number of standby/stream. On primary parameters like max_wal_senders should be set correctly.

 

Q: What is the recommended way to backup, on primary or one 2nd server?

A: It depends on the type of backup you want to take and the resources available on primary to support backup. If you are planning to take incremental backups using EDB Backup and Recovery Tool, then you have to take the backup from primary, because the incremental backup is not supported using standby. If you are taking periodic full backups and you have all archived WAL files from the primary, then you can use standby for the backup. 

 

Q: Is logical replication similar to materialized views?

A: No. Logical replication is another way of replicating data from Primary to logical standby. In the logical replication, the standby may not be an exact mirror of the primary and standby receives the data from the primary after processing the WAL on the primary server. Materialized views don’t use logical replication. Till 13, materialized views get data by manual execution of REFRESH MATERIALIZED VIEW <view name> command

 

Q: Are the views listed under "Monitor: Streaming Replication" only useful for the streaming rep option, or can they be used for the WAL shipping HA option as well?

A: Views under the “Monitor: Streaming Replication” slide is only useful for the streaming replication. For WAL shipping HA option you can only use the listed function in the slide “Monitor: WAL Shipping”

 

Q: For "the user would not see any service interruption": what happens with the existing connections when a failover occurs?

A:  When a failover occurs in the HA cluster, the existing connection will be disconnected. In the real-world people use poolers with retry connection options or they put logic in the application for re-trying the connection. Using a re-trying connection mechanism, the application ables to connect to new primary in HA clusters and App users don’t see service interruptions.

 

Q: How is the split-brain avoided in case of for example network split?

A: EDB Postgres Failover Manager manages the HA cluster which takes care of the split-brain scenario. Before performing any failover EDB Postgres Failover Manager check with the status of primary with other standbys and based on the status receives it performs the failover. After failover it also makes sure the old primary should not come back in the cluster by ensuring that the old primary is down.

 

Q: I'm not familiar with what the basic Postgres offers in the way of HA. How is the EDB PFM better than the basic PG DB? What is the cost of EDB PFM?

A:  EDB Postgres Failover Manager adds the capability of automating switchover/failover in Postgres streaming replication. Automatic switchover/failover capabilities are not built-in Postgres.

 

Q: Once that you set up a hot standby, what are the steps to cleanly remove a replica?

A: Depending on how you have set up the hot-standby, the clean removal process of the replica will be different. For example, if you are using replication slots for hot-standby, then you need to shutdown the hot-standby, remove the data directory from the standby server and also drop the replication slot from the primary. If you are not using a replication slot, then you can shut down the hot-standby and remove the data directory. In case you also want to remove the Postgres software from the server, then you would like to use the package remove command supported by the OS. For example, on CentOS/RHEL you can use, yum remove command

 

Q: Does EDB provide monitoring support via Grafana + Prometheus?

A: EDB has Prometheus exporter for PostgreSQL server metrics using jack’s pgx, which can help you to monitor Postgreusing Grafana + Prometheus. For more information please reach out to EDB Sales or support

 

Q: Is the group commit on primary crash-safe when using asynchronous streaming replication? E.g. if in such configuration primary server suddenly reboots - could it be standby WAL is ahead of primary's WAL and the restore of replication will require the full resync.

A: If I understand correctly you want to know how would you connect the old primary back to the HA cluster. Depending on how long your old primary was down after the failover, you can either one of the following option:

  1. Using the pg_rewind tool: Using the pg_rewind you can re-connect old primary as a standby to the existing HA cluster. For more information, please refer to the following link: https://www.postgresql.org/docs/12/app-pgrewind.html
  2. Using rsync/cp: Using rsync, you can re-sync the old primary with the primary and can add it as a standby by following the steps for configuring a standby.
  3.  Using backup: you can use backup from EDB Backup And Recovery Tool to restore the backup on the old primary. After restore you can configure the old primary to be a standby in the existing HA cluster. For more information on EDB Backup and Recovery Tool, please refer to the following: https://www.enterprisedb.com/products/postgresql-database-auto-backup-recovery-restore-script-tools