Streaming replication is a core feature of PostgreSQL, introduced in version 9.0. It allows a standby server to stay more up-to-date than is possible with file-based log shipping: the standby connects to the primary, which streams WAL records to the standby as they're generated, without waiting for the WAL file to be filled. PostgreSQL supports two modes of streaming replication: asynchronous and synchronous.
In this blog, we’ll discuss how to set up streaming replication, while highlighting some key capabilities that make it easier to do so efficiently at an enterprise level.
How streaming replication works
When we start the standby instance, it begins by restoring all WAL available in the archive location, calling restore_command, if configured in the standby’s recovery configuration. Once it reaches the end of WAL available there and restore_command fails, the standby tries to restore any WAL available in the pg_wal directory. If that fails, and streaming replication has been configured, the standby tries to connect to the primary server and start streaming WAL from the last valid record found in archive or pg_wal.
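The fallback order described above can be sketched as a small shell function. This is an illustration only, not PostgreSQL's actual implementation; the function name and arguments are our own:

```shell
#!/bin/sh
# Illustration of the order in which a standby obtains the next WAL
# segment: archive (via restore_command), then local pg_wal, then
# streaming from the primary. Directories are passed in for the demo.
wal_source() {
  seg="$1"; archive_dir="$2"; pg_wal_dir="$3"
  if [ -f "$archive_dir/$seg" ]; then
    echo "archive"      # restore_command found the segment
  elif [ -f "$pg_wal_dir/$seg" ]; then
    echo "pg_wal"       # segment already present locally
  else
    echo "stream"       # fall back to streaming replication
  fi
}
```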
Streaming replication requires that the operating system and PostgreSQL/EPAS (EDB Postgres Advanced Server) versions be the same across both primary and standby servers. Note that there are a few changes in approach from PostgreSQL version 12 onwards.
To demonstrate the setting up of streaming replication in asynchronous mode (the default), we are using the below environment:

Primary server IP  : 192.168.57.101
Standby server IP  : 192.168.57.102
PostgreSQL version : 14
Port               : 5432
Setting up Streaming Replication
Steps to be performed on Primary database instance
1: Review the below parameter settings
(a) Changes required in postgresql.conf
listen_addresses = '*'
archive_mode = on
max_wal_senders = 5
max_wal_size = 10GB
wal_level = replica
hot_standby = on
archive_command = 'rsync -a %p /opt/pg_archives/%f'
You need to create the /opt/pg_archives folder and set its ownership to the PostgreSQL OS user, i.e. postgres in this case.
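One caveat: the simple rsync archive_command above will silently overwrite an existing archived segment. A slightly safer variant (a sketch; the script name and install path are our own) refuses to overwrite, mirroring the test ! -f pattern recommended in the PostgreSQL documentation:

```shell
#!/bin/sh
# archive_wal: copy a WAL segment into the archive, but fail if a file
# of the same name already exists (an archive_command should never
# overwrite an existing archive file).
# Hypothetical usage from postgresql.conf:
#   archive_command = '/usr/local/bin/archive_wal.sh %p /opt/pg_archives/%f'
archive_wal() {
  src="$1"; dest="$2"
  if [ -f "$dest" ]; then
    echo "refusing to overwrite $dest" >&2
    return 1
  fi
  cp "$src" "$dest"
}
```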
(b) Changes required in pg_hba.conf
Set up authentication on the primary server to allow replication connections from the standby server(s). Note that the trust method is used here for simplicity; in production, use a stronger method such as scram-sha-256 with a dedicated replication user.
host replication all 192.168.57.102/32 trust
2: Reload/restart the PostgreSQL database instance
[root@pg ~]# systemctl restart postgresql-14.service
OR
[root@pg ~]# su postgres
bash-4.2$ /usr/pgsql-14/bin/pg_ctl -D /var/lib/pgsql/14/data/ restart -mf
Steps to be performed on Standby database instance
Follow any one of the below options to prepare the environment for setting up the standby database instance.
Option 1
Create a new data directory and set up the ownership and permissions. In this case, we need to point the PostgreSQL service at the new data directory manually.
[root@pg ~]# mkdir -p /var/lib/pgsql/14/data/
[root@pg ~]# chown -R postgres: /var/lib/pgsql/14/data/
[root@pg ~]# chmod 700 /var/lib/pgsql/14/data/
Change the data directory location (PGDATA) in the /lib/systemd/system/postgresql-14.service file
Environment=PGDATA=/var/lib/pgsql/14/data/
Execute the below command to apply the change
[root@pg ~]# systemctl daemon-reload
Option 2
Initialize the database instance and remove the entire content from the data directory
[root@pg ~]# su postgres
bash-4.2$ /usr/pgsql-14/bin/initdb -D /var/lib/pgsql/14/data/
bash-4.2$ cd /var/lib/pgsql/14/data/
bash-4.2$ ls
base pg_commit_ts pg_hba.conf pg_logical pg_notify pg_serial pg_stat pg_subtrans pg_twophase pg_wal postgresql.auto.conf global pg_dynshmem pg_ident.conf pg_multixact pg_replslot pg_snapshots pg_stat_tmp pg_tblspc PG_VERSION pg_xact postgresql.conf
bash-4.2$ rm -rf *
After creation of the requisite data directory, take a base backup using the pg_basebackup command (to be executed on the standby server)
bash-4.2$ /usr/pgsql-14/bin/pg_basebackup -D /var/lib/pgsql/14/data/ -h 192.168.57.101 -p 5432 -Xs -R -P
-D = data directory
-h = IP address of primary server
-p = Port on which primary instance is running
-Xs = WAL method - stream
-P = Progress information
--slot=SLOTNAME #optional
-R = Write configuration parameters for replication
#primary_conninfo will automatically be defined by the pg_basebackup command. Verify the same in postgresql.auto.conf
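For reference, the line pg_basebackup writes into postgresql.auto.conf typically looks something like the below; the exact connection options vary with the user and environment, so treat this as an example only:

```
primary_conninfo = 'user=postgres host=192.168.57.101 port=5432'
```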
In case of PostgreSQL version 12 and above, a blank standby.signal file must exist in the data directory (pg_basebackup with the -R option creates it automatically; create it manually only if -R was not used)
bash-4.2$ cd /var/lib/pgsql/14/data/
bash-4.2$ touch standby.signal
Define the below parameters in postgresql.conf file
restore_command = 'rsync -a postgres@192.168.57.101:/opt/pg_archives/%f %p'
recovery_target_timeline = 'latest'
In case of PostgreSQL version 11 and below, create recovery.conf file in data directory and include following parameters:
standby_mode = on
primary_conninfo = 'host=192.168.57.101 port=5432'
restore_command = 'rsync -a postgres@192.168.57.101:/opt/pg_archives/%f %p'
trigger_file = '/tmp/makeprimary.trigger'
recovery_target_timeline = 'latest'
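Worth noting: from version 12 onwards, the trigger_file parameter was renamed to promote_trigger_file (set in postgresql.conf), and a standby can also be promoted explicitly without any trigger file:

```
bash-4.2$ /usr/pgsql-14/bin/pg_ctl -D /var/lib/pgsql/14/data/ promote
```

OR, from a psql session on the standby:

```
postgres=# SELECT pg_promote();
```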
Start the PostgreSQL database instance
[root@pg ~]# systemctl start postgresql-14.service
Verify the status on primary database instance
postgres=# \x
postgres=# select * from pg_stat_replication;
-[ RECORD 1 ]----+---------------------------------
(output omitted in this extract; verify that state shows "streaming" and that the sent_lsn and replay_lsn positions are advancing)
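Beyond eyeballing pg_stat_replication, replication lag can be expressed in bytes by comparing the primary's current WAL position with the standby's replayed position. Run the below on the primary (the column alias is our own):

```sql
SELECT application_name, state,
       pg_wal_lsn_diff(pg_current_wal_lsn(), replay_lsn) AS replay_lag_bytes
FROM pg_stat_replication;
```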
Verify the status of the standby database
postgres=# select * from pg_stat_wal_receiver;
-[ RECORD 1 ]----+---------------------------------
(output omitted in this extract; verify that status shows "streaming" and that conninfo points back to the primary)
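On the standby itself, the replay delay can be approximated in time terms with the below query (note that the result keeps growing on an idle primary, since no new transactions arrive to replay):

```sql
SELECT now() - pg_last_xact_replay_timestamp() AS replay_delay;
```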
So far we have discussed the basic setup of streaming replication; however, for a mission-critical/enterprise-level setup we need to consider the below features as well.
Setting up archive_command to include remote server locations
The archive location should be accessible from the standby even when the primary is unavailable, i.e. WALs should reside on the standby server itself or another trusted server, not on the primary server. To include a remote server location, password-less authentication needs to be set up for continuous shipping of WALs from the primary database server to the remote server location.
Set up password-less authentication at the postgres user level
[root@pg ~]# su - postgres
-bash-4.2$ ssh-keygen
Generating public/private rsa key pair.
Enter file in which to save the key (/var/lib/postgres/.ssh/id_rsa):
Created directory '/var/lib/postgres/.ssh'.
Enter passphrase (empty for no passphrase):
Enter same passphrase again:
Your identification has been saved in /var/lib/postgres/.ssh/id_rsa.
Your public key has been saved in /var/lib/postgres/.ssh/id_rsa.pub.
The key fingerprint is:
SHA256:82H1jinB3u+uv6VxCxdZ5RLk7pnM+LszhENdLNyIs88 postgres@pg
The key's randomart image is:
+---[RSA 2048]----+
| +o+.|
| o.+o=|
| .+ooo|
| . .oo..o|
| S =. +oo |
| = +oOEo.|
| + =+O.o|
| . o==.|
| .=@B |
+----[SHA256]-----+
-bash-4.2$ ssh-copy-id 192.168.57.102
When prompted, provide the OS-level postgres user password of the 192.168.57.102 server.
Make the below changes in postgresql.conf to include the remote archive location
archive_command = 'rsync -a %p /opt/pg_archives/%f && rsync -a %p postgres@192.168.57.102:/opt/pg_archives/%f'
Reload the postgres instance to implement the above change
[root@pg ~]# systemctl reload postgresql-14.service
OR
postgres=# SELECT pg_reload_conf();
Setting up automatic removal of obsolete WALs
pg_archivecleanup is used to automatically clean up WAL file archives when running as a standby server. This minimizes the number of WAL files that need to be retained, while preserving crash-restart capability. The below parameter needs to be included in the postgresql.conf file on the standby server.
archive_cleanup_command = 'pg_archivecleanup /opt/pg_archives/ %r'
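To make the %r semantics concrete: pg_archivecleanup removes every archived segment whose name sorts before the oldest segment the standby still requires. A rough shell equivalent of that behavior is sketched below, for illustration only; use the real tool in production:

```shell
#!/bin/sh
# Remove archived WAL files that sort lexicographically before the
# oldest segment still needed (the value pg_archivecleanup receives
# as %r). Uses sort for portable string comparison.
cleanup_archive() {
  archive_dir="$1"; keep_from="$2"
  for f in "$archive_dir"/*; do
    [ -f "$f" ] || continue
    name=$(basename "$f")
    first=$(printf '%s\n%s\n' "$name" "$keep_from" | sort | head -n 1)
    if [ "$first" = "$name" ] && [ "$name" != "$keep_from" ]; then
      rm -- "$f"   # sorts before the keep boundary: safe to delete
    fi
  done
}
```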
Setting up Replication slots for standby database
Replication slots provide an automated way to ensure that the primary does not remove WAL segments until they have been received by all standbys, and does not remove rows that could cause a recovery conflict even when a standby is disconnected. However, with replication slots there is a need to stay vigilant about space availability on the primary server, since an unavailable standby database instance will lead to accumulation of WALs on the primary server.
Create a replication slot while executing pg_basebackup. Note that --slot on its own expects the slot to already exist; add the -C (--create-slot) option to create it on the fly:
bash-4.2$ /usr/pgsql-14/bin/pg_basebackup -D /var/lib/pgsql/14/data/ -h 192.168.57.101 -p 5432 -Xs -R -P -C --slot='streaming_slot'
Alternatively, in case we need to include a replication slot later, create it manually on the primary server and include the primary_slot_name parameter on the standby server (in recovery.conf for version 11 and below, or postgresql.conf for version 12 and above).
postgres=# SELECT * FROM pg_create_physical_replication_slot('streaming_slot');
   slot_name    | lsn
----------------+-----
 streaming_slot |
To configure the standby to use this slot, the primary_slot_name parameter needs to be defined on the standby database instance, and the instance needs to be restarted.
primary_slot_name = 'streaming_slot'
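Because an orphaned slot retains WAL indefinitely, it is worth monitoring slot activity and retained WAL on the primary, for example with the below query (the column alias is our own):

```sql
SELECT slot_name, active, restart_lsn,
       pg_wal_lsn_diff(pg_current_wal_lsn(), restart_lsn) AS retained_wal_bytes
FROM pg_replication_slots;
```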
Setting up streaming replication in Synchronous mode
Synchronous replication offers the ability to confirm that all changes made by a transaction have been transferred to one or more synchronous standby servers. When requesting synchronous replication, each commit of a write transaction will wait until confirmation is received that the commit has been written to the write-ahead log on disk of both the primary and standby server.
Define the cluster_name of standby instance
cluster_name = 's1'
Once streaming replication has been configured, configuring synchronous replication requires setting synchronous_standby_names to a non-empty value on the primary.
synchronous_commit = on #default
#Other values: local, remote_write, remote_apply
synchronous_standby_names = 's1'
Important: if only a single synchronous standby instance has been configured and it becomes unavailable for any reason, all write transactions on the primary will hang, waiting for confirmation of the commit from the standby instance. To mitigate the issue, there should be more than one synchronous standby. Also, Failover Manager (EFM) may be considered to automatically reduce the minimum number of required standbys based on business requirements.
synchronous_standby_names = 'ANY 1 (s1, s2)'
The method ANY specifies a quorum-based synchronous replication and makes transaction commits wait until their WAL records are replicated to at least the requested number of synchronous standbys in the list.
OR
synchronous_standby_names = 'FIRST 2 (s1, s2, s3)'
The method FIRST specifies a priority-based synchronous replication and makes transaction commits wait until their WAL records are replicated to the requested number of synchronous standbys chosen based on their priorities.
Once synchronous streaming replication has been set up, we can verify it on the primary; the sync_state column of pg_stat_replication should report the standby as sync:
postgres=# select application_name, sync_state from pg_stat_replication;
Standby database instances can be used to offload SELECT queries from the primary instance, which helps with load balancing and reduces the load on the primary database. Pgpool may also be used for load balancing, i.e. routing write queries to the primary instance and read (SELECT) queries to the standby instances. However, the necessary testing needs to be done prior to deployment in the production environment.
Making the most of Streaming Replication
We hope this guide provided you with a foundation for how to set up streaming replication, and why it matters. With streaming replication in place, your PostgreSQL database can stay up-to-date and your business can make the most of your data and environment.