Streaming Replication: A Step-By-Step Setup Guide

Manish Yadav

October 12, 2022

Streaming replication is a core utility of PostgreSQL introduced in version 9.0. Streaming replication allows a standby server to stay more up-to-date than is possible with file-based log shipping. The standby connects to the primary, which streams WAL records to the standby as they're generated, without waiting for the WAL file to be filled. PostgreSQL supports two modes of streaming replication: asynchronous and synchronous mode.

In this blog, we’ll discuss how to set up streaming replication, while highlighting some key capabilities that will make it easier to do so at an enterprise level, efficiently.

How streaming replication works

When we start the standby instance, it begins by restoring all WAL available in the archive location, calling restore_command, if configured in the standby’s recovery configuration. Once it reaches the end of WAL available there and restore_command fails, the standby tries to restore any WAL available in the pg_wal directory. If that fails, and streaming replication has been configured, the standby tries to connect to the primary server and start streaming WAL from the last valid record found in archive or pg_wal.

Streaming replication requires that the Operating System and PostgreSQL/EPAS (EDB Postgres Advanced Server) versions should be the same across both primary and standby servers. There are a few changes in approach since PostgreSQL version 12 onwards.

To demonstrate the setting up of streaming replication in asynchronous mode(default), we are using the below environment:

Operating System	CentOS Linux release 7.9
PostgreSQL version	PostgreSQL 14.1
	IP Address	Port
Primary Server	192.168.57.101	5432
Standby Server	192.168.57.102	5432

Setting up Streaming Replication

Steps to be performed on Primary database instance

1: Review the below parameter settings

(a) Changes required in postgresql.conf

listen_addresses = '*'

archive_mode = on

max_wal_senders = 5

max_wal_size = 10GB

wal_level = replica

hot_standby = on

archive_command = 'rsync -a %p /opt/pg_archives/%f'

You need to create /opt/pg_archives folder and set ownership to PostgreSQL superuser i.e. postgres in this case.

(b) Changes required in pg_hba.conf
Set up authentication on the primary server to allow replication connections from the standby server(s).

host replication all 192.168.57.102/32 trust

2: Reload/restart the PostgreSQL database instance

[root@pg ~]# systemctl restart postgresql-14.service

[root@pg ~]# su postgres

bash-4.2$ /usr/pgsql-14/bin/pg_ctl -D /var/lib/pgsql/14/data/ restart -mf

Steps to be performed on Standby database instance

Follow any one option from the below methods to create an environment for setting up standby database instance.

Option 1

Create a new data directory and set up the ownership and permissions. In this case, we need to define Postgres Service manually.

[root@pg ~]mkdir -p /var/lib/pgsql/14/data/

[root@pg ~]chown -R postgres: /var/lib/pgsql/14/data/

[root@pg ~]chmod 700 /var/lib/pgsql/14/data/

Change data_directory location in /lib/systemd/system/postgresql-14.service file

Environment=PGDATA=/var/lib/pgsql/14/data/

Execute below command to implement the changes

[root@pg ~]# systemctl daemon-reload

Option 2

Initialize the database instance and remove the entire content from the data directory

[root@pg ~]# su postgres

bash-4.2$ /usr/pgsql-14/bin/initdb -D /var/lib/pgsql/14/data/

bash-4.2$ cd /var/lib/pgsql/14/data/

bash-4.2$ ls

base pg_commit_ts pg_hba.conf pg_logical pg_notify pg_serial pg_stat pg_subtrans pg_twophase pg_wal postgresql.auto.conf global pg_dynshmem pg_ident.conf pg_multixact pg_replslot pg_snapshots pg_stat_tmp pg_tblspc PG_VERSION pg_xact postgresql.conf

bash-4.2$ rm -rf *

After creation of requisite data directory, initiate taking backup using pg_basebackup command (to be executed on the standby server)

bash-4.2$ /usr/pgsql-14/bin/pg_basebackup -D /var/lib/pgsql/14/data/ -h 192.168.57.101 -p 5432 -Xs -R -P

-D = data directory

-h = IP address of primary server

-p = Port on which primary instance is running

-Xs = WAL method - stream

-P = Progress information

--slot=SLOTNAME #optional

-R = Write configuration parameters for replication

#primary_conninfo will automatically be defined by the pg_basebackup command. Verify the same in postgresql.auto.conf

In case of PostgreSQL version 12 and above, create a blank standby.signal file in data directory location

bash-4.2$ cd /var/lib/pgsql/14/data/

bash-4.2$touch standby.signal

Define the below parameters in postgresql.conf file

restore_command = 'rsync -a postgres@192.168.57.101:/opt/pg
_archives/%f %p'

recovery_target_timeline = 'latest'

In case of PostgreSQL version 11 and below, create recovery.conf file in data directory and include following parameters:

standby_mode = on

primary_conninfo = 'host=192.168.57.101 port=5432'

restore_command = 'rsync -a postgres@192.168.57.101:/opt/pg
_archives/%f %p'

trigger_file = '/tmp/makeprimary.trigger'

recovery_target_timeline = 'latest'

Start the PostgreSQL database instance

[root@pg ~]# systemctl start postgresql-14.service

Verify the status on primary database instance

postgres=#\x

postgres=# select * from pg_stat_replication;

-[ RECORD 1 ]----+---------------------------------

pid	\|	5798
usesysid	\|	10
usename	\|	postgres
application_name	\|	walreceiver
client_addr	\|	192.168.57.102
client_hostname	\|
client_port	\|	39764
backend_start	\|	2022-01-10 14:31:31.486763+05:30
backend_xmin	\|
state	\|	streaming
sent_lsn	\|	0/3000060
write_lsn	\|	0/3000060
flush_lsn	\|	0/3000060
replay_lsn	\|	0/3000060
write_lag	\|
flush_lag	\|
replay_lag	\|
sync_priority	\|	0
sync_state	\|	async
reply_time	\|	2022-01-10 14:32:11.610892+05:30

Verify the status of the standby database

postgres=# select * from pg_stat_wal_receiver ;

-[ RECORD 1 ]----+---------------------------------

pid	\|	23992
status	\|	streaming
receive_start_lsn	\|	0/3000000
receive_start_tli	\|	1
written_lsn	\|	0/3000148
flushed_lsn	\|	0/3000148
received_tli	\|	1
last_msg_send_time	\|	2022-01-10 18:21:04.093247+05:30
last_msg_receipt_time	\|	2022-01-10 18:21:04.093403+05:30
latest_end_lsn	\|	0/3000148
latest_end_time	\|	2022-01-10 17:47:29.370201+05:30
slot_name	\|
sender_host	\|	192.168.57.101
sender_port	\|	5432
conninfo	\|	user=postgres passfile=/var/lib/pgsql/.pgpass channel_binding=prefer dbname=replication host=192.168.57.101 port=5432 fallback_application_name=walreceiver sslmode=prefer sslcompression=0 sslsni=1 ssl_min_protocol_version=TLSv1.2 gssencmode=prefer krbsrvname=postgres target_session_attrs=any

So far we have discussed the basic setup of streaming replication, however, for mission critical/enterprise level setup we need to consider the below features as well.

Setting up archive_command to include remote server locations

The archive location should be accessible from the standby even when the primary is unavailable i.e. WALs should reside on the standby server itself or another trusted server, not on the master server. For including remote server location, there is a need to set up password less authentication for continuous shipping of WALs from primary database server to the remote server location.

Setup password less authentication at postgres user level

[root@pg ~]# su - postgres

-bash-4.2$ ssh-keygen

Generating public/private rsa key pair.

Enter file in which to save the key (/var/lib/postgres/.ssh/id_rsa):

Created directory '/var/lib/postgres/.ssh'.

Enter passphrase (empty for no passphrase):

Enter same passphrase again:

Your identification has been saved in /var/lib/postgres/.ssh/id_rsa.

Your public key has been saved in /var/lib/postgres/.ssh/id_rsa.pub.

The key fingerprint is:

SHA256:82H1jinB3u+uv6VxCxdZ5RLk7pnM+LszhENdLNyIs88 postgres@pg

The key's randomart image is:

+---[RSA 2048]----+

| +o+.|

| o.+o=|

| .+ooo|

| . .oo..o|

| S =. +oo |

| = +oOEo.|

| + =+O.o|

| . o==.|

| .=@B |

+----[SHA256]-----+

-bash-4.2$ ssh-copy-id 192.168.57.102

when prompted provide the OS level postgres user password of 192.168.57.102 server.

Make the below changes in postgresql.conf to include the remote archive location

archive_command = 'rsync -a %p /opt/pg_archives/%f && rsync -a %p postgres@192.168.57.102:/opt/pg_archives/%f’

Reload the postgres instance to implement the above change

[root@pg ~]# systemctl reload postgresql-14.service

postgres=# SELECT pg_reload_conf();

Setting up automatic removal of obsolete WALs

pg_archivecleanup is used to automatically clean up WAL file archives when running as a standby server. This minimizes the number of WAL files that need to be retained, while preserving crash-restart capability. The below parameter needs to be included in the postgresql.conf file on the standby server.

archive_cleanup_command = 'pg_archivecleanup /opt/pg_archives/ %r'

Setting up Replication slots for standby database

Replication slots provide an automated way to ensure that the master does not remove WAL segments until they have been received by all standbys, and that the master does not remove rows which could cause a recovery conflict even when the standby is disconnected. However, with replication slots there is a need to be vigilant on sufficient space availability on the primary server since non availability of standby database instances would lead to accumulation of WALs on the primary server.

Creation of replication slot while executing pg_basebackup

bash-4.2$ /usr/pgsql-14/bin/pg_basebackup -D /var/lib/pgsql/14/data/ -h 192.168.57.101 -p 5432 -Xs -R -P

--slot=’streaming_slot’

Alternatively, in case we need to include replication_slot later, create the same manually on the primary server and include the parameter in recovery.conf in the standby server.

postgres=# SELECT * FROM pg_create_physical_replication_slot('streaming_slot');

slot_name | lsn

-------------+-----

streaming_slot |

To configure the standby to use this slot, primary_slot_name parameter needs to be defined on the standby database instance and needs to be restarted.

primary_slot_name = 'streaming_slot'

Setting up streaming replication in Synchronous mode

Synchronous replication offers the ability to confirm that all changes made by a transaction have been transferred to one or more synchronous standby servers. When requesting synchronous replication, each commit of a write transaction will wait until confirmation is received that the commit has been written to the write-ahead log on disk of both the primary and standby server.

Define the cluster_name of standby instance

cluster_name = 's1'

Once streaming replication has been configured, configuring synchronous replication requires setting up synchronous_standby_names to a non-empty value.

synchronous_commit = on #default

#Other values(local, remote_write, remote_apply)

synchronous_standby_names = ‘s1’

Important: if only a single sync standby instance has been configured and the same is not available due to any reason then all the transactions on primary will be stuck waiting for confirmation to commit at the standby instance. To mitigate the issue, there needs to be more than one synchronous standby. Also, Failover Manager (EFM) may be considered to automatically reduce the minimum required standby based on business requirements.

synchronous_standby_names = 'ANY 1 (s1, s2)'

The method ANY specifies a quorum-based synchronous replication and makes transaction commits wait until their WAL records are replicated to at least the requested number of synchronous standbys in the list.

synchronous_standby_names = 'FIRST 2 (s1, s2, s3)'

The method FIRST specifies a priority-based synchronous replication and makes transaction commits wait until their WAL records are replicated to the requested number of synchronous standbys chosen based on their priorities. It specifies a priority-based synchronous replication and makes transaction commits wait until their WAL records are replicated to the requested number of synchronous standbys chosen based on their priorities.

Once synchronous streaming replication has been set up, we can verify the same using the below query

postgres=# select application_name, client_addr,state, sysnc_state, reply_time from pg_stat_replication;
-[ RECORD 1 ]----+---------------------------------
application_name	s1
client_addr	192.168.57.102
state	streaming
sync_state	sync
reply_time	2022-01-17 15:21:58.457517+05:30

Standby database instances could be used for off loading SELECT queries from a primary instance which may help in load balancing and primary database has lesser load. Pgpool may also be used for load balancing purposes i.e. write queries on primary instances and Read (SELECT) queries to be routed on standby instances. However, there is a need to do necessary testing prior to deployment in the production environment.

Making the most of Streaming Replication

We hope this guide provided you with a foundation for how to set up streaming replication, and why it matters. With streaming replication in place, your PostgreSQL database can stay up-to-date and your business can make the most of your data and environment.

Resource Feature Callout 1

How to Set Up Streaming Replication to Keep Your PostgreSQL Database Performant and Up-to-Date

Manish Yadav

How streaming replication works

Setting up Streaming Replication

Steps to be performed on Primary database instance

Steps to be performed on Standby database instance

Option 1

Option 2

Setting up archive_command to include remote server locations

Setting up automatic removal of obsolete WALs

Setting up Replication slots for standby database

Setting up streaming replication in Synchronous mode

Making the most of Streaming Replication

Relevant Blogs

Next Generation Routing in EDB Postgres Distributed 5.0

How to manage Replication and Failover in Postgres Version 12 without recovery.conf file

PG Phriday: Isolating Postgres with repmgr

More Blogs

Switchover/Switchback Drill for Validating Postgres High Availability

Postgres Build 2020 Session Spotlight: True Postgres High Availability by Nibble IT

Logical Decoding of Large In-Progress Transactions in PostgreSQL