How to Set Up Streaming Replication to Keep Your PostgreSQL Database Performant and Up-to-Date

October 12, 2022

Streaming replication is a core utility of PostgreSQL introduced in version 9.0.  Streaming replication allows a standby server to stay more up-to-date than is possible with file-based log shipping. The standby connects to the primary, which streams WAL records to the standby as they're generated, without waiting for the WAL file to be filled. PostgreSQL supports two modes of streaming replication: asynchronous and synchronous mode.

In this blog, we’ll discuss how to set up streaming replication, while highlighting some key capabilities that will make it easier to do so at an enterprise level, efficiently.

 

How streaming replication works

When we start the standby instance, it begins by restoring all WAL available in the archive location, calling restore_command, if configured in the standby’s recovery configuration. Once it reaches the end of WAL available there and restore_command fails, the standby tries to restore any WAL available in the pg_wal directory. If that fails, and streaming replication has been configured, the standby tries to connect to the primary server and start streaming WAL from the last valid record found in archive or pg_wal.

Streaming replication requires that the Operating System and PostgreSQL/EPAS (EDB Postgres Advanced Server) versions should be the same across both primary and standby servers. There are a few changes in approach since PostgreSQL version 12 onwards.  

To demonstrate the setting up of streaming replication in asynchronous mode(default), we are using the below environment:

Operating System

CentOS Linux release 7.9

PostgreSQL version

PostgreSQL 14.1

 

IP Address

Port

Primary Server

192.168.57.101

5432

Standby Server

192.168.57.102

5432

 

Setting up Streaming Replication

Steps to be performed on Primary database instance

1: Review the below parameter settings 

(a) Changes required in postgresql.conf

listen_addresses = '*'

archive_mode = on

max_wal_senders = 5 

max_wal_size = 10GB    

wal_level = replica

hot_standby = on   

archive_command = 'rsync -a %p /opt/pg_archives/%f'


You need to create /opt/pg_archives folder and set ownership to PostgreSQL superuser  i.e. postgres in this case.

(b) Changes required in pg_hba.conf
Set up authentication on the primary server to allow replication connections from the standby server(s).
  

host    replication all   192.168.57.102/32   trust

2: Reload/restart the PostgreSQL database instance 

[root@pg ~]# systemctl restart postgresql-14.service

OR

[root@pg ~]# su postgres

bash-4.2$ /usr/pgsql-14/bin/pg_ctl -D /var/lib/pgsql/14/data/ restart -mf


Steps to be performed on Standby database instance

Follow any one option from the below methods to create an environment for setting up standby database instance. 

Option 1

Create a new data directory and set up the ownership and permissions.  In this case, we need to define Postgres Service manually.

[root@pg ~]mkdir -p /var/lib/pgsql/14/data/

[root@pg ~]chown -R postgres: /var/lib/pgsql/14/data/

[root@pg ~]chmod 700 /var/lib/pgsql/14/data/

Change data_directory location in /lib/systemd/system/postgresql-14.service file

Environment=PGDATA=/var/lib/pgsql/14/data/

Execute below command to implement the changes

[root@pg ~]# systemctl daemon-reload

 

Option 2

Initialize the database instance and remove the entire content from the data directory

[root@pg ~]# su postgres

bash-4.2$ /usr/pgsql-14/bin/initdb -D /var/lib/pgsql/14/data/

bash-4.2$ cd /var/lib/pgsql/14/data/

bash-4.2$ ls

base pg_commit_ts  pg_hba.conf    pg_logical    pg_notify pg_serial     pg_stat   pg_subtrans pg_twophase  pg_wal   postgresql.auto.conf global pg_dynshmem   pg_ident.conf  pg_multixact  pg_replslot pg_snapshots  pg_stat_tmp  pg_tblspc PG_VERSION   pg_xact  postgresql.conf

bash-4.2$ rm -rf *


After creation of requisite data directory, initiate taking backup using pg_basebackup command (to be executed on the standby server) 

bash-4.2$ /usr/pgsql-14/bin/pg_basebackup -D /var/lib/pgsql/14/data/ -h 192.168.57.101 -p 5432 -Xs -R -P

-D = data directory

-h  = IP address of primary server

-p = Port on which primary instance is running

-Xs = WAL method - stream

-P = Progress information

--slot=SLOTNAME  #optional

-R = Write configuration parameters for replication

#primary_conninfo will automatically be defined by the pg_basebackup command.  Verify the same in postgresql.auto.conf


In case of PostgreSQL version 12 and above, create a blank standby.signal file in data directory location

bash-4.2$ cd /var/lib/pgsql/14/data/

bash-4.2$touch standby.signal

Define the below parameters  in postgresql.conf file

restore_command = 'rsync -a  postgres@192.168.57.101:/opt/pg
_archives/%f %p' 

recovery_target_timeline = 'latest'

In case of PostgreSQL version 11 and below, create recovery.conf file in data directory and include following parameters:

standby_mode = on

primary_conninfo = 'host=192.168.57.101 port=5432' 

restore_command = 'rsync -a  postgres@192.168.57.101:/opt/pg
_archives/%f %p' 

trigger_file = '/tmp/makeprimary.trigger'

recovery_target_timeline = 'latest'


Start the PostgreSQL database instance 


[root@pg ~]# systemctl start postgresql-14.service


Verify the status on primary database instance

 

postgres=#\x

postgres=# select * from pg_stat_replication;

-[ RECORD 1 ]----+---------------------------------

pid              

|

5798

usesysid         

|

10

usename          

|

postgres

application_name 

|

walreceiver

client_addr      

|

192.168.57.102

client_hostname  

|

 

client_port      

|

39764

backend_start    

|

2022-01-10 14:31:31.486763+05:30

backend_xmin     

|

 

state            

|

streaming

sent_lsn         

|

0/3000060

write_lsn        

|

0/3000060

flush_lsn        

|

0/3000060

replay_lsn       

|

0/3000060

write_lag        

|

 

flush_lag        

|

 

replay_lag       

|

 

sync_priority    

|

0

sync_state       

|

async

reply_time       

|

2022-01-10 14:32:11.610892+05:30

 

Verify the status of the standby database

 

postgres=# select * from pg_stat_wal_receiver ;

-[ RECORD 1 ]----+---------------------------------

pid       

|

23992

status    

|

streaming

receive_start_lsn     

|

0/3000000

receive_start_tli     

|

1

written_lsn           

|

0/3000148

flushed_lsn           

|

0/3000148

received_tli          

|

1

last_msg_send_time    

|

2022-01-10 18:21:04.093247+05:30

last_msg_receipt_time 

|

2022-01-10 18:21:04.093403+05:30

latest_end_lsn        

|

0/3000148

latest_end_time       

|

2022-01-10 17:47:29.370201+05:30

slot_name          

    

|

 

sender_host           

|

192.168.57.101

sender_port           

|

5432

conninfo           

    

|

user=postgres passfile=/var/lib/pgsql/.pgpass channel_binding=prefer dbname=replication host=192.168.57.101 port=5432 fallback_application_name=walreceiver sslmode=prefer sslcompression=0 sslsni=1 ssl_min_protocol_version=TLSv1.2 gssencmode=prefer krbsrvname=postgres target_session_attrs=any

 

So far we have discussed the basic setup of streaming replication, however, for mission critical/enterprise level setup we need to consider the below features as well.

 

Setting up archive_command to include remote server locations

The archive location should be accessible from the standby even when the primary is unavailable i.e. WALs should reside on the standby server itself or another trusted server, not on the master server.  For including remote server location, there is a need to set up password less authentication for continuous shipping of WALs from primary database server to the remote server location.

Setup password less authentication at postgres user level

[root@pg ~]# su - postgres

-bash-4.2$ ssh-keygen 

Generating public/private rsa key pair.

Enter file in which to save the key (/var/lib/postgres/.ssh/id_rsa): 

Created directory '/var/lib/postgres/.ssh'.

Enter passphrase (empty for no passphrase): 

Enter same passphrase again: 

Your identification has been saved in /var/lib/postgres/.ssh/id_rsa.

Your public key has been saved in /var/lib/postgres/.ssh/id_rsa.pub.

The key fingerprint is:

SHA256:82H1jinB3u+uv6VxCxdZ5RLk7pnM+LszhENdLNyIs88 postgres@pg

The key's randomart image is:

+---[RSA 2048]----+

|             +o+.|

|            o.+o=|

|            .+ooo|

|         . .oo..o|

|        S =. +oo |

|         = +oOEo.|

|          + =+O.o|

|           . o==.|

|            .=@B |

+----[SHA256]-----+

 

-bash-4.2$ ssh-copy-id 192.168.57.102

when prompted provide the OS level postgres user password of 192.168.57.102 server.

Make the below changes in postgresql.conf to include the remote archive location

archive_command = 'rsync -a %p /opt/pg_archives/%f && rsync -a %p postgres@192.168.57.102:/opt/pg_archives/%f’

Reload the postgres instance to implement the above change

[root@pg ~]# systemctl reload postgresql-14.service

OR

postgres=# SELECT pg_reload_conf();

 

Setting up automatic removal of obsolete WALs

pg_archivecleanup is used to automatically clean up WAL file archives when running as a standby server. This minimizes the number of WAL files that need to be retained, while preserving crash-restart capability.  The below parameter needs to be included in the postgresql.conf file on the standby server.

archive_cleanup_command = 'pg_archivecleanup /opt/pg_archives/ %r'


Setting up Replication slots for standby database

Replication slots provide an automated way to ensure that the master does not remove WAL segments until they have been received by all standbys, and that the master does not remove rows which could cause a recovery conflict even when the standby is disconnected.  However, with replication slots there is a need to be vigilant on sufficient space availability on the primary server since non availability of standby database instances would lead to accumulation of WALs on the primary server.

Creation of replication slot while executing pg_basebackup 

bash-4.2$ /usr/pgsql-14/bin/pg_basebackup -D /var/lib/pgsql/14/data/ -h 192.168.57.101 -p 5432 -Xs -R -P

--slot=’streaming_slot’

Alternatively, in case we need to include replication_slot later, create the same manually on the primary server and include the parameter in recovery.conf in the standby server.

postgres=# SELECT * FROM pg_create_physical_replication_slot('streaming_slot');

  slot_name  | lsn

-------------+-----

 streaming_slot |

To configure the standby to use this slot, primary_slot_name parameter needs to be defined on the standby database instance and needs to be restarted. 

primary_slot_name = 'streaming_slot'

 

Setting up streaming replication in Synchronous mode 

Synchronous replication offers the ability to confirm that all changes made by a transaction have been transferred to one or more synchronous standby servers.  When requesting synchronous replication, each commit of a write transaction will wait until confirmation is received that the commit has been written to the write-ahead log on disk of both the primary and standby server.  

Define the cluster_name of standby instance

 cluster_name = 's1'

Once streaming replication has been configured, configuring synchronous replication requires setting up synchronous_standby_names to a non-empty value.

synchronous_commit = on   #default 

#Other values(local, remote_write, remote_apply)

synchronous_standby_names = ‘s1’

Important:  if only a single sync standby instance has been configured and the same is not available due to any reason then all the transactions on primary will be stuck waiting for confirmation to commit at the standby instance. To mitigate the issue, there needs to be more than one synchronous standby. Also, Failover Manager (EFM) may be considered to automatically reduce the minimum required standby based on business requirements.

synchronous_standby_names = 'ANY 1 (s1, s2)'

The method ANY specifies a quorum-based synchronous replication and makes transaction commits wait until their WAL records are replicated to at least the requested number of synchronous standbys in the list.

OR

synchronous_standby_names = 'FIRST 2 (s1, s2, s3)'

The method FIRST specifies a priority-based synchronous replication and makes transaction commits wait until their WAL records are replicated to the requested number of synchronous standbys chosen based on their priorities. It specifies a priority-based synchronous replication and makes transaction commits wait until their WAL records are replicated to the requested number of synchronous standbys chosen based on their priorities. 

Once synchronous streaming replication has been set up, we can verify the same using the below query

postgres=# select application_name, client_addr,state, sysnc_state, reply_time from pg_stat_replication;

-[ RECORD 1 ]----+---------------------------------

application_name 

s1

client_addr      

192.168.57.102

state            

streaming

sync_state       

sync

reply_time       

2022-01-17 15:21:58.457517+05:30

 

Standby database instances could be used for off loading SELECT queries  from a primary instance which may help in load balancing and primary database has lesser load.  Pgpool may also be used for load balancing purposes i.e. write queries on primary instances and Read (SELECT) queries to be routed on standby instances. However, there is a need to do necessary testing prior to deployment in the production environment.

 

Making the most of Streaming Replication

We hope this guide provided you with a foundation for how to set up streaming replication, and why it matters. With streaming replication in place, your PostgreSQL database can stay up-to-date and your business can make the most of your data and environment.
 

 

Share this

Relevant Blogs

PG Phriday: Isolating Postgres with repmgr

In my upcoming talk on The Do’s and Don’ts of Postgres High Availability at Postgres Build 2021, I bring up two extremely critical topics regarding Consensus and Fencing. The...
November 19, 2021

More Blogs