PostgreSQL Replication and Automatic Failover Tutorial

January 24, 2023

Table of Contents

1. What Is PostgreSQL Replication? 

2. What Is Automatic Failover in PostgreSQL? 

3. High Availability and Failover Replication 

4. Why Use PostgreSQL Replication? 

5. What Are the Models of PostgreSQL Database Replication (Single-Master & Multi-Master)? 

6. What Are the Classes of PostgreSQL Replication? 

7. What Are the Replication Modes in PostgreSQL Database? 

8. What Are the Types of PostgreSQL Database Replication? 

9. Introduction to Write-Ahead Logging (WAL) in PostgreSQL Database 

       9.1 What is Write-Ahead Logging (WAL) in PostgreSQL and Why Is It Required? 

10. What Are the Transaction Log and WAL Segment Files in PostgreSQL? 

11. What Is the WAL Writer in PostgreSQL? 

12. WAL Segment File Management 

      12.1 Where are WAL segment files stored? 

      12.2 What are the conditions under which PostgreSQL switches to a new WAL segment file? 

13. Write-Ahead Logging (WAL) Example for PostgreSQL 

14. What Are The Replication Options Based on WAL in PostgreSQL? 

      14.1 Continuous WAL Archiving 

      14.2 Log Shipping Based Replication — File Level

      14.3 Log Shipping Based Replication — Block Level

      14.4 How long should the primary server retain WAL segment files? 

15. Log Shipping Based Replication 

      15.1 Physical Streaming Replication 

      15.2 What are the advantages of physical streaming replication? 

16. What Are WAL Sender and WAL Receiver in PostgreSQL? 

      16.1 WAL Streaming Protocol Details 

17. PostgreSQL Replication and Failover Setup 

      17.1 Configuring PostgreSQL Replication using WAL Streaming 

18. What Are the PostgreSQL Manual Failover Steps?

19. How to Automate Failover and Replication in PostgreSQL 

20. repmgr for PostgreSQL

 

1. What Is PostgreSQL Replication?

The process of copying data from a PostgreSQL database server to another server is called PostgreSQL Replication. The source database server is usually called the Master server, whereas the database server receiving the copied data is called the Replica server.

PostgreSQL Replication

2. What Is Automatic Failover in PostgreSQL?

Once physical streaming replication has been set up and configured in PostgreSQL, failover can take place if the primary database server fails. Failover is the term for the recovery process, which in PostgreSQL can take some time, particularly because PostgreSQL itself does not provide built-in tools for detecting server failures. Fortunately, there are tools available that provide Automatic Failover, which can detect failures and automatically switch to the standby, minimizing database downtime. 

EnterpriseDB’s EDB Postgres Failover Manager lets you automatically detect database failures and promote the most current standby server as the new master, helping to avoid costly database downtime. EDB Failover Manager also provides fast, automatic failure detection.

 

3. High Availability and Failover Replication

High Availability refers to database systems that are set up so that standby servers can take over quickly when the master or primary server fails. To achieve high availability, a database system should meet some key requirements: it should have redundancy to prevent single points of failure, reliable switchover mechanisms, and active monitoring to detect any failures that may occur.  Setting up failover replication provides the needed redundancy to allow for high availability by ensuring that standbys are available if the master or primary server ever goes down. 

 

4. Why Use PostgreSQL Replication?

Replication of data can have many uses:

  • OLTP Performance
  • Fault Tolerance
  • Data Migration
  • Testing Systems in Parallel

OLTP performance: Removing reporting query load from the online transaction processing (OLTP) system improves both reporting query time and transaction processing performance.

Fault tolerance: In the event of master database server failure, the replica server can take over, since it already contains the master server’s data. In this configuration the replica server is also called the standby server. This configuration can also be used for regular maintenance of the primary server.

Data migration: To upgrade database server hardware, or to deploy the same system for another customer.

Testing systems in parallel: When porting the application from one DBMS to another, the results on the same data from both the old and new systems must be compared to ensure that the new system works as expected.

EnterpriseDB’s EDB Postgres Replication Server makes managing Postgres replication easy, offering all the benefits replication provides.

 

5. What Are the Models of PostgreSQL Database Replication (Single-Master & Multi-Master)?

  • Single-Master Replication (SMR)
  • Multi-Master Replication (MMR)

In Single-Master Replication (SMR), changes to table rows in a designated master database server are replicated to one or more replica servers. The replicated tables in the replica database are not permitted to accept changes (except those coming from the master), and even if they do, those changes are not replicated back to the master server.

In Multi-Master Replication (MMR), changes to table rows in more than one designated master database are replicated to their counterpart tables in every other master database. In this model conflict resolution schemes are often employed to avoid problems like duplicate primary keys.

PostgreSQL Database Replication

 

Multi-Master Replication (MMR) adds to the uses of replication:

  • Write availability and scalability.
  • Ability to employ a wide area network (WAN) of master databases that can be geographically close to groups of clients, yet maintain data consistency across the network.

 

6. What Are the Classes of PostgreSQL Replication?

  • Unidirectional Replication
  • Bidirectional Replication

Single-Master Replication is also called unidirectional, since replication data flows in one direction only, from master to replica. 

Multi-Master Replication data, on the other hand, flows in both directions, and it is therefore called bidirectional replication.

 

7. What Are the Replication Modes in PostgreSQL Database?

  • Asynchronous Mode of Replication
  • Synchronous Mode of Replication

In synchronous mode replication, transactions on the master database are declared complete only when those changes have been replicated to all the replicas. The replica servers must all be available all the time for the transactions to complete on the master.

Synchronous mode PostgreSQL replication

 

In asynchronous mode, transactions on the master server can be declared complete as soon as the changes have been made on the master server alone. These changes are then replicated to the replicas later. The replica servers can remain out of sync for a certain duration, which is called replication lag.

Replication Modes in PostgreSQL Database

 

Synchronous and asynchronous modes both have their costs and benefits, and users will want to consider safety and performance when configuring their replication settings.
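For illustration, here is a minimal, hedged sketch of the postgresql.conf settings on the primary that control this behavior; the standby name 's1' is an assumption, and leaving synchronous_standby_names empty falls back to asynchronous replication:

# postgresql.conf on the primary (illustrative sketch)

synchronous_commit = on              # wait for the standby before reporting COMMIT

synchronous_standby_names = 's1'     # 's1' is an assumed standby application_name

# synchronous_standby_names = ''     # leaving this empty makes replication asynchronous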

 

8. What Are the Types of PostgreSQL Database Replication?

  • Physical Replication of PostgreSQL Database
  • Logical Replication of PostgreSQL Database

Before we discuss physical and logical replication types, let’s review the context of the terms “physical” and “logical” here.

                

    | Logical Operation | Physical Operation
----+-------------------+------------------------------------------------------------------------------------------------------------
  1 | initdb            | Creates a base directory for the cluster
  2 | CREATE DATABASE   | Creates a subdirectory in the base directory
  3 | CREATE TABLE      | Creates a file within the subdirectory of the database
  4 | INSERT            | Changes the file that was created for this particular table and writes new WAL records into the current WAL segment

 

For example:

ramp=# create table sample_tbl(a int, b varchar(255));

CREATE TABLE

ramp=# SELECT pg_relation_filepath('sample_tbl');

 pg_relation_filepath 

----------------------

 base/34740/706736

(1 row)

ramp=# SELECT datname, oid FROM pg_database WHERE datname = 'ramp';

 datname |  oid  

---------+-------

 ramp    | 34740

(1 row)

ramp=# SELECT relname, oid FROM pg_class WHERE relname = 'sample_tbl';

  relname   |  oid   

------------+--------

 sample_tbl | 706736

(1 row)

 

Physical replication deals with files and directories. It has no knowledge of what these files and directories represent. Physical replication is done at file system level or disk level.

Logical replication, on the other hand, deals with databases, tables, and DML operations. It is therefore possible in logical replication to replicate only a certain set of tables. Logical replication operates at the level of databases and tables within the cluster rather than at the file-system level.
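As a brief, hedged illustration of the logical level, PostgreSQL 10 and later include built-in logical replication based on publications and subscriptions. The publication, subscription, and connection values below are assumptions, and the publisher must run with wal_level = logical:

-- On the publisher (source) database:

CREATE PUBLICATION my_pub FOR TABLE sample_tbl;

-- On the subscriber, which must already contain a matching table definition:

CREATE SUBSCRIPTION my_sub
    CONNECTION 'host=172.16.214.167 port=5432 dbname=ramp user=postgres'
    PUBLICATION my_pub;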

 

9. Introduction to Write-Ahead Logging (WAL) in PostgreSQL Database

9.1 What is Write-Ahead Logging (WAL) in PostgreSQL and Why Is It Required?

In PostgreSQL, all changes made by a transaction are first saved in a log file, and then the result of the transaction is sent to the initiating client. Data files themselves are not changed on every transaction. This is a standard mechanism to prevent data loss in case of circumstances like operating system crash, hardware failure, or PostgreSQL crash. This mechanism is called Write Ahead Logging (WAL), and the log file is called Write Ahead Log.

Each change that the transaction performs (INSERT, UPDATE, DELETE, COMMIT) is written in the log as a WAL record. WAL records are first written into an in-memory WAL buffer. When the transaction is committed the records are written into a WAL segment file on the disk.

A WAL record’s Log Sequence Number (LSN) represents the location/position where the record is saved in the log file. LSN is used as a unique id of the WAL record. Logically, a transaction log is a file whose size is 2^64 bytes. LSN is therefore a 64-bit number represented as two 32-bit hexadecimal numbers separated by a /. For example:

select pg_current_wal_lsn();

 pg_current_wal_lsn 

--------------------

 0/2BDBBD0

(1 row)

 

If there is a system crash, the database can recover committed transactions from the WAL. The recovery starts from the last REDO point, i.e., the last checkpoint. A checkpoint is a point in the transaction log at which all data files have been updated to match the information in the log. The process of flushing all the changes described by WAL records up to that point from the shared buffers to the actual data files is called checkpointing.
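As a quick check, the location of the latest checkpoint and its REDO point can be read from the cluster's control data; a minimal example (output values will differ per cluster):

-- Show the latest checkpoint location and the REDO point it recorded

SELECT checkpoint_lsn, redo_lsn, timeline_id FROM pg_control_checkpoint();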

Let’s consider a case where the database crashes after two transactions that perform one insert each and WAL is used for recovery.

  1. Assume a checkpoint has been issued, which stores the location of the latest REDO point in the current WAL segment. This also flushes all dirty pages in the shared buffer pool to the disk. This action guarantees that WAL records before the REDO point are no longer needed for recovery, since all data has been flushed to the disk pages.
  2. The first INSERT statement is issued. The table’s page is loaded from disk to the buffer pool.
  3. A tuple is inserted into the loaded page.
  4. The WAL record of this insert is saved into the WAL buffer at location LSN_1.
  5. The page’s LSN is updated from LSN_0 to LSN_1, which identifies the WAL record for the last change to this page.
  6. The first COMMIT statement is issued.
  7. The WAL record of this commit action is written into the WAL buffer, and then all WAL records in the WAL buffer up to and including this record are flushed to the WAL segment file.
  8. For the second INSERT and commit, steps 2 to 7 are repeated.

WAL in PostgreSQL

 

If there is an operating system crash, all the data in the shared buffer pool will be lost. However, all modifications to the pages have already been written into the WAL segment files as history data. The following steps show how our database cluster can be restored to the state immediately before the crash using WAL records. Nothing special needs to be done; PostgreSQL will automatically enter recovery mode after restarting.

  1. PostgreSQL reads the WAL record of the first INSERT statement from the appropriate WAL segment file.
  2. PostgreSQL loads the table's page from the database cluster into the shared buffer pool.
  3. PostgreSQL compares the WAL record's LSN (LSN_1) with the page LSN (LSN_0). Since LSN_1 is greater than LSN_0, the tuple in the WAL record is inserted into the page and the page's LSN is updated to LSN_1.

The remaining WAL records are replayed in a similar manner.

WAL Records in PostgreSQL

 

10. What Are the Transaction Log and WAL Segment Files in PostgreSQL?

The PostgreSQL transaction log is a virtual file whose logical capacity is 2^64 bytes, addressed by 8-byte (64-bit) LSNs. Physically, the log is divided into 16 MB files (by default), each of which is called a WAL segment.

The WAL segment file name is a 24-character hexadecimal number whose naming rule is as follows:

WAL Segment File in PostgreSQL

 

Assuming that the current timeline ID is 0x00000001, the first WAL segment file names will be:

00000001 00000000 00000000

00000001 00000000 00000001

00000001 00000000 00000002

...

00000001 00000001 00000000

00000001 00000001 00000001

00000001 00000001 00000002

...

00000001 FFFFFFFF FFFFFFFD

00000001 FFFFFFFF FFFFFFFE

00000001 FFFFFFFF FFFFFFFF

 

For example:

select pg_walfile_name('0/2BDBBD0');

     pg_walfile_name      

--------------------------

 000000010000000000000002

 

11. What Is the WAL Writer in PostgreSQL?

The WAL writer is a background process that periodically checks the WAL buffer and writes any unwritten WAL records to the WAL segment files. The WAL writer avoids bursts of I/O activity by spreading its writes over time. The configuration parameter wal_writer_delay controls how often the WAL writer flushes the WAL; the default value is 200 ms.
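You can inspect this setting from psql; for example (assuming the default value):

SHOW wal_writer_delay;

 wal_writer_delay 

------------------

 200ms

(1 row)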

 

12. WAL Segment File Management

12.1 Where are WAL segment files stored?

WAL segment files are stored in the pg_wal subdirectory of the data directory (called pg_xlog in PostgreSQL versions before 10). 
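For example, the segment files can be listed either from the shell or, on PostgreSQL 10 and later, from psql:

# From the shell (assuming $PGDATA points at the data directory):

ls $PGDATA/pg_wal

-- From psql:

SELECT name, size FROM pg_ls_waldir() ORDER BY name LIMIT 3;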

 

12.2 What are the conditions under which PostgreSQL switches to a new WAL segment file?

PostgreSQL switches to a new WAL segment file under the following conditions:

  • The WAL segment has been filled up.
  • The function pg_switch_wal() has been called (see the example after this list).
  • archive_mode is enabled and the time set in archive_timeout has been exceeded. 
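For example, a manual switch can be forced and verified from psql (the returned LSN and file name will differ):

SELECT pg_switch_wal();

SELECT pg_walfile_name(pg_current_wal_lsn());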

After they are switched out, WAL files can either be removed or recycled, i.e., renamed and reused later. The number of WAL files that the server retains at any point in time depends on server configuration and server activity.

Whenever a checkpoint starts, PostgreSQL estimates and prepares the number of WAL segment files required for the upcoming checkpoint cycle. The estimate is based on the number of files consumed in previous checkpoint cycles; the files are counted from the segment that contains the prior REDO point, and the value is kept between min_wal_size (by default 80 MB, i.e., 5 files) and max_wal_size (by default 1 GB, i.e., 64 files). When a checkpoint starts, the files that are still needed are retained and recycled, while the unnecessary ones are removed.

An example is provided in the diagram below. Assume that there are six files before checkpoint starts, the previous REDO point is contained in the file WAL_3, and PostgreSQL estimates that five files are to be retained. In this case, WAL_1 will be renamed as WAL_7 for recycling, and WAL_2 will be removed.

WAL Segment File Management in PostgreSQL

 

13. Write-Ahead Logging (WAL) Example for PostgreSQL

Step 1:

SELECT datname, oid FROM pg_database WHERE datname = 'postgres';

datname  |  oid  

----------+-------

postgres | 15709

Note the database OID, i.e. 15709

Step 2:

SELECT oid,* from pg_tablespace;

oid  |  spcname   | spcowner | spcacl | spcoptions 

------+------------+----------+--------+------------

1663 | pg_default |       10 |        | 

1664 | pg_global  |       10 |        | 

Note the table space OID, i.e. 1663

Step 3:

SELECT pg_current_wal_lsn();

pg_current_wal_lsn 

--------------------

0/1C420B8

Note the LSN, i.e. 0/1C420B8

Step 4:

CREATE TABLE abc(a VARCHAR(10));

Step 5:

SELECT pg_relation_filepath('abc');

pg_relation_filepath 

----------------------

base/15709/16384

Note the relation filename, base/15709/16384

Step 6:

View the WAL contents using: ./pg_waldump --path=/tmp/sd/pg_wal --start=0/1C420B8

Using the start LSN noted in step 3.

Note that the WAL contains the instruction to create the physical file:

15709 → database postgres → noted in step 1

16384 → table abc → noted in step 5

rmgr    | len (rec/tot) | tx   | lsn        | prev       | desc
--------+---------------+------+------------+------------+----------------------------------------------------------------------
XLOG    | 30/30         | 0    | 0/01C420B8 | 0/01C42080 | NEXTOID 24576
Storage | 42/42         | 0    | 0/01C420D8 | 0/01C420B8 | CREATE base/15709/16384
Heap    | 203/203       | 1216 | 0/01C42108 | 0/01C420D8 | INSERT off 2, blkref #0: rel 1663/15709/1247 blk 0
Btree   | 64/64         | 1216 | 0/01C421D8 | 0/01C42108 | INSERT_LEAF off 298, blkref #0: rel 1663/15709/2703 blk 2
Btree   | 64/64         | 1216 | 0/01C42218 | 0/01C421D8 | INSERT_LEAF off 7, blkref #0: rel 1663/15709/2704 blk 5
Heap    | 80/80         | 1216 | 0/01C42258 | 0/01C42218 | INSERT off 30, blkref #0: rel 1663/15709/2608 blk 9
Btree   | 72/72         | 1216 | 0/01C422A8 | 0/01C42258 | INSERT_LEAF off 243, blkref #0: rel 1663/15709/2673 blk 51
Btree   | 72/72         | 1216 | 0/01C422F0 | 0/01C422A8 | INSERT_LEAF off 170, blkref #0: rel 1663/15709/2674 blk 61
Heap    | 203/203       | 1216 | 0/01C42338 | 0/01C422F0 | INSERT off 6, blkref #0: rel 1663/15709/1247 blk 1
Btree   | 64/64         | 1216 | 0/01C42408 | 0/01C42338 | INSERT_LEAF off 298, blkref #0: rel 1663/15709/2703 blk 2
Btree   | 72/72         | 1216 | 0/01C42448 | 0/01C42408 | INSERT_LEAF off 3, blkref #0: rel 1663/15709/2704 blk 1
Heap    | 80/80         | 1216 | 0/01C42490 | 0/01C42448 | INSERT off 36, blkref #0: rel 1663/15709/2608 blk 9
Btree   | 72/72         | 1216 | 0/01C424E0 | 0/01C42490 | INSERT_LEAF off 243, blkref #0: rel 1663/15709/2673 blk 51
Btree   | 72/72         | 1216 | 0/01C42528 | 0/01C424E0 | INSERT_LEAF off 97, blkref #0: rel 1663/15709/2674 blk 57
Heap    | 199/199       | 1216 | 0/01C42570 | 0/01C42528 | INSERT off 2, blkref #0: rel 1663/15709/1259 blk 0
Btree   | 64/64         | 1216 | 0/01C42638 | 0/01C42570 | INSERT_LEAF off 257, blkref #0: rel 1663/15709/2662 blk 2
Btree   | 64/64         | 1216 | 0/01C42678 | 0/01C42638 | INSERT_LEAF off 8, blkref #0: rel 1663/15709/2663 blk 1
Btree   | 64/64         | 1216 | 0/01C426B8 | 0/01C42678 | INSERT_LEAF off 217, blkref #0: rel 1663/15709/3455 blk 5
Heap    | 171/171       | 1216 | 0/01C426F8 | 0/01C426B8 | INSERT off 53, blkref #0: rel 1663/15709/1249 blk 16
Btree   | 64/64         | 1216 | 0/01C427A8 | 0/01C426F8 | INSERT_LEAF off 185, blkref #0: rel 1663/15709/2658 blk 25
Btree   | 64/64         | 1216 | 0/01C427E8 | 0/01C427A8 | INSERT_LEAF off 194, blkref #0: rel 1663/15709/2659 blk 16
Heap    | 171/171       | 1216 | 0/01C42828 | 0/01C427E8 | INSERT off 54, blkref #0: rel 1663/15709/1249 blk 16
Btree   | 72/72         | 1216 | 0/01C428D8 | 0/01C42828 | INSERT_LEAF off 186, blkref #0: rel 1663/15709/2658 blk 25
Btree   | 64/64         | 1216 | 0/01C42920 | 0/01C428D8 | INSERT_LEAF off 194, blkref #0: rel 1663/15709/2659 blk 16
Heap    | 171/171       | 1216 | 0/01C42960 | 0/01C42920 | INSERT off 55, blkref #0: rel 1663/15709/1249 blk 16
Btree   | 72/72         | 1216 | 0/01C42A10 | 0/01C42960 | INSERT_LEAF off 187, blkref #0: rel 1663/15709/2658 blk 25
Btree   | 64/64         | 1216 | 0/01C42A58 | 0/01C42A10 | INSERT_LEAF off 194, blkref #0: rel 1663/15709/2659 blk 16
Heap    | 171/171       | 1216 | 0/01C42A98 | 0/01C42A58 | INSERT off 1, blkref #0: rel 1663/15709/1249 blk 17
Btree   | 72/72         | 1216 | 0/01C42B48 | 0/01C42A98 | INSERT_LEAF off 186, blkref #0: rel 1663/15709/2658 blk 25
Btree   | 64/64         | 1216 | 0/01C42B90 | 0/01C42B48 | INSERT_LEAF off 194, blkref #0: rel 1663/15709/2659 blk 16
Heap    | 171/171       | 1216 | 0/01C42BD0 | 0/01C42B90 | INSERT off 3, blkref #0: rel 1663/15709/1249 blk 17
Btree   | 72/72         | 1216 | 0/01C42C80 | 0/01C42BD0 | INSERT_LEAF off 188, blkref #0: rel 1663/15709/2658 blk 25
Btree   | 64/64         | 1216 | 0/01C42CC8 | 0/01C42C80 | INSERT_LEAF off 194, blkref #0: rel 1663/15709/2659 blk 16
Heap    | 171/171       | 1216 | 0/01C42D08 | 0/01C42CC8 | INSERT off 5, blkref #0: rel 1663/15709/1249 blk 17
Btree   | 72/72         | 1216 | 0/01C42DB8 | 0/01C42D08 | INSERT_LEAF off 186, blkref #0: rel 1663/15709/2658 blk 25
Btree   | 64/64         | 1216 | 0/01C42E00 | 0/01C42DB8 | INSERT_LEAF off 194, blkref #0: rel 1663/15709/2659 blk 16
Heap    | 171/171       | 1216 | 0/01C42E40 | 0/01C42E00 | INSERT off 30, blkref #0: rel 1663/15709/1249 blk 32
Btree   | 72/72         | 1216 | 0/01C42EF0 | 0/01C42E40 | INSERT_LEAF off 189, blkref #0: rel 1663/15709/2658 blk 25
Btree   | 64/64         | 1216 | 0/01C42F38 | 0/01C42EF0 | INSERT_LEAF off 194, blkref #0: rel 1663/15709/2659 blk 16
Heap    | 80/80         | 1216 | 0/01C42F78 | 0/01C42F38 | INSERT off 25, blkref #0: rel 1663/15709/2608 blk 11
Btree   | 72/72         | 1216 | 0/01C42FC8 | 0/01C42F78 | INSERT_LEAF off 131, blkref #0: rel 1663/15709/2673 blk 44
Btree   | 72/72         | 1216 | 0/01C43010 | 0/01C42FC8 | INSERT_LEAF off 66, blkref #0: rel 1663/15709/2674 blk 46
Standby | 42/42         | 1216 | 0/01C43058 | 0/01C43010 | LOCK xid 1216 db 15709 rel 16384
Txn     | 405/405       | 1216 | 0/01C43088 | 0/01C43058 | COMMIT 2019-03-04 07:42:23.165514 EST;... snapshot 2608 relcache 16384
Standby | 50/50         | 0    | 0/01C43220 | 0/01C43088 | RUNNING_XACTS nextXid 1217 latestCompletedXid 1216 oldestRunningXid 1217

Step 7:

SELECT pg_current_wal_lsn();

 pg_current_wal_lsn 

--------------------

 0/1C43258

(1 row)

Step 8:

INSERT INTO abc VALUES('pkn');

Step 9:

./pg_waldump --path=/tmp/sd/pg_wal --start=0/1C43258

Using the start LSN from step 7.

1663 → pg_default tablespace → noted in step 2

15709 → database postgres → noted in step 1

16384 → table abc → noted in step 5

rmgr        | len (rec/tot) | tx   | lsn        | prev       | desc
------------+---------------+------+------------+------------+------------------------------------------------------------------------------
Heap        | 59/59         | 1217 | 0/01C43258 | 0/01C43220 | INSERT+INIT off 1, blkref #0: rel 1663/15709/16384 blk 0
Transaction | 34/34         | 1217 | 0/01C43298 | 0/01C43258 | COMMIT 2019-03-04 07:43:45.887511 EST
Standby     | 54/54         | 0    | 0/01C432C0 | 0/01C43298 | RUNNING_XACTS nextXid 1218 latestCompletedXid 1216 oldestRunningXid 1217; 1 xacts: 1217

Step 10:

SELECT pg_current_wal_lsn();

 pg_current_wal_lsn 

--------------------

 0/1C432F8

(1 row)

Step 11:

INSERT INTO abc VALUES('ujy');

Step 12:

./pg_waldump --path=/tmp/sd/pg_wal --start=0/1C432F8

Using the start LSN from step 10.

rmgr        | len (rec/tot) | tx   | lsn        | prev       | desc
------------+---------------+------+------------+------------+------------------------------------------------------------------------------
Heap        | 59/59         | 1218 | 0/01C432F8 | 0/01C432C0 | INSERT off 2, blkref #0: rel 1663/15709/16384 blk 0
Transaction | 34/34         | 1218 | 0/01C43338 | 0/01C432F8 | COMMIT 2019-03-04 07:44:25.449151 EST
Standby     | 50/50         | 0    | 0/01C43360 | 0/01C43338 | RUNNING_XACTS nextXid 1219 latestCompletedXid 1218 oldestRunningXid 1219

Step 13: Check the actual tuples in the WAL segment files.

---------+---------------------------------------------------+----------------+

 Offset  | Hex Bytes                                         | ASCII chars    |

---------+---------------------------------------------------+----------------+

00000060 | 3b 00 00 00 c3 04 00 00  28 00 40 02 00 00 00 00  |;.......(.@.....|

00000070 | 00 0a 00 00 ec 28 75 6e  00 20 0a 00 7f 06 00 00  |.....(un. ......|

00000080 | 5d 3d 00 00 00 40 00 00  00 00 00 00 ff 03 01 00  |]=...@..........|

00000090 | 02 08 18 00 09 70 6b 6e  03 00 00 00 00 00 00 00  |.....pkn........|

000000a0 | 22 00 00 00 c3 04 00 00  60 00 40 02 00 00 00 00  |".......`.@.....|

000000b0 | 00 01 00 00 dd 4c 87 04  ff 08 e4 73 44 e7 41 26  |.....L.....sD.A&|

000000c0 | 02 00 00 00 00 00 00 00  32 00 00 00 00 00 00 00  |........2.......|

000000d0 | a0 00 40 02 00 00 00 00  10 08 00 00 9e 01 36 88  |..@...........6.|

000000e0 | ff 18 00 00 00 00 00 00  00 00 00 03 00 00 c4 04  |................|

000000f0 | 00 00 c4 04 00 00 c3 04  00 00 00 00 00 00 00 00  |................|



00000100 | 3b 00 00 00 c4 04 00 00  c8 00 40 02 00 00 00 00  |;.........@.....|

00000110 | 00 0a 00 00 33 df b4 71  00 20 0a 00 7f 06 00 00  |....3..q. ......|

00000120 | 5d 3d 00 00 00 40 00 00  00 00 00 00 ff 03 01 00  |]=...@..........|

00000130 | 02 08 18 00 09 75 6a 79  04 00 00 00 00 00 00 00  |.....ujy........|

00000140 | 22 00 00 00 c4 04 00 00  00 01 40 02 00 00 00 00  |".........@.....|

00000150 | 00 01 00 00 96 2e 96 a6  ff 08 d8 f3 79 ed 41 26  |............y.A&|

00000160 | 02 00 00 00 00 00 00 00  32 00 00 00 00 00 00 00  |........2.......|

00000170 | 40 01 40 02 00 00 00 00  10 08 00 00 eb 6b 95 36  |@.@..........k.6|

00000180 | ff 18 00 00 00 00 00 00  00 00 00 03 00 00 c5 04  |................|

00000190 | 00 00 c5 04 00 00 c4 04  00 00 00 00 00 00 00 00  |................|

000001a0 | 00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|

 

14. What Are The Replication Options Based on WAL in PostgreSQL?

  • Continuous WAL Archiving
  • Log Shipping Based Replication — File Level
  • Log Shipping Based Replication — Block Level

14.1 Continuous WAL Archiving

Copying WAL files as they are generated to a location other than the pg_wal subdirectory, for the purpose of archiving them, is called WAL archiving. Each time a WAL file is generated, PostgreSQL invokes a user-provided script (configured via archive_command) to archive it. The script can use the scp command to copy the file to one or more locations, such as an NFS mount. Once archived, the WAL segment files can be used to recover the database to any specified point in time.
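A minimal, illustrative archiving setup in postgresql.conf might look like the following; the archive directory is an assumption, and the command mirrors the example given in the PostgreSQL documentation:

archive_mode = on

archive_command = 'test ! -f /mnt/server/archivedir/%f && cp %p /mnt/server/archivedir/%f'

archive_timeout = 300    # optionally force a segment switch at least every 5 minutes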

14.2 Log Shipping Based Replication — File Level

Copying WAL files to another PostgreSQL server, which creates a standby server by replaying those WAL files, is called Log Shipping. The standby server is configured to be in recovery mode, and its sole purpose is to apply any new WAL files as they arrive. This second server (the standby) then becomes a warm backup of the primary PostgreSQL server. The standby can also be configured to serve read-only queries, in which case it is called a hot standby.

14.3 Log Shipping Based Replication — Block Level

Streaming replication improves the log shipping process. Instead of waiting for the WAL switch, the records are sent as they are generated, thus reducing replication delay. Another improvement is that the standby server will connect to the primary server over the network using a replication protocol. The primary server can then send WAL records directly over this connection without having to rely on scripts provided by the end user.

14.4 How long should the primary server retain WAL segment files?

If there are no streaming replication clients, the server can discard or recycle WAL segment files once the archive script reports success, provided they are no longer required for crash recovery.

The presence of standby clients, though, creates a problem: the server needs to keep WAL files around for as long as the slowest standby needs them. If, for example, the standby was taken down for a while, then comes back online and asks the primary for a WAL file that the primary no longer has, the replication fails with an error similar to:

ERROR: requested WAL segment 00000001000000010000002D has already been removed

The primary server should therefore keep track of how far behind the standby is, and not delete/recycle WAL files that any standbys still need. This feature is provided through replication slots.

Each replication slot has a name which is used to identify the slot. Each slot is associated with:

  1. The oldest WAL segment file required by the consumer of the slot. WAL segment files more recent than this are not deleted/recycled during checkpoints.
  2. The oldest transaction ID required to be retained by the consumer of the slot. Rows needed by any transactions more recent than this are not deleted by vacuum.
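Both pieces of information can be inspected on the primary; for example (slot names and positions will vary):

SELECT slot_name, slot_type, active, restart_lsn, xmin FROM pg_replication_slots;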

 

15. Log Shipping Based Replication

15.1 Physical Streaming Replication

Physical Streaming Replication in PostgreSQL

Physical Streaming Replication in PostgreSQL database is the data replication based on WAL. In streaming replication the standby server connects to the primary server and receives WAL records using a replication protocol. 

15.2 What are the advantages of physical streaming replication? 

  • The standby server does not need to wait for the WAL file to fill up, which reduces replication lag.
  • The dependency on the user-provided script and an intermediate shared storage between the servers is removed.

 

16. What Are WAL Sender and WAL Receiver in PostgreSQL?

A process called WAL receiver, running on the standby server, uses the connection details provided in the primary_conninfo parameter of recovery.conf and connects to the primary server using a TCP/IP connection. 

The WAL sender is another process, running on the primary server, that is in charge of sending the WAL records to the standby server as they are generated. The WAL receiver writes the received WAL records to the standby's own WAL segment files, as if they had been generated by locally connected client activity. Once the WAL records reach the WAL segment files, the standby server continuously replays them so that the standby stays up to date with the primary.
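Both processes can be observed from psql; a minimal sketch (column names as of PostgreSQL 10, and the output will vary with your setup):

-- On the primary: one row per connected WAL sender

SELECT pid, application_name, state, sent_lsn, replay_lsn FROM pg_stat_replication;

-- On the standby: status of the WAL receiver

SELECT pid, status, received_lsn FROM pg_stat_wal_receiver;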

16.1 WAL Streaming Protocol Details

WAL Streaming Protocol Details in PostgreSQL

 

17. PostgreSQL Replication and Failover Setup

The setup consists of two CentOS 7 machines connected via LAN on which PostgreSQL version 10.7 is installed.

17.1 Configuring PostgreSQL Replication using WAL Streaming

Step 1: Disable and stop the firewall on both machines:

sudo firewall-cmd --state

sudo systemctl stop firewalld

sudo systemctl disable firewalld

sudo systemctl mask --now firewalld

Step 2: On the primary server, allow replication connections and connections from the same network. Modify pg_hba.conf:

local  all           all                       md5

host   all           all   172.16.214.167/24   md5

host   all           all   ::1/128             md5

local  replication   all                       md5

host   replication   all   172.16.214.167/24   md5

host   replication   all   ::1/128             md5

Step 3: On the primary server, edit postgresql.conf to modify the following parameters:

max_wal_senders = 10

wal_level = replica

max_replication_slots = 10

synchronous_commit = on

synchronous_standby_names = '*'

listen_addresses = '*'

Step 4: Start the primary server: 

./postgres -D ../pr_data -p 5432

Step 5: Take base backup to bootstrap the standby server: 

./pg_basebackup \
  --pgdata=/tmp/sb_data/ \
  --format=p \
  --write-recovery-conf \
  --checkpoint=fast \
  --label=mffb \
  --progress \
  --verbose \
  --host=172.16.214.167 \
  --port=5432 \
  --username=postgres

Step 6: Check the base backup label file:

START WAL LOCATION: 0/2000028 (file 000000010000000000000002)

CHECKPOINT LOCATION: 0/2000060

BACKUP METHOD: streamed

BACKUP FROM: master

START TIME: 2019-02-24 05:25:30 EST

LABEL: mffb

Step 7: In the base backup, add the following line in the recovery.conf:

primary_slot_name = 'node_a_slot'

Step 8: Check the /tmp/sb_data/recovery.conf file

standby_mode = 'on'

primary_conninfo = 'user=enterprisedb password=abc123 host=172.16.214.167 port=5432 sslmode=prefer sslcompression=1 krbsrvname=postgres target_session_attrs=any'

primary_slot_name = 'node_a_slot'

Step 9: Connect to the primary server and issue this command:

edb=# SELECT * FROM pg_create_physical_replication_slot('node_a_slot');

      slot_name  | xlog_position 

    -------------+---------------

     node_a_slot | 

    (1 row)



edb=# SELECT slot_name, slot_type, active FROM pg_replication_slots;

      slot_name  | slot_type | active 

    -------------+-----------+--------

     node_a_slot | physical  | f

    (1 row)

Step 10: Transfer the base backup to the standby server: 

scp /tmp/sb_data.tar.gz abbas@172.16.214.166:/tmp

sudo mv /tmp/sb_data /opt/PostgreSQL/10/

sudo chown postgres:postgres /opt/PostgreSQL/10/sb_data/

sudo chown -R postgres:postgres /opt/PostgreSQL/10/sb_data/

sudo chmod 700 /opt/PostgreSQL/10/sb_data/

Step 11: Start the standby server:

./postgres -D ../sb_data/ -p 5432

 

The primary will show this in the log:

LOG:  standby "walreceiver" is now a synchronous standby with priority 1

 

The standby will show:

LOG:  database system was interrupted; last known up at 2018-10-24 15:49:55

LOG:  entering standby mode

LOG:  redo starts at 0/3000028

LOG:  consistent recovery state reached at 0/30000F8

LOG:  started streaming WAL from primary at 0/4000000 on timeline 1

 

Step 12: Connect to the primary server and issue some simple commands:

-bash-4.2$ ./edb-psql -p 5432 edb

Password: 

psql.bin (10.7)

Type "help" for help.



create table abc(a int, b varchar(250));

insert into abc values(1,'One');

insert into abc values(2,'Two');

insert into abc values(3,'Three');

Step 13: Check the data on the replica:

./psql -p 5432 -U postgres postgres

Password for user postgres: 

psql.bin (10.7)

Type "help" for help.

postgres=# select * from abc;

a |   b   

---+-------

1 | One

2 | Two

3 | Three

(3 rows)

 

18. What Are the PostgreSQL Manual Failover Steps?

Step 1: Stop or crash the primary server (to simulate a failure).

Step 2: Promote the standby server by running the following command on the standby server:

./pg_ctl promote -D ../sb_data/

server promoting

Step 3: Connect to the promoted standby server and insert a row:

-bash-4.2$ ./edb-psql -p 5432 edb

Password: 

psql.bin (10.7)

Type "help" for help.



edb=# insert into abc values(4,'Four');

The fact that this insert works means that the standby, which is otherwise a read-only server, has been promoted to be the new primary server.
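Another way to confirm the promotion is to check whether the server is still in recovery; on the promoted server this should now return false:

edb=# SELECT pg_is_in_recovery();

 pg_is_in_recovery 

-------------------

 f

(1 row)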

 

19. How to Automate Failover and Replication in PostgreSQL

Setting up automatic failover is easy using EDB Postgres Failover Manager (EFM). After downloading and installing EFM on each master and standby node, you can create an EFM Cluster, which consists of a Master node, one or more Standby nodes, and an optional Witness node that confirms assertions in case of failure. 

EFM continuously monitors system health and sends email alerts based on system events. When a failure occurs, it automatically switches over to the most up-to-date standby and reconfigures all other standby servers to recognize the new master. It also reconfigures load balancers (such as pgPool) and prevents “split brain” (when two nodes each think they are primary) from occurring.

 

20. repmgr for PostgreSQL

Another open source tool is repmgr (Replication Manager), which also manages replication and failover for PostgreSQL clusters. EDB has provided an in-depth tutorial for installing and running repmgr for PostgreSQL. When correctly configured, repmgr can detect when the master server fails and perform automatic failover.

 
