Kuntal Ghosh Senior Software Engineer Dec 4, 2019
SUMMARY: This post discusses logical replication in PostgreSQL. It reviews the differences between physical or binary replication and logical or transactional replication. It then describes these components of logical replication:
2. Basic syntax
3. An example
Postgres has introduced physical (binary) replication starting in PostgreSQL 9.0. In physical replication, each change in the master is streamed through the WAL (write-ahead logging) and applied in the standby/destination server. However, certain things are not possible while using the physical replication:
- It cannot perform selective replication or part of the database cannot be replicated.
- It cannot replicate between two different major versions.
- It cannot perform any writes in the standby server.
- It cannot replicate between different platforms (for example, Linux and Windows)
PostgreSQL 10 implements logical replication, which takes care of the above limitations of physical replication and opens up the possibility of promising new areas of replication. In logical replication, also known as transactional replication, the subscriber initially receives a copy of the replicated database object from the publisher and pulls any subsequent changes on the same object as they occur in real-time.
Logical replication follows a publish and subscribe model. In a publisher node, a publication is created, which is a set of changes from a table or a group of tables. In a subscriber node, a subscription is created, which can subscribe to one or more publications.
The replication starts by copying a snapshot of the publishing database to the subscriber. This is also known as the table synchronization phase. Multiple table synchronization workers can be spawned to reduce the time spent in this phase. However, there can be only one synchronization worker per table. Once the copy is done, the subsequent changes in the publisher node are sent to the subscriber node as they occur in real-time. The changes are applied in the subscriber in commit order, which helps maintain the transactional consistency.
For a minimal logical replication setup, the only configuration parameter that needs to be changed is wal_level = logical. This instructs the server to store additional information in the WAL to convert binary changes to logical ones. Once this is done, we can create a publication using the following command:
CREATE PUBLICATION my_pub FOR ALL TABLES;
Instead of publishing all the tables, we can also specify a set of tables to publish. Not only that, a publication can choose to limit the changes they publish to any combination of INSERT, UPDATE, DELETE, and TRUNCATE operations. But, in order to be able to replicate UPDATE and DELETE operations, a “published” table must have a REPLICA IDENTITY so that the modified rows can be identified in the subscriber. INSERT operations can proceed regardless of any replica identity.
See the CREATE PUBLICATION documentation for more details.
Once the publications are created, we can create subscriptions in the subscriber node:
CREATE SUBSCRIPTION my_sub CONNECTION '... <connection string> ...' PUBLICATION my_pub;
This adds a new subscription my_sub for the current database that will start receiving the logical changes from the publication my_pub. When the command is run, a logical replication worker is spawned and it receives the logical changes from the publisher. On the publisher side, a walsender process is spawned, which is responsible for reading the WAL one-by-one, decode the changes, and send those changes to the respective subscriber.
See the CREATE SUBSCRIPTION documentation for more details.
Some important points should be noted before using logical replication in versions prior to PostgreSQL 12:
1. Each subscriber can subscribe to multiple publications, and each publication can publish changes to multiple subscribers.
2. To add or remove tables from an existing publication, the ALTER PUBLICATION command can be used.
3. The database schema and DDL definitions cannot be replicated to the subscriber yet. The published tables must exist on the subscriber.
4. The replicated table has to be a regular table — not views, materialized views, partition root tables, or foreign tables.
5. The table should have the same full qualified name in publisher and subscriber.
6. The column names must match, but the order of the columns in the subscriber table doesn't matter. Additionally, there can be the same or more number of columns in a subscribed table.
7. Replication of sequence data and large objects are not yet supported.
Let's set up logical replication for a table. First, we're going to create a simple table and insert a row:
postgres=# create table t1(a int primary key, b int); CREATE TABLE postgres=# insert into t1 values(1, 1); INSERT 0 1
Next, we're going to set up a publication for the same table:
postgres=# create publication my_pub for table t1; CREATE PUBLICATION
Once the publication is created, it's time to create a subscription. Please note that we've created the publisher and subscriber in the same host, but in different ports, 54321 and 54322 respectively.
postgres=# CREATE TABLE t1(a int primary key, b int); CREATE TABLE postgres=# CREATE SUBSCRIPTION my_sub CONNECTION 'host=localhost port=54321 dbname=postgres' PUBLICATION my_pub; NOTICE: created replication slot "my_sub" on publisher CREATE SUBSCRIPTION
Now, verify the existing row has been replicated:
postgres=# SELECT * FROM t1; a | b ---+--- 1 | 1 (1 row)
Let's verify whether the ongoing changes are getting replicated. To do this, we insert another row in the publisher and we check the same in the subscriber:
postgres=# insert into t1 values(2, 1); INSERT 0 1 postgres=# SELECT * FROM t1; a | b ---+--- 1 | 1 2 | 1 (2 rows)
Hope that helps! We're going to discuss more aspects of logical replication in future posts.