BDR Support v1

LiveCompare can be used against BDR nodes, as well as non-BDR nodes.

Setting logical_replication_mode = bdr will make the tool assume that all databases being compared belong to the same BDR cluster. Then you can specify node names as connections, and replication sets to filter tables.

For example, consider you are able to connect to any node in the BDR cluster. Let's call this Initial Connection. By initially connection to this node, LiveCompare is able to check BDR metadata and retrieve connection information from all other nodes.

Now consider you want to compare 3 BDR nodes. As LiveCompare is able to connect to any node starting from the Initial Connection, you do not need to define dsn or any connection information for the data connections. You just need to define node_name. LiveCompare searches in BDR metadata about the connection information for that node, and then connects to the node.

Please note that, for LiveCompare to be able to connect to all other nodes by fetching BDR metadata, it is required that LiveCompare is able to connect to them using the same DSN from BDR view bdr.node_summary, field interface_connstr. In this case it is recommended to run LiveCompare on the same machine as the Initial Connection, as postgres user. If that's not possible, then please define the dsn attribute in all data connections.

You can also specify replication sets as table filters. LiveCompare will use BDR metadata to build the table list, considering only tables that belong to the replication set(s) you defined in the replication_sets setting.

For example, you can create an .ini file to compare 3 BDR nodes:

[General Settings]
logical_replication_mode = bdr
max_parallel_workers = 4
parallel_chunk_rows = 1000000

[Initial Connection]
dsn = port=5432 dbname=live user=postgres

[Node1 Connection]
node_name = node1

[Node2 Connection]
node_name = node2

[Node3 Connection]
node_name = node3

[Output Connection]
dsn = port=5432 dbname=liveoutput user=postgres

[Table Filter]
replication_sets = set_name = 'bdrgroup'

It is also possible to tell LiveCompare to compare all active nodes in the BDR cluster. For that purpose just do the following:

  • In General Settings, enable all_bdr_nodes = on;
  • Specify an Initial Connection;
  • Additional data connections are not required.

For example:

[General Settings]
logical_replication_mode = bdr
max_parallel_workers = 4
parallel_chunk_rows = 1000000
all_bdr_nodes = on

[Initial Connection]
dsn = port=5432 dbname=live user=postgres

[Output Connection]
dsn = port=5432 dbname=liveoutput user=postgres

[Table Filter]
replication_sets = set_name = 'bdrgroup'

When all_bdr_nodes = on, LiveCompare uses the Initial Connection to fetch the list of all BDR nodes. Additional data connections are not required; although if set, will be appended to the list of data connections. For example, it would be possible to compare a whole BDR cluster against a single Postgres connection, useful in migration projects:

[General Settings]
logical_replication_mode = bdr
max_parallel_workers = 4
parallel_chunk_rows = 1000000
all_bdr_nodes = on

[Initial Connection]
dsn = port=5432 dbname=live user=postgres

[Old Connection]
dsn = host=oldpg port=5432 dbname=live user=postgres

[Output Connection]
dsn = port=5432 dbname=liveoutput user=postgres

[Table Filter]
replication_sets = set_name = 'bdrgroup'

Settings node_name and replication_sets are supported for the following technologies:

  • BDR 1, 2 and 3;
  • pglogical 2 and 3.

Please note that to enable pglogical metadata fetch instead of BDR, just set logical_replication_mode = pglogical instead of logical_replication_mode = bdr.

BDR Witness nodes

Using replication sets in BDR, it's possible to configure specific tables to be included in the BDR replication, and also specify which nodes should receive data from such tables, by configuring the node to subscribe to the replication set the table belongs to. This allows for different architectures such as BDR Sharding and the use of BDR Witness nodes.

A BDR Witness is a regular BDR node which doesn't replicate any DML from other nodes. The purpose of the Witness is to provide quorum in Raft Consensus voting (for more details on the BDR Witness node, check BDR documentation). Depending on how replication sets were configured, the Witness may or may not replicate DDL. Which means that there are 2 types of BDR Witnesses:

  • A completely empty node, without any data nor tables; or
  • A node that replicates DDL from other nodes, hence having empty tables.

In the first case, even if the BDR Witness is included in the comparison (either manually under [Connections] or using all_bdr_nodes = on), as the Witness doesn't have any tables, the following message will be logged:

Table public.tbl does not exist on connection node1

In the second case, on the other hand, the table exists on the BDR Witness. However, it would not be correct to report data missing on the Witness as divergences. So, for each table, LiveCompare checks the following information on each node included in the comparison:

  • The replication sets that the node subscribes;
  • The replication sets that the table is associated with;
  • The replication sets, if any, the user defined in filter replication_sets under Table Filter.

If the intersection among all 3 lists of replication sets is empty, which is the case for the BDR Witness, then LiveCompare will log this:

Table public.tbl is not subscribed on connection node1

In both cases, the comparison for that specific table proceeds on the nodes where the table exists, and the table is replicated according to the replication sets configuration.

Differences in a BDR cluster

LiveCompare will make changes to the local node only; it is important that corrective changes do not get replicated to other nodes.

When logical_replication_mode = bdr, LiveCompare will initially check if a replication origin called bdr_local_only_origin already exists (the name of the replication origin can be configured by adjusting the setting difference_fix_replication_origin). If a replication origin called bdr_local_only_origin does not exist yet, then LiveCompare creates it on all BDR connections.

IMPORTANT: Please note that BDR 3.6.18 introduced the new pre-created bdr_local_only_origin replication origin to be used for applying local-only transactions. So if LiveCompare is connected to BDR 3.6.18, it won't create this replication origin.

LiveCompare will generate apply scripts considering the following:

  • Set the current transaction to use the replication origin bdr_local_only_origin, so any DML executed will have xmin associated to bdr_local_only_origin;
  • Set the current transaction datetime to be far in the past, so if there are any BDR conflicts with real DML being executed on the database, LiveCompare DML always loses the conflict.

After applying LiveCompare fix script to a BDR node, it will be possible to get exactly which rows were inserted or updated by LiveCompare using the following query (replace mytable with the name of any table):

with lc_origin as (
    select roident
    from pg_replication_origin
    where roname = 'bdr_local_only_origin'
)
select t.*
from mytable t
inner join lc_origin r
on r.roident = bdr.pg_xact_origin(t.xmin);

(Note that deleted rows are no longer visible.)

Please note that LiveCompare requires at least a PostgreSQL user with bdr_superuser privileges in order to properly fetch metadata.

All steps above involving replication origins only applied to output script, if the PostgreSQL user has bdr_superuser or PostgreSQL superuser privileges. Otherwise, LiveCompare will generate fixes without associating any replication origin (transaction replication is still disabled using SET LOCAL bdr.xact_replication = off). However, it is recommended to use a replication origin when applying the DML scripts, because otherwise LiveCompare will have the same precedence as a regular user application regarding conflict resolution. Also, as there will not be any replication origin associated to the fix, the query above to list all rows fixed by LiveCompare can not be used.

Between BDR 3.6.18 and BDR 3.7.0, the following functions are used:

  • bdr.difference_fix_origin_create(): Executed by LiveCompare to create the replication origin specified in difference_fix_replication_origin (by default set to bdr_local_only_origin), if this replication origin does not exist;
  • bdr.difference_fix_session_setup(): Included in the generated DML script so the transaction is associated with the replication origin specified in difference_fix_replication_origin;
  • bdr.difference_fix_xact_set_avoid_conflict(): Included in the generated DML script so the transaction is set far in the past (2010-01-01), so the fix transaction applied by LiveCompare always loses a conflict, if any.

The functions above require a bdr_superuser rather than a PostgreSQL superuser. Starting from BDR 3.7.0, those functions are deprecated. LiveCompare then will, if running as a PostgreSQL superuser, use the following functions instead, to perform the same actions as above:

  • pg_replication_origin_create(origin_name);
  • pg_replication_origin_session_setup();
  • pg_replication_origin_xact_setup().

If a PostgreSQL superuser is not being used, then LiveCompare will include only the following in the generated DML transaction:

SET LOCAL bdr.xact_replication = off;

Conflicts in BDR

LiveCompare has an execution mode called conflicts. This execution mode is specific for BDR clusters. It will only work in BDR 3.6 or BDR 3.7 clusters.

While compare mode is used to compare all content of tables as a whole, conflicts mode will focus just in tuples/tables that are related to existing conflicts that are registered in bdr.apply_log, in case of BDR 3.6, or in bdr.conflict_history, in case of BDR 3.7.

Having said that, conflicts execution mode is expected to run much faster than compare mode, because it will just inspect specific tuples from specific tables. At the same time, it's not as complete as compare mode, because of the same reason.

The main objective of this execution mode is to check that the automatic conflict resolution which is being done by BDR is consistent among nodes, i.e., after BDR resolving conflicts the cluster is in a consistent state.

Although, for the general use case, automatic conflict resolution ensures cluster consistency, there are a few known cases where automatic conflict resolution can result in divergent tuples among nodes. So the conflicts execution mode from LiveCompare can help checking and ensuring consistency, with a good balance between time vs result.

Conflict example

Imagine on node3 we execute the following query:

SELECT c.reloid::regclass,
       s.origin_name,
       c.local_time,
       c.key_tuple,
       c.local_tuple,
       c.remote_tuple,
       c.apply_tuple,
       c.conflict_type,
       c.conflict_resolution
FROM bdr.conflict_history c
INNER JOIN bdr.subscription_summary s
ON s.sub_id = c.sub_id;

We can see the following conflict in bdr.conflict_history:

reloid              | tbl
origin_name         | node2
local_time          | 2021-05-13 19:17:43.239744+00
key_tuple           | {"a":null,"b":3,"c":null}
local_tuple         |
remote_tuple        |
apply_tuple         |
conflict_type       | delete_missing
conflict_resolution | skip

Which means that when the DELETE arrived from node2 to node3, there was no row with b = 3 in table tbl. However, the INSERT might have arrived from node1 to node3 later, which then added the row with b = 3 to node3. So this is the current situation on node3:

bdrdb=# SELECT * FROM tbl WHERE b = 3;
 a | b |  c
---+---+-----
 x | 3 | foo
(1 row)

While on nodes node1 and node2, this is what we see:

bdrdb=# SELECT * FROM tbl WHERE b = 3;
 a | b | c
---+---+---
(0 rows)

The BDR cluster is divergent.

Now in order to detect and fix such divergence, we could execute LiveCompare in compare mode, but depending on the size of the comparison set (imagine table tbl is very large), that can take a long time, even hours.

This is exactly the situation where conflicts mode can be helpful. In this case, the delete_missing conflict is visible only from node3, but LiveCompare is able to extract the PK values from the conflict logged rows (key_tuple, local_tuple, remote_tuple and apply_tuple) and perform an automatic cluster-wide comparison only on the affected table, already filtering by the PK values. The comparison will then check the current row version in all nodes in the cluster.

So we create a check.ini file to set all_bdr_nodes = on, i.e., to tell LiveCompare to compare all nodes in the cluster:

[General Settings]
logical_replication_mode = bdr
max_parallel_workers = 2
all_bdr_nodes = on

[Initial Connection]
dsn = dbname=bdrdb

[Output Connection]
dsn = dbname=liveoutput

To run LiveCompare in conflicts mode:

2ndq-livecompare check.ini --conflicts

After the execution, in the console output, you will see something like this:

Elapsed time: 0:00:02.443557
Processed 1 conflicts about 1 tables from 3 connections using 2 workers.
Found 1 divergent conflicts in 1 tables.
Processed 1 rows in 1 tables from 3 connections using 2 workers.
Found 1 inconsistent rows in 1 tables.

Inside folder ./lc_session_X/ (being X the number of the current comparison session), LiveCompare will write the file conflicts_DAY.out (replacing DAY in the name of the file with the current day), showing the main information about all divergent conflicts.

If you connect to database liveoutput, you will be able to see more details about the conflicts, for example using this query:

SELECT *
FROM livecompare.vw_conflicts
WHERE session_id = 1
  AND conflict_id = 1
ORDER BY table_name,
         local_time,
         target_node;

You will see something like this:

session_id             | 1
table_name             | public.tbl
conflict_id            | 1
connection_id          | node3
origin_node            | node2
target_node            | node3
local_time             | 2021-05-13 19:17:43.239744+00
key_tuple              | {"a": null, "b": 3, "c": null}
local_tuple            |
remote_tuple           |
apply_tuple            |
conflict_type          | delete_missing
conflict_resolution    | skip
conflict_pk_value_list | {(3)}
difference_log_id_list | {1}
is_conflict_divergent  | t

The is_conflict_divergent = true means that LiveCompare has compared the conflict and found the nodes to be currently divergent in the tables and rows reported by the conflict. View livecompare.vw_conflicts shows information about all conflicts, including the non-divergent ones.

LiveCompare will also automatically generate DML script ./lc_session_X/apply_on_the_node3_DAY.sql (replacing DAY in the name of the file with the current day):

BEGIN;

SET LOCAL bdr.xact_replication = off;
SELECT pg_replication_origin_session_setup('bdr_local_only_origin');
SELECT pg_replication_origin_xact_setup('0/0', '2010-01-01'::timestamptz);;

SET LOCAL ROLE postgres;
DELETE FROM public.tbl WHERE (b) = (3);

COMMIT;

LiveCompare is suggesting to DELETE the row where b = 3 from node3, because on the other 2 nodes the row does not exist. By default, LiveCompare suggest the DML to fix based on the majority of the nodes.

If you run this DML script against node3:

psql -h node3 -f ./lc_session_X/apply_on_the_node3_DAY.sql

You will get the BDR cluster consistent again.

As the --conflicts mode comparison is much faster than a full --compare, it is highly recommended to schedule a --conflicts comparison session more often, to ensure conflict resolution is providing cluster-wide consistency.

Please note that, in order to be able to see the data in bdr.conflict_history in BDR 3.7 or bdr.apply_log in BDR 3.6, you should run LiveCompare with an user that is bdr_superuser or is a PostgreSQL superuser.

Mixing technologies

Please note that metadata for node_name and replication_sets are fetched in the Initial Connection. So it should be a pglogical- and/or BDR-enabled database.

The list of tables is built in the first data connection. So the replication_sets condition should be valid in the first connection.

It is possible to perform mixed technology comparisons, for example:

  • BDR 1 node versus BDR 3 node;
  • BDR 3 node versus vanilla Postgres instance;
  • Vanilla Postgres instance versus pglogical node.