Correction strategies v7

Before you begin manual resolution correction, first determine the extent of the inconsistencies that occurred in the publication tables across the primary nodes of the replication system.

The Conflict History tab in the Replication Server console and the SQL query described in Finding conflicts can help determine the source of an initial conflict.

However, once this conflict occurs, your replication system might have processed and replicated more transactions during that synchronization operation. Some of these later replications might have succeeded, but others might have failed or produced unexpected results because of the prior conflict. With a replication schedule in effect, more synchronization operations can occur, which can create more conflicts.

Therefore, when you learn that a conflict occurred, we recommend that you stop the publication server. Use the stop option of the Linux scripts or Windows services described in Registering a publication server.

In this way, you can carefully analyze the content of the publication tables in question as well as any pending transactions in the shadow tables. This approach helps you to determine the best course of action without continued updates by the running replication system.

When analyzing your tables. you must determine the following:

  • The publication tables that contain inconsistent rows across primary nodes (that is, missing rows on some primary nodes or rows with different column values for the same primary key on different primary nodes).
  • Pending transactions in the shadow tables not applied to the publication tables across all primary nodes. Pending transactions are denoted by a value of P in the rrep_tx_conflict_status column of the shadow table.
  • Transactions on the publication tables that have occurred and are recorded in the shadow tables following the initial conflict. Also determine whether these transactions were applied completely and correctly to the publication tables across all primary nodes. These transactions might not be marked as pending. Instead, their rrep_tx_conflict_status column might be set to null, meaning that no conflict was detected during replication or the transaction wasn't replicated yet. You can identify these transactions because they have a later rrep_tx_timestamp value than the transactions causing the initial conflict.

The general steps to resolving the problem following this analysis are the following:

  1. Make the needed manual corrections to the rows in the publication tables across all primary nodes to get them into an initial, consistent state so each publication table has the same set of identical rows across primary nodes. This might be to a state before the conflicting transactions occurred, depending on what you determine to be the easiest course of action for fully resolving the conflict.

  2. Apply or reapply transactions (either from your application or from the shadow tables) so that all publication tables across all primary nodes are updated consistently according to the desired, expected result of what was recorded in the shadow tables.

  3. In the shadow tables, update certain indicators for conflicting entries to verify that they were resolved.

  4. In the control schema, update certain indicators for the conflicting entries to show that these conflicts were resolved. This update changes the Resolution Status of these entries to Resolved in the Conflict History tab. These entries no longer appear in the SQL query described in Finding conflicts.

    Perform these updates to the control schema of the controller database. You can determine the currently designated controller database from the content of the Replication Server configuration file (see Replication Server configuration file). The publication server ensures that the control schema changes made on the controller database are replicated to the control schemas of all publication databases. This mechanism maintains metadata consistency across all publication databases.

  5. Resume operating your replication system. Start the publication server and recreate the replication schedule if you were using one.

To acheive the first two steps, use some combination of the following methods. The methods you use depend on the state of your publication tables and the extent of pending transactions that need to be applied from the shadow tables.

  • Manual publication table correction. Use a utility such as PSQL or pgAdmin (Postgres Enterprise Manager Client in Advanced Server) to manually correct the rows in the publication tables across all primary nodes without replicating these changes. Use the database user with session_replication_role set to replica for this purpose.
  • Correction using new transactions. Rerun your application on one primary node to create new transactions that can replicate to all other primary nodes. Use this method after you ensure that all publication tables are in a consistent state across all primary nodes.
  • Correction using shadow table transactions. Force the synchronization of transactions already recorded in the shadow tables. Use this method if many shadow table transactions need to be applied and it's simpler to force the synchronization of these transactions rather than reissuing the transactions from your application.

Example replication environment

In the examples that follow, the following replication environment is used.

  • A three-node multi-master replication system is established. The primary node names are MMRnode_a (the primary definition node and the controller database), MMRnode_b, and MMRnode_c.
  • The publication is named emp_pub and uses dept and emp tables.
  • The conflict used to show the first two conflict resolution methods is a uniqueness conflict occurring on the dept table on primary key column deptno on value 50. This conflict resulted from the INSERT statements shown by the following:

On MMRnode_a, the following statement is run:

INSERT INTO dept VALUES (50, 'FINANCE', 'CHICAGO');

On MMRnode_b, the following statement is run:

INSERT INTO dept VALUES (50, 'MARKETING', 'LOS ANGELES');

A synchronization replication is then performed.

The following is the content of table dept on MMRnode_a:

MMRnode_a=# SELECT * FROM dept;
Output
 deptno |   dname    |   loc
--------+------------+----------
     10 | ACCOUNTING | NEW YORK
     20 | RESEARCH   | DALLAS
     30 | SALES      | CHICAGO
     40 | OPERATIONS | BOSTON
     50 | FINANCE    | CHICAGO
(5 rows)

The following is the content of table dept on MMRnode_b:

MMRnode_b=# SELECT * FROM dept;
Output
 deptno |   dname    |     loc
--------+------------+-------------
     10 | ACCOUNTING | NEW YORK
     20 | RESEARCH   | DALLAS
     30 | SALES      | CHICAGO
     40 | OPERATIONS | BOSTON
     50 | MARKETING  | LOS ANGELES
(5 rows)

The following is the content of table dept on MMRnode_c:

MMRnode_c=# SELECT * FROM dept;
Output
 deptno |   dname    |   loc
--------+------------+----------
     10 | ACCOUNTING | NEW YORK
     20 | RESEARCH   | DALLAS
     30 | SALES      | CHICAGO
     40 | OPERATIONS | BOSTON
(4 rows)

The Conflict History tab shows the following entry:

Conflict History tab with a uniqueness conflict

The following is the output from the SQL query described in Finding conflicts.

-[ RECORD 1 ]-------+----------------------------------------------------------------
conflict_type       | II
table_name          | dept
pk_value            | deptno=50
src_db_host         | 192.168.2.22
src_db_port         | 5444
src_db_name         | MMRnode_a
src_rrep_sync_id    | 2
target_db_host      | 192.168.2.22
target_db_port      | 5444
target_db_name      | MMRnode_b
target_rrep_sync_id | 0
notes               | ERROR: duplicate key value violates unique constraint "dept_pk"
                    |   Detail: Key (deptno)=(50) already exists.
-[ RECORD 2 ]-------+----------------------------------------------------------------
conflict_type       | II
table_name          | dept
pk_value            | deptno=50
src_db_host         | 192.168.2.22
src_db_port         | 5444
src_db_name         | MMRnode_b
src_rrep_sync_id    | 1
target_db_host      | 192.168.2.22
target_db_port      | 5444
target_db_name      | MMRnode_a
target_rrep_sync_id | 0
notes               | ERROR: duplicate key value violates unique constraint "dept_pk"
                    |   Detail: Key (deptno)=(50) already exists.

Manual publication table correction

The first step in all manual conflict resolutions is to ensure all publication tables are consistent across all primary nodes. That is, all corresponding tables have the same rows with the same column values.

After that, you can then reapply transactions that failed to replicate successfully.

Using the example from Correction strategies, the inconsistencies are the following:

  • Primary nodes MMRnode_a and MMRnode_b each contain a row with primary key value 50, but the other column values in the row are different.
  • Primary node MMRnode_c doesn't have a row with primary key value 50.

Assuming that the correct state of the dept table is the one in MMRnode_b, the following options are available to correct the state of all primary nodes:

  • Manually correct the dept table in MMRnode_a and MMRnode_c. That is, update the row in MMRnode_a so it has the correct values, and insert the missing row in MMRnode_c. The dept table on all nodes is now consistent and up to date.
  • Manually delete the row with primary key value 50 from the table on both MMRnode_a and MMRnode_b. This brings the dept table on all primary nodes back to a prior, consistent state. Then, with the multi-master replication system running, perform the insert transaction again using the correct column values on any one of the primary nodes.
  • Manually delete the incorrect row with primary key value 50 from the table on MMRnode_a. Leave the correct row in the table in MMRnode_b. This simulates the state where the correct transaction was run on MMRnode_b, is recorded in the shadow table, but has not yet been replicated, and the incorrect transaction was never run on MMRnode_a. Update the shadow table entry in MMRnode_a to indicate that it is discarded and to ensure it isn't included in any future synchronizations. Update the metadata for the shadow table entry in MMRnode_b to force its inclusion in the next synchronization. Perform a synchronization replication so the accepted shadow table entry in MMRnode_b is replicated to MMRnode_a and MMRnode_c.

After the publication table rows are corrected, update the appropriate control schema table in the publication database currently designated as the controller database to indicate that the conflict has been resolved.

Step 1

Manually correct the rows in the publication tables with session_replication_role set to replica. On MMRnode_a, correct the erroneous row:

MMRnode_a=# SHOW session_replication_role;
Output
session_replication_role
--------------------------
replica
(1 row)
MMRnode_a=# SELECT * FROM dept;
Output
 deptno |   dname    |   loc
--------+------------+----------
     10 | ACCOUNTING | NEW YORK
     20 | RESEARCH   | DALLAS
     30 | SALES      | CHICAGO
     40 | OPERATIONS | BOSTON
     50 | FINANCE    | CHICAGO
(5 rows)
MMRnode_a=# UPDATE dept SET dname = 'MARKETING', loc = 'LOS ANGELES' WHERE deptno = 50;
UPDATE 1
MMRnode_a=# SELECT * FROM dept ORDER BY deptno;
Output
 deptno |   dname    |     loc
--------+------------+-------------
     10 | ACCOUNTING | NEW YORK
     20 | RESEARCH   | DALLAS
     30 | SALES      | CHICAGO
     40 | OPERATIONS | BOSTON
     50 | MARKETING  | LOS ANGELES
(5 rows)

On MMRnode_c, insert the missing row:

MMRnode_c=# SHOW session_replication_role;
Output
 session_replication_role
--------------------------
 replica
(1 row)
MMRnode_c=# INSERT INTO dept VALUES (50, 'MARKETING', 'LOS ANGELES');
INSERT 0 1
MMRnode_c=# SELECT * FROM dept ORDER BY deptno;
Output
 deptno |   dname    |     loc
--------+------------+-------------
     10 | ACCOUNTING | NEW YORK
     20 | RESEARCH   | DALLAS
     30 | SALES      | CHICAGO
     40 | OPERATIONS | BOSTON
     50 | MARKETING  | LOS ANGELES
(5 rows)

The dept table on MMRnode_a and MMRnode_c now match the content of the table on MMRnode_b:

MMRnode_b=# SELECT * FROM dept ORDER BY deptno;
Output
 deptno |   dname    |     loc
--------+------------+-------------
     10 | ACCOUNTING | NEW YORK
     20 | RESEARCH   | DALLAS
     30 | SALES      | CHICAGO
     40 | OPERATIONS | BOSTON
     50 | MARKETING  | LOS ANGELES
(5 rows)

Step 2

Update the shadow table entries for the conflicting transactions in the primary nodes to indicate that the conflict was resolved. In each primary node where a transaction occurred that's involved in the conflict, inspect the shadow table for the publication table in question. Shadow tables are located in each primary node in schema _edb_replicator_pub. Shadow tables follow the naming convention rrst_schema_table where schema is the name of the schema containing the publication table and table is the name of the publication table.

Note the following points regarding shadow tables:

  • A row in a shadow table corresponds to an INSERT, UPDATE, or DELETE statement that's applied to the corresponding publication tables in the other primary nodes. A shadow table row doesn't necessarily correspond to the SQL statement issued by the user application. For example, a SQL statement issued by a user application that includes a WHERE clause using a range such as greater than or less than results in multiple, individual entries in the shadow table for each row in the result set of the application’s SQL statement.
  • The primary key of a shadow table is a program-generated positive integer in column rrep_sync_id. The rrep_sync_id values are unique among all shadow tables in a given primary node. Therefore, the rrep_sync_id values for conflicting transactions might not have the same value across primary nodes. This depends on how many prior transactions were recorded in the shadow tables of each primary node.
  • A shadow table entry for a transaction involved in a conflict that wasn't yet resolved contains a value of P (pending) in column rrep_tx_conflict_status. If a transaction isn't involved in a conflict, this column is set to null. (Most shadow table entries have null in this column.) If a transaction was involved in a conflict that was resolved automatically by the publication server, and this transaction was accepted as correct, this column contains C (complete/accepted). If a transaction was involved in a conflict that was resolved automatically, and this transaction was deemed incorrect, this column contains D (discarded).

To find the shadow table entries involved in a conflict, use the Conflict History tab in the Replication Server console or the SQL query described in Finding conflicts and shown by the following output:

-[ RECORD 1 ]-------+----------------------------------------------------------------
conflict_type       | II
table_name          | dept
pk_value            | deptno=50
src_db_host         | 192.168.2.22
src_db_port         | 5444
src_db_name         | MMRnode_a
src_rrep_sync_id    | 2
target_db_host      | 192.168.2.22
target_db_port      | 5444
target_db_name      | MMRnode_b
target_rrep_sync_id | 0
notes               | ERROR: duplicate key value violates unique constraint "dept_pk"
                    |   Detail: Key (deptno)=(50) already exists.
-[ RECORD 2 ]-------+----------------------------------------------------------------
conflict_type       | II
table_name          | dept
pk_value            | deptno=50
src_db_host         | 192.168.2.22
src_db_port         | 5444
src_db_name         | MMRnode_b
src_rrep_sync_id    | 1
target_db_host      | 192.168.2.22
target_db_port      | 5444
target_db_name      | MMRnode_a
target_rrep_sync_id | 0
notes               | ERROR: duplicate key value violates unique constraint "dept_pk"
                    |   Detail: Key (deptno)=(50) already exists.

You can then query the shadow table in the desired primary node on its rrep_sync_id value.

The following query is performed on the shadow table for the dept table in MMRnode_a on rrep_sync_id value 2 obtained from field src_rrep_sync_id of RECORD 1 in the preceding output.

MMRnode_a=# SELECT * FROM _edb_replicator_pub.rrst_edb_dept WHERE rrep_sync_id = 2;
Output
-[ RECORD 1 ]-----------+--------------------------
rrep_sync_id            | 2
rrep_common_id          |
rrep_operation_type     | I
rrep_tx_timestamp       | 25-AUG-15 11:39:35.590648
deptno                  | 50
dname                   | FINANCE
loc                     | CHICAGO
rrep_old_deptno         |
rrep_old_dname          |
rrep_old_loc            |
rrep_tx_conflict_status | P

A similar query can locate the pending shadow table entry in MMRnode_b by querying on the key value obtained from field src_rep_sync_id: of RECORD 2:

MMRnode_b=# SELECT * FROM _edb_replicator_pub.rrst_edb_dept WHERE rrep_sync_id = 1;
Output
-[ RECORD 1 ]-----------+--------------------------
rrep_sync_id            | 1
rrep_common_id          |
rrep_operation_type     | I
rrep_tx_timestamp       | 25-AUG-15 11:39:57.980469
deptno                  | 50
dname                   | MARKETING
loc                     | LOS ANGELES
rrep_old_deptno         |
rrep_old_dname          |
rrep_old_loc            |
rrep_tx_conflict_status | P
Note

To be certain no pending transactions are overlooked, examine the shadow tables in all primary nodes that might be involved in the conflict, and search for entries where rrep_tx_conflict_status is set to P.

The following shows the rrep_tx_conflict_status column marked P (pending) in the Postgres Enterprise Manager Client.

Shadow table entry with pending conflict

Modify column rrep_tx_conflict_status by changing the value to D (discarded) to show that the pending conflict was resolved. A value of D also ensures that the shadow table entry isn't replicated during any future synchronization replications.

Make this change to the shadow tables in both MMRnode_a and MMRnode_b.

Discarded shadow table entry

Be sure to qualify the row with the correct rrep_sync_id value if you perform the update using a SQL statement such as in the following:

UPDATE _edb_replicator_pub.rrst_edb_dept SET rrep_tx_conflict_status = 'D' WHERE rrep_sync_id = 1;

There's no shadow table entry in MMRnode_c, since an insert transaction wasn't performed in that primary node by the application.

Step 3

In the control schema of the publication database currently designated as the controller database, modify the entries in the xdb_conflicts table to indicate the conflict was resolved. Table xdb_conflicts is located in schema _edb_replicator_pub.

Note

The entries in table xdb_conflicts affect only the data that appears in the Conflict History tab and the SQL query described in Finding conflicts. Changing entries in xdb_conflicts has no effect on future replication operations but provides a way to keep a record of how past conflicts were resolved.

Note the following points regarding the xdb_conflicts table:

  • A row in the xdb_conflicts table appears as an entry in the Conflict History tab.
  • The primary key of the xdb_conflicts table is made up of columns src_db_id, target_db_id, src_rrep_sync_id, and target_rrep_sync_id. Column src_db_id contains a unique identifier for the primary node in which a transaction occurred that results in a conflict when replicated to the primary node identified by target_db_id. src_rrep_sync_id is the shadow table identifier of the transaction on the source primary node involved in the conflict, while target_rrep_sync_id is the shadow table identifier of the transaction on the target primary node involved in the conflict. For uniqueness (insert/insert) conflicts, the target_rrep_sync_id value is always set to 0. For a given uniqueness conflict, there are two entries in the xdb_conflicts table. The src_rrep_sync_id value in each of the two entries corresponds to the shadow table identifiers. One is for the shadow table identifier associated with the source primary node. The other is for the shadow table identifier associated with the target primary node.
  • Table xdb_pub_database in the control schema associates the database identifiers src_db_id and target_db_id with the primary node attributes such as the database name, IP address, and port.
  • Column table_id is the identifier of the publication table on which the conflict occurred. Association of the table_id value with the publication table attributes such as its name, schema, and shadow table is found in each primary node in _edb_replicator_pub.rrep_tables.
  • For uniqueness (insert/insert) conflicts only, column pk_value contains text indicating the primary key value that resulted in the conflict. The text is formatted as column_name=value. If the primary key is made up of two or more columns, each column and value pair is separated by the keyword AND, such as column_1=value_1 AND column_2=value_2. This provides the primary key of the row in the publication table designated by table_id that resulted in the conflict. Only uniqueness (insert/insert) conflicts contain the column_name=value text in the pk_value column. The pk_value column is null for all other conflict types (that is, update/update, delete/update, update/delete, and delete/delete conflicts).
  • Column resolution_status indicates the status of the conflict. Possible values are P (pending) or C (completed, that is the conflict was resolved). This status appears in the Resolution Status column of the Conflict History tab.
  • You can use the column win_db_id to record the database identifier of the primary node that contains the winning (accepted) transaction. This information appears in the Winning DB column of the Conflict History tab.
  • You can use column win_rrep_sync_id to record the shadow table identifier of the winning transaction.

The following shows the Conflict History tab prior to updating the xdb_conflicts table.

Pending uniqueness conflict

You can find the conflict entry for synchronization from MMRnode_a to MMRnode_b in xdb_conflicts with the following query for this example:

MMRnode_a=# SELECT * FROM _edb_replicator_pub.xdb_conflicts
MMRnode_a-#   WHERE src_db_id = 1
MMRnode_a-#     AND target_db_id = 4
MMRnode_a-#     AND src_rrep_sync_id = 2
MMRnode_a-#     AND target_rrep_sync_id = 0;
Output
-[ RECORD 1 ]-------+----------------------------------------------------------------
src_db_id           | 1
target_db_id        | 4
src_rrep_sync_id    | 2
target_rrep_sync_id | 0
table_id            | 31
conflict_time       | 25-AUG-15 10:40:23.685738
resolution_status   | P
resolution_strategy |
resolution_time     |
alert_status        |
conflict_type       | II
win_db_id           | 0
win_rrep_sync_id    | 0
notes               | ERROR: duplicate key value violates unique constraint "dept_pk"
                    |   Detail: Key (deptno)=(50) already exists.
 pk_value            | deptno=50

You can find the conflict entry for synchronization from MMRnode_b to MMRnode_a in xdb_conflicts with the following query for this example:

MMRnode_a=# SELECT * FROM _edb_replicator_pub.xdb_conflicts
MMRnode_a-#   WHERE src_db_id = 4
MMRnode_a-#     AND target_db_id = 1
MMRnode_a-#     AND src_rrep_sync_id = 1
MMRnode_a-#     AND target_rrep_sync_id = 0;
Output
-[ RECORD 1 ]-------+----------------------------------------------------------------
src_db_id           | 4
target_db_id        | 1
src_rrep_sync_id    | 1
target_rrep_sync_id | 0
table_id            | 31
conflict_time       | 25-AUG-15 10:40:23.726889
resolution_status   | P
resolution_strategy |
resolution_time     |
alert_status        |
conflict_type       | II
win_db_id           | 0
win_rrep_sync_id    | 0
notes               | ERROR: duplicate key value violates unique constraint "dept_pk"
                   |   Detail: Key (deptno)=(50) already exists.
pk_value            | deptno=50

For uniqueness (insert/insert) conflicts only, you can use the following query to display both of the preceding entries:

MMRnode_a=# SELECT * FROM _edb_replicator_pub.xdb_conflicts
MMRnode_a-#   WHERE pk_value = 'deptno=50'
MMRnode_a-#     AND conflict_type = 'II'
MMRnode_a-#     AND resolution_status = 'P';
Output
-[ RECORD 1 ]-------+----------------------------------------------------------------
src_db_id           | 1
target_db_id        | 4
src_rrep_sync_id    | 2
target_rrep_sync_id | 0
table_id            | 31
conflict_time       | 25-AUG-15 10:40:23.685738
resolution_status   | P
resolution_strategy |
resolution_time     |
alert_status        |
conflict_type       | II
win_db_id           | 0
win_rrep_sync_id    | 0
notes               | ERROR: duplicate key value violates unique constraint "dept_pk"
                   |   Detail: Key (deptno)=(50) already exists.
pk_value            | deptno=50
-[ RECORD 2 ]-------+----------------------------------------------------------------
src_db_id           | 4
target_db_id        | 1
src_rrep_sync_id    | 1
target_rrep_sync_id | 0
   table_id            | 31
   conflict_time       | 25-AUG-15 10:40:23.726889
   resolution_status   | P
   resolution_strategy |
   resolution_time     |
   alert_status        |
   conflict_type       | II
   win_db_id           | 0
   win_rrep_sync_id    | 0
   notes               | ERROR: duplicate key value violates unique constraint "dept_pk"
                       |   Detail: Key (deptno)=(50) already exists.
   pk_value            | deptno=50

These entries appear in the Postgres Enterprise Manager Client.

Change the value in column resolution_status from P (pending) to C (completed) to indicate this conflict was resolved. The value in winning_db_id changes to 4 to indicate primary node MMRnode_b contains the winning transaction. The value in winning_rrep_sync_id changes to the value of rrep_sync_id for the shadow table entry of the transaction in MMRnode_b, since this is the one deemed correct.

The SQL statement to perform this update for the MMRnode_a to the MMRnode_b synchronization conflict is the following:

UPDATE _edb_replicator_pub.xdb_conflicts SET
  resolution_status = 'C',
  win_db_id = 4,
  win_rrep_sync_id = 1
WHERE src_db_id = 1
  AND target_db_id = 4
  AND src_rrep_sync_id = 2
  AND target_rrep_sync_id = 0;

The SQL statement to perform this update for the MMRnode_b to the MMRnode_a synchronization conflict is the following:

UPDATE _edb_replicator_pub.xdb_conflicts SET
  resolution_status = 'C',
  win_db_id = 4,
  win_rrep_sync_id = 1
WHERE src_db_id = 4
  AND target_db_id = 1
  AND src_rrep_sync_id = 1
  AND target_rrep_sync_id = 0;

For uniqueness (insert/insert) conflicts only, you can use the following SQL statement to update both of the preceding entries at the same time:

UPDATE _edb_replicator_pub.xdb_conflicts SET
  resolution_status = 'C',
  win_db_id = 4,
  win_rrep_sync_id = 1
WHERE pk_value = 'deptno=50'
  AND conflict_type = 'II'
  AND resolution_status = 'P';

The following are the updated xdb_conflicts entries:

Resolved conflict in xdb_conflicts

When viewed in the Conflict History tab, the entries now show Resolved instead of Pending in the Resolution Status column, and the Winning DB column shows the address of primary node MMRnode_b.

Correction using new transactions

You can bring all the publication tables to a consistent state by removing any changes caused by the conflicting transactions. Then issue new, corrected transactions at one primary node, which you allow the multi-master replication system to synchronize to all other primary nodes.

Referring back to the uniqueness conflict on the dept table, instead of correcting the erroneous row and inserting the row into the primary node where it is missing as described in Manual publication table correction, you can delete the conflicting rows from all primary nodes and then insert the correct row in one primary node and let the multi-master replication system synchronize the correct row to all primary nodes.

Step 1

Manually delete the inserted row from the publication tables in all primary nodes with session_replication_role set to replica.

On MMRnode_a, delete the erroneous row:

MMRnode_a=# SHOW session_replication_role;
Output
 session_replication_role
--------------------------
 replica
(1 row)


  MMRnode_a=# SELECT * FROM dept;
   deptno |   dname    |   loc
  --------+------------+----------
       10 | ACCOUNTING | NEW YORK
       20 | RESEARCH   | DALLAS
       30 | SALES      | CHICAGO
       40 | OPERATIONS | BOSTON
       50 | FINANCE    | CHICAGO
  (5 rows)
MMRnode_a=# DELETE FROM dept WHERE deptno = 50;
DELETE 1
MMRnode_a=# SELECT * FROM dept;
Output
 deptno |   dname    |   loc
--------+------------+----------
     10 | ACCOUNTING | NEW YORK
     20 | RESEARCH   | DALLAS
     30 | SALES      | CHICAGO
     40 | OPERATIONS | BOSTON
(4 rows)

On MMRnode_b, delete the row even though the transaction created the correct result:

MMRnode_b=# SHOW session_replication_role;
Output
 session_replication_role
--------------------------
 replica
(1 row)
MMRnode_b=# SELECT * FROM dept;
Output
 deptno |   dname    |     loc
--------+------------+-------------
     10 | ACCOUNTING | NEW YORK
     20 | RESEARCH   | DALLAS
     30 | SALES      | CHICAGO
     40 | OPERATIONS | BOSTON
     50 | MARKETING  | LOS ANGELES
(5 rows)
MMRnode_b=# DELETE FROM dept WHERE deptno = 50;
DELETE 1
MMRnode_b=# SELECT * FROM dept;
Output
deptno |   dname    |   loc
--------+------------+----------
   10 | ACCOUNTING | NEW YORK
   20 | RESEARCH   | DALLAS
   30 | SALES      | CHICAGO
   40 | OPERATIONS | BOSTON
(4 rows)

On MMRnode_c, no changes are required as the conflicting transaction didn't insert a new row into the table on this node:

MMRnode_c=# SET search_path TO edb;
SET
MMRnode_c=# SELECT * FROM dept;
Output
 deptno |   dname    |   loc
--------+------------+----------
     10 | ACCOUNTING | NEW YORK
     20 | RESEARCH   | DALLAS
     30 | SALES      | CHICAGO
     40 | OPERATIONS | BOSTON
(4 rows)

Step 2

Rerun the transaction on one primary node with the multi-master replication system running and with session_replication_role set to the default (origin).

For this example, the correct INSERT statement is executed on MMRnode_a:

MMRnode_a=# SHOW session_replication_role;
Output
 session_replication_role
--------------------------
 origin
(1 row)
MMRnode_a=# INSERT INTO dept VALUES (50, 'MARKETING', 'LOS ANGELES');
INSERT 0 1
MMRnode_a=# SELECT * FROM dept;
Output
 deptno |   dname    |     loc
--------+------------+-------------
     10 | ACCOUNTING | NEW YORK
     20 | RESEARCH   | DALLAS
     30 | SALES      | CHICAGO
     40 | OPERATIONS | BOSTON
     50 | MARKETING  | LOS ANGELES
(5 rows)

Step 3

Perform synchronization replication.

The same rows now appear in the publication table on all primary nodes. On MMRnode_a:

MMRnode_a=# SELECT * FROM dept;
Output
 deptno |   dname    |     loc
--------+------------+-------------
     10 | ACCOUNTING | NEW YORK
     20 | RESEARCH   | DALLAS
     30 | SALES      | CHICAGO
     40 | OPERATIONS | BOSTON
     50 | MARKETING  | LOS ANGELES
(5 rows)

On MMRnode_b:

MMRnode_b=# SELECT * FROM dept;
Output
 deptno |   dname    |     loc
--------+------------+-------------
     10 | ACCOUNTING | NEW YORK
     20 | RESEARCH   | DALLAS
     30 | SALES      | CHICAGO
     40 | OPERATIONS | BOSTON
     50 | MARKETING  | LOS ANGELES
(5 rows)

On MMRnode_c:

MMRnode_c=# SELECT * FROM dept;
Output
 deptno |   dname    |     loc
--------+------------+-------------
     10 | ACCOUNTING | NEW YORK
     20 | RESEARCH   | DALLAS
     30 | SALES      | CHICAGO
     40 | OPERATIONS | BOSTON
     50 | MARKETING  | LOS ANGELES
(5 rows)

Step 4

Update the shadow table entries for the conflicting transactions in the primary nodes to indicate that the conflict was resolved as in Step 2 of Manual publication table correction.

Change the rrep_tx_conflict_status column from P (pending) to D (discarded) on all primary nodes.

Note

The second entry for the accepted transaction you ran in Step 2, where rrep_tx_conflict_status is set to null, indicates there was no conflict.

There's no shadow table entry in MMRnode_c, since an insert transaction wasn't performed in that primary node by the application.

Step 5

In the control schema of the publication database currently designated as the controller database, modify the entries in the xdb_conflicts table to indicate the conflict was resolved as in Step 3 of Manual publication table correction.

Correction using shadow table transactions

You can bring all publication tables to a consistent state is by removing changes caused by the conflicting transactions. You then modify the publication table’s metadata in such a way that the next synchronization results in the replication of transactions already stored in the shadow tables.

Such transactions might not have successfully replicated to all the other primary nodes in a prior synchronization for various reasons.

The following is an example of such a case:

  • Applications on two primary nodes insert rows with the same primary key value. This results in a uniqueness conflict when synchronization replication occurs.
  • Following the insert on one primary node, the application continues to apply updates to the newly inserted row. These updates are successfully applied to the row on this primary node and are recorded in the shadow table on this node.
  • Synchronization replication is performed.
  • Since there is a uniqueness conflict, the rows with the conflicting primary key value aren't replicated into the publication tables on the other primary nodes.
  • However, the conflicting row on the primary node that wasn't directly updated receives those update transactions by the replication, resulting in possibly inconsistent, updated rows on the two primary nodes.

Two options are:

  • Manually insert the missing row into the other primary nodes and manually change the incorrect row.
  • Rerun the application to reapply the correct insert and updates.

However, the following option provides a way to reapply the transactions already recorded in the shadow table of the winning primary node.

The example used to illustrate this method is based upon the following transactions on the emp table.

In MMRnode_b, the following row is inserted:

INSERT INTO emp (empno,ename,job,deptno) VALUES (9001,'SMITH','ANALYST',20);

In MMRnode_c, the following row is inserted with the same primary key value 9001 in the empno column:

INSERT INTO emp (empno,ename,job,deptno) VALUES (9001,'JONES','SALESMAN',30);
In MMRnode_c, this is followed by a series of updates to the newly inserted row:
UPDATE emp SET mgr = 7698 WHERE empno = 9001;
UPDATE emp SET sal = 9500 WHERE empno = 9001;
UPDATE emp SET comm = 5000 WHERE empno = 9001;

Synchronization replication is performed. The resulting content of the emp table is as follows:

On MMRnode_a the conflicting row has not been replicated:

MMRnode_a=# SELECT * FROM emp;
Output
 empno | ename  |    job    | mgr  |      hiredate      |   sal   |  comm   | deptno
-------+--------+-----------+------+--------------------+---------+---------+--------
  7369 | SMITH  | CLERK     | 7902 | 17-DEC-80 00:00:00 |  800.00 |         |     20
  7499 | ALLEN  | SALESMAN  | 7698 | 20-FEB-81 00:00:00 | 1600.00 |  300.00 |     30
  7521 | WARD   | SALESMAN  | 7698 | 22-FEB-81 00:00:00 | 1250.00 |  500.00 |     30
  7566 | JONES  | MANAGER   | 7839 | 02-APR-81 00:00:00 | 2975.00 |         |     20
  7654 | MARTIN | SALESMAN  | 7698 | 28-SEP-81 00:00:00 | 1250.00 | 1400.00 |     30
  7698 | BLAKE  | MANAGER   | 7839 | 01-MAY-81 00:00:00 | 2850.00 |         |     30
  7782 | CLARK  | MANAGER   | 7839 | 09-JUN-81 00:00:00 | 2450.00 |         |     10
  7788 | SCOTT  | ANALYST   | 7566 | 19-APR-87 00:00:00 | 3000.00 |         |     20
  7839 | KING   | PRESIDENT |      | 17-NOV-81 00:00:00 | 5000.00 |         |     10
  7844 | TURNER | SALESMAN  | 7698 | 08-SEP-81 00:00:00 | 1500.00 |    0.00 |     30
  7876 | ADAMS  | CLERK     | 7788 | 23-MAY-87 00:00:00 | 1100.00 |         |     20
  7900 | JAMES  | CLERK     | 7698 | 03-DEC-81 00:00:00 |  950.00 |         |     30
  7902 | FORD   | ANALYST   | 7566 | 03-DEC-81 00:00:00 | 3000.00 |         |     20
  7934 | MILLER | CLERK     | 7782 | 23-JAN-82 00:00:00 | 1300.00 |         |     10
(14 rows)

On MMRnode_b the conflicting row inserted on this node remains, but is updated with the transactions replicated from MMRnode_c:

MMRnode_b=# SELECT * FROM emp;
Output
 empno | ename  |    job    | mgr  |      hiredate      |   sal   |  comm   | deptno
-------+--------+-----------+------+--------------------+---------+---------+--------
  7369 | SMITH  | CLERK     | 7902 | 17-DEC-80 00:00:00 |  800.00 |         |     20
  7499 | ALLEN  | SALESMAN  | 7698 | 20-FEB-81 00:00:00 | 1600.00 |  300.00 |     30
  7521 | WARD   | SALESMAN  | 7698 | 22-FEB-81 00:00:00 | 1250.00 |  500.00 |     30
  7566 | JONES  | MANAGER   | 7839 | 02-APR-81 00:00:00 | 2975.00 |         |     20
  7654 | MARTIN | SALESMAN  | 7698 | 28-SEP-81 00:00:00 | 1250.00 | 1400.00 |     30
  7698 | BLAKE  | MANAGER   | 7839 | 01-MAY-81 00:00:00 | 2850.00 |         |     30
  7782 | CLARK  | MANAGER   | 7839 | 09-JUN-81 00:00:00 | 2450.00 |         |     10
  7788 | SCOTT  | ANALYST   | 7566 | 19-APR-87 00:00:00 | 3000.00 |         |     20
  7839 | KING   | PRESIDENT |      | 17-NOV-81 00:00:00 | 5000.00 |         |     10
  7844 | TURNER | SALESMAN  | 7698 | 08-SEP-81 00:00:00 | 1500.00 |    0.00 |     30
  7876 | ADAMS  | CLERK     | 7788 | 23-MAY-87 00:00:00 | 1100.00 |         |     20
  7900 | JAMES  | CLERK     | 7698 | 03-DEC-81 00:00:00 |  950.00 |         |     30
  7902 | FORD   | ANALYST   | 7566 | 03-DEC-81 00:00:00 | 3000.00 |         |     20
  7934 | MILLER | CLERK     | 7782 | 23-JAN-82 00:00:00 | 1300.00 |         |     10
  9001 | SMITH  | ANALYST   | 7698 |                    | 9500.00 | 5000.00 |     20
(15 rows)

On MMRnode_c the conflicting row inserted on this node remains along with the updates performed on this node:

MMRnode_c=# SELECT * FROM emp;
Output
 empno | ename  |    job    | mgr  |      hiredate      |   sal   |  comm   | deptno
-------+--------+-----------+------+--------------------+---------+---------+--------
  7369 | SMITH  | CLERK     | 7902 | 17-DEC-80 00:00:00 |  800.00 |         |     20
  7499 | ALLEN  | SALESMAN  | 7698 | 20-FEB-81 00:00:00 | 1600.00 |  300.00 |     30
  7521 | WARD   | SALESMAN  | 7698 | 22-FEB-81 00:00:00 | 1250.00 |  500.00 |     30
  7566 | JONES  | MANAGER   | 7839 | 02-APR-81 00:00:00 | 2975.00 |         |     20
  7654 | MARTIN | SALESMAN  | 7698 | 28-SEP-81 00:00:00 | 1250.00 | 1400.00 |     30
  7698 | BLAKE  | MANAGER   | 7839 | 01-MAY-81 00:00:00 | 2850.00 |         |     30
  7782 | CLARK  | MANAGER   | 7839 | 09-JUN-81 00:00:00 | 2450.00 |         |     10
  7788 | SCOTT  | ANALYST   | 7566 | 19-APR-87 00:00:00 | 3000.00 |         |     20
  7839 | KING   | PRESIDENT |      | 17-NOV-81 00:00:00 | 5000.00 |         |     10
  7844 | TURNER | SALESMAN  | 7698 | 08-SEP-81 00:00:00 | 1500.00 |    0.00 |     30
  7876 | ADAMS  | CLERK     | 7788 | 23-MAY-87 00:00:00 | 1100.00 |         |     20
  7900 | JAMES  | CLERK     | 7698 | 03-DEC-81 00:00:00 |  950.00 |         |     30
  7902 | FORD   | ANALYST   | 7566 | 03-DEC-81 00:00:00 | 3000.00 |         |     20
  7934 | MILLER | CLERK     | 7782 | 23-JAN-82 00:00:00 | 1300.00 |         |     10
  9001 | JONES  | SALESMAN  | 7698 |                    | 9500.00 | 5000.00 |     30
(15 rows)

In this example, it is assumed that the desired, correct row is on MMRnode_c.

The following are the steps to reproduce the correct row, currently on MMRnode_c, to the other primary nodes by synchronizing the shadow table entries that resulted from the original insert and updates to this row on MMRnode_c.

Step 1

Manually delete the inserted row from the publication tables on all primary nodes except for MMRnode_c, which has the correct row. Be sure session_replication_role is set to replica.

On MMRnode_a, this row does not exist:

MMRnode_a=# SELECT * FROM emp WHERE empno = 9001;
Output
 empno | ename | job | mgr | hiredate | sal | comm | deptno
-------+-------+-----+-----+----------+-----+------+--------
(0 rows)

On MMRnode_b, delete the erroneous row:

MMRnode_a=# SHOW session_replication_role;
Output
 session_replication_role
--------------------------
 replica
(1 row)
MMRnode_b=# DELETE FROM emp WHERE empno = 9001;
DELETE 1

On MMRnode_c, leave the correct, accepted row intact:

MMRnode_c=# SELECT * FROM emp WHERE empno = 9001;
Output
 empno | ename |   job    | mgr  | hiredate |   sal   |  comm   | deptno
-------+-------+----------+------+----------+---------+---------+--------
  9001 | JONES | SALESMAN | 7698 |          | 9500.00 | 5000.00 |     30
(1 row)

Step 2

On the primary nodes containing the conflicting row to discard, mark the shadow table entry for that row as discarded. This mark indicates the conflict on this row was resolved and ensures this shadow table entry isn't replicated in the future.

Change the rrep_tx_conflict_status column from P (pending) to D (discarded) on the losing node, MMRnode_b.

Step 3

On winning node MMRnode_c, inspect the shadow table for the emp publication table.

The objective is to use the shadow table entries for the insert and three update transactions that were previously run on this node to replicate to the other primary nodes during the next synchronization.

The left-most columns of the shadow table appear as follows:

Shadow table with multiple transactions

Make note of the rrep_sync_id values for these four entries, which are 1, 2, 3, and 4 in this example.

The following shows the right-most columns of the shadow table from the figure. Note the contents of column rrep_tx_conflict_status furthest to the right.

Shadow table with multiple transactions (continued)

Make sure the rrep_tx_conflict_status column is null for these four entries. In this case, for the insert transaction, you need to change the P (pending) value to null.

The resulting change for the rrep_tx_conflict_status column in the shadow table on MMRnode_c is shown by the following:

Shadow table transactions set to replicate

Step 4

To replicate these four shadow table entries during the next synchronization, you must add one or more entries to the control schema table _edb_replicator_pub.rrep_MMR_txset on MMRnode_c to indicate pending status for synchronization to the target primary nodes (MMRnode_a and MMRnode_b) of the four shadow table entries. These shadow table entries are identified by the rrep_sync_id values of 1, 2, 3, and 4 noted in Step 3.

First, you must identify the pub_id and target db_id values to associate with the pending transactions. To do so, invoke the following query, substituting the rrep_sync_id values for sync_id in the query:

SELECT pub_id, db_id AS target_db_id
  FROM _edb_replicator_pub.rrep_MMR_txset
  WHERE start_rrep_sync_id <= sync_id
    AND end_rrep_sync_id >= sync_id;

In this example, there are four values to substitute for sync_id, which are 1, 2, 3, and 4.

The results are the following:

MMRnode_c=# SELECT pub_id, db_id AS target_db_id
MMRnode_c-#   FROM _edb_replicator_pub.rrep_MMR_txset
MMRnode_c-#   WHERE start_rrep_sync_id <= 1 AND end_rrep_sync_id >= 1;
Output
 pub_id | target_db_id
--------+--------------
      3 |            1
      3 |            4
(2 rows)
MMRnode_c=# SELECT pub_id, db_id AS target_db_id
MMRnode_c-#   FROM _edb_replicator_pub.rrep_MMR_txset
MMRnode_c-#   WHERE start_rrep_sync_id <= 2 AND end_rrep_sync_id >= 2;
Output
 pub_id | target_db_id
--------+--------------
      3 |            1
      3 |            4
(2 rows)
MMRnode_c=# SELECT pub_id, db_id AS target_db_id
MMRnode_c-#   FROM _edb_replicator_pub.rrep_MMR_txset
MMRnode_c-#   WHERE start_rrep_sync_id <= 3 AND end_rrep_sync_id >= 3;
Output
 pub_id | target_db_id
--------+--------------
      3 |            1
      3 |            4
(2 rows)
MMRnode_c=# SELECT pub_id, db_id AS target_db_id
MMRnode_c-#   FROM _edb_replicator_pub.rrep_MMR_txset
MMRnode_c-#   WHERE start_rrep_sync_id <= 4 AND end_rrep_sync_id >= 4;
Output
 pub_id | target_db_id
--------+--------------
      3 |            1
      3 |            4
(2 rows)

The results indicate that the previously executed synchronization that attempted to apply the shadow table transactions identified by the rrep_sync_id values of 1, 2, 3, and 4 were all for the publication identified by pub_id of 3. The target primary nodes were identified by db_id of 1 (for MMRnode_a) and db_id of 4 (for MMRnode_b).

Thus, you must insert at least two entries into the control schema table _edb_replicator_pub.rrep_MMR_txset on MMRnode_c. At least one entry is required for the target db_id of 1 and at least one entry for the target db_id of 4.

Each entry in _edb_replicator_pub.rrep_MMR_txset consists of a range of rrep_sync_id values (identified by columns start_rrep_sync_id and end_rrep_sync_id). The desired shadow table rrep_sync_id values happen to be contiguous (1 through 4). Thus a single entry can encompass the four rrep_sync_id values for a single target database.

In this example, you can add a total of two entries to _edb_replicator_pub.rrep_MMR_txset, one for each target database.

Note

When multiple, noncontiguous rrep_sync_id values are required for synchronization (for example, 1, 2, 5, and 6), multiple entries are required for each target database. The entries specify rrep_sync_id ranges to collectively cover all of the noncontiguous values but omitting rrep_sync_id values not included in the synchronization (for example, one entry for 1 through 2 and a second entry for 5 through 6).

Step 5

Insert the entries into the _edb_replicator_pub.rrep_MMR_txset control schema table identified in the preceding step.

The two INSERT statements invoked on MMRnode_c are the following:

INSERT INTO _edb_replicator_pub.rrep_MMR_txset (set_id, pub_id, db_id, status, start_rrep_sync_id, end_rrep_sync_id)
  values (nextval('_edb_replicator_pub.rrep_txset_seq'),3,1,'P',1,4);

INSERT INTO _edb_replicator_pub.rrep_MMR_txset (set_id, pub_id, db_id, status, start_rrep_sync_id, end_rrep_sync_id)
  values (nextval('_edb_replicator_pub.rrep_txset_seq'),3,4,'P',1,4);

A query of the _edb_replicator_pub.rrep_MMR_txset metadata table displays the following:

MMRnode_c=# SELECT set_id, pub_id, db_id AS target_db_id, status,
MMRnode_c-#     start_rrep_sync_id, end_rrep_sync_id
MMRnode_c-#   FROM _edb_replicator_pub.rrep_MMR_txset;
Output
 set_id | pub_id | target_db_id | status | start_rrep_sync_id | end_rrep_sync_id
--------+--------+--------------+--------+--------------------+------------------
      1 |      3 |            1 | C      |                  1 |                4
      1 |      3 |            4 | C      |                  1 |                4
      2 |      3 |            1 | P      |                  1 |                4
      3 |      3 |            4 | P      |                  1 |                4
(4 rows)

There are now two new entries with pending status: one for target db_id 1, the other for target db_id 4. Both entries cover the rrep_sync_id range of 1 through 4.

The two entries with completed status are from the synchronization attempt that first produced the conflict.

Step 6

Perform synchronization replication.

The insert and three update transactions recorded in the rrst_edb_emp shadow table on MMRnode_c are replicated to the other primary nodes.

On MMRnode_a:
MMRnode_a=# SELECT * FROM emp WHERE empno = 9001;
Output
 empno | ename |   job    | mgr  | hiredate |   sal   |  comm   | deptno
-------+-------+----------+------+----------+---------+---------+--------
  9001 | JONES | SALESMAN | 7698 |          | 9500.00 | 5000.00 |     30
(1 row)
On MMRnode_b:
MMRnode_b=# SELECT * FROM emp WHERE empno = 9001;
Output
 empno | ename |   job    | mgr  | hiredate |   sal   |  comm   | deptno
-------+-------+----------+------+----------+---------+---------+--------
  9001 | JONES | SALESMAN | 7698 |          | 9500.00 | 5000.00 |     30
(1 row)

These rows now match the row created by the original transactions on MMRnode_c:

MMRnode_c=# SELECT * FROM emp WHERE empno = 9001;
Output
 empno | ename |   job    | mgr  | hiredate |   sal   |  comm   | deptno
-------+-------+----------+------+----------+---------+---------+--------
  9001 | JONES | SALESMAN | 7698 |          | 9500.00 | 5000.00 |     30
(1 row)

Step 7

In the control schema of the publication database currently designated as the controller database, modify the entries in the xdb_conflicts table to indicate the conflict was resolved as in Step 3 of Manual publication table correction.

For a uniqueness (insert/insert) conflict only, the following query on the xdb_conflicts table in the controller database can display the conflicts:

MMRnode_a=# SELECT * FROM _edb_replicator_pub.xdb_conflicts
MMRnode_a-#   WHERE pk_value = 'empno=9001'
MMRnode_a-#     AND conflict_type = 'II'
MMRnode_a-#     AND resolution_status = 'P';
Output
-[ RECORD 1 ]-------+---------------------------------------------------------------
src_db_id           | 4
target_db_id        | 56
src_rrep_sync_id    | 1
target_rrep_sync_id | 0
table_id            | 32
conflict_time       | 25-AUG-15 15:27:20.928679
resolution_status   | P
resolution_strategy |
resolution_time     |
alert_status        |
conflict_type       | II
win_db_id           | 0
win_rrep_sync_id    | 0
notes               | ERROR: duplicate key value violates unique constraint "emp_pk"
                    |   Detail: Key (empno)=(9001) already exists.
pk_value            | empno=9001
-[ RECORD 2 ]-------+---------------------------------------------------------------
src_db_id           | 56
target_db_id        | 4
src_rrep_sync_id    | 1
target_rrep_sync_id | 0
table_id            | 32
conflict_time       | 25-AUG-15 15:27:20.970959
resolution_status   | P
resolution_strategy |
resolution_time     |
alert_status        |
conflict_type       | II
win_db_id           | 0
win_rrep_sync_id    | 0
notes               | ERROR: duplicate key value violates unique constraint "emp_pk"
                    |   Detail: Key (empno)=(9001) already exists.
pk_value            | empno=9001

The following SQL statement changes the value in column resolution_status from P (pending) to C (completed) to indicate this conflict was resolved. The value in winning_db_id changes to 56 to indicate primary node MMRnode_c contains the winning transaction. The value in winning_rrep_sync_id is changed to the value of rrep_sync_id for the shadow table entry of the INSERT transaction in MMRnode_c, since this is the correct one.

UPDATE _edb_replicator_pub.xdb_conflicts SET
  resolution_status = 'C',
  win_db_id = 56,
  win_rrep_sync_id = 1
WHERE pk_value = 'empno=9001'
  AND conflict_type = 'II'
  AND resolution_status = 'P';

When viewed in the Conflict History tab, the entry now shows Resolved in the Resolution Status column, and the Winning DB column shows the address of primary node MMRnode_c.