Parallel Apply v5

What is Parallel Apply?

Parallel Apply is a feature of PGD that allows a PGD node to use multiple writers per subscription. This behavior generally increases the throughput of a subscription and improves replication performance.

Configuring Parallel Apply

Two variables control Parallel Apply in PGD 5: bdr.max_writers_per_subscription (defaults to 8) and bdr.writers_per_subscription (defaults to 2).

bdr.max_writers_per_subscription = 8
bdr.writers_per_subscription = 2

This configuration gives each subscription two writers. However, in some circumstances, the system might allocate up to eight writers for a subscription.

You can only change bdr.max_writers_per_subscription with a server restart.
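To see the values currently in effect, a query against the standard PostgreSQL pg_settings view is one option. This is a minimal sketch, assuming the BDR extension is loaded so that both parameters are registered. The context column reports when a parameter can be changed, and a value of postmaster means a server restart is required.

```sql
-- Inspect the current Parallel Apply settings and when they can be changed.
-- Assumes the BDR extension is loaded; otherwise no rows are returned.
SELECT name, setting, context
FROM pg_settings
WHERE name IN ('bdr.max_writers_per_subscription',
               'bdr.writers_per_subscription');
```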

You can change bdr.writers_per_subscription for a specific subscription without a restart by:

Halting the subscription using bdr.alter_subscription_disable.
Setting the new value.
Resuming the subscription using bdr.alter_subscription_enable.

First, find the name of the subscription by running SELECT * FROM bdr.subscription. For this example, the subscription name is bdr_bdrdb_bdrgroup_node2_node1.

SELECT bdr.alter_subscription_disable ('bdr_bdrdb_bdrgroup_node2_node1');

UPDATE bdr.subscription
SET num_writers = 4
WHERE sub_name = 'bdr_bdrdb_bdrgroup_node2_node1';

SELECT bdr.alter_subscription_enable ('bdr_bdrdb_bdrgroup_node2_node1');
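To confirm that the new value is recorded, you can read the row back. This sketch uses only the catalog and column names that appear in the UPDATE statement in this section. It shows the stored setting, not how many writers are active at that moment.

```sql
-- Check the writer count recorded for the example subscription.
SELECT sub_name, num_writers
FROM bdr.subscription
WHERE sub_name = 'bdr_bdrdb_bdrgroup_node2_node1';
```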

When to use Parallel Apply

Parallel Apply is enabled by default and, for most operations, we recommend leaving it enabled.

Monitoring Parallel Apply

To support Parallel Apply's deadlock mitigation, PGD 5.2 adds columns to bdr.stat_subscription. The new columns are nprovisional_waits, ntuple_waits, and ncommit_waits. These are metrics that indicate how well Parallel Apply is managing what previously would have been deadlocks. They don't reflect overall system performance.

The nprovisional_waits value reflects the number of operations on the same tuples being performed by concurrent apply transactions. These waits are provisional: the operation isn't waiting yet but might have to.

If a tuple's write needs to wait until it can be safely applied, it's counted in ntuple_waits. Fully applied transactions that waited before being committed are counted in ncommit_waits.
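The wait counters described here can be read with a query along these lines. This is a minimal sketch for PGD 5.2 or later. It assumes that bdr.stat_subscription identifies each subscription with a sub_name column, as bdr.subscription does. If your version differs, use SELECT * to check the column names.

```sql
-- Read the deadlock-mitigation counters for each subscription (PGD 5.2+).
-- sub_name is assumed to be the identifying column of bdr.stat_subscription.
SELECT sub_name,
       nprovisional_waits,
       ntuple_waits,
       ncommit_waits
FROM bdr.stat_subscription;
```

Sampling these values at intervals and comparing them shows whether waits are increasing under the current workload.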

Disabling Parallel Apply

To disable Parallel Apply, set bdr.writers_per_subscription to 1.
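One way to apply this setting from a SQL session is with the standard PostgreSQL ALTER SYSTEM command, sketched here. Treat it as an assumption rather than the documented procedure: this page doesn't say whether a configuration reload is enough, or whether existing subscriptions also need the per-subscription procedure shown under Configuring Parallel Apply.

```sql
-- Persist the setting in postgresql.auto.conf (requires superuser).
ALTER SYSTEM SET bdr.writers_per_subscription = 1;

-- Ask the server to reread its configuration files.
-- Assumption: a reload may not affect subscriptions that already exist.
SELECT pg_reload_conf();
```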

Deadlock mitigation

When Parallel Apply is operating, the transactional changes from the subscription are written by multiple writers. However, each writer ensures that the final commit of its transaction doesn't violate the commit order as executed on the origin node. If there's a violation, an error is generated and the transaction can be rolled back.

This mechanism previously meant that when the following are all true, the resulting error could manifest as a deadlock:

A transaction is pending commit and modifies a row that another transaction needs to change.
That other transaction executed on the origin node before the pending transaction did.
The pending transaction took out a lock request.

Additionally, handling the error could increase replication lag, due to a combination of the time taken:

To detect the deadlock
For the client to roll back its transaction
For indirect garbage collection of the changes that were already applied
To redo the work

This is where Parallel Apply’s deadlock mitigation, introduced in PGD 5.2, can help. For any transaction, Parallel Apply looks at transactions already scheduled for any row (tuple) that the current transaction wants to write. If it finds one, it marks the row so that the current transaction waits until the other transaction commits before applying its own change to that row. This approach ensures that rows are written in the correct order.

Parallel Apply Support

Up to and including PGD 5.1, don't use Parallel Apply with Group Commit, CAMO, or Eager Replication. Disable Parallel Apply in these scenarios. If you're using PGD 5.1 or earlier and experiencing a large number of deadlocks, you might also want to disable Parallel Apply or consider upgrading.

From PGD 5.2, Parallel Apply works with CAMO. It isn't compatible with Group Commit or Eager Replication, so disable it if either is in use.