Transaction streaming v4

With logical replication, transactions are decoded concurrently on the publisher but aren't sent to subscribers until the transaction is committed. If the changes exceed logical_decoding_work_mem (PostgreSQL 13 and later), they're spilled to disk. This means that, particularly with large transactions, there's some delay before they reach subscribers and might entail additional I/O on the publisher.

Beginning with PostgreSQL 14, transactions can optionally be decoded and sent to subscribers before they're committed on the publisher. The subscribers save the incoming changes to a staging file (or set of files) and apply them when the transaction commits (or discard them if the transaction aborts). This makes it possible to apply transactions on subscribers as soon as the transaction commits.

BDR enhancements

PostgreSQL's built-in transaction streaming has the following limitations:

  • While you no longer need to spill changes to disk on the publisher, you must write changes to disk on each subscriber.
  • If the transaction aborts, the work (changes received by each subscriber and the associated storage I/O) is wasted.

However, starting with version 3.7, BDR supports parallel apply, enabling multiple writer processes on each subscriber. This capability is leveraged to provide the following enhancements:

  • Decoded transactions can be streamed directly to a writer on the subscriber.
  • Decoded transactions don't need to be stored on disk on subscribers.
  • You don't need to wait for the transaction to commit before starting to apply the transaction on the subscriber.

Caveats

  • You must enable parallel apply.
  • Workloads consisting of many small and conflicting transactions can lead to frequent deadlocks between writers.
Note

Direct streaming to writer is still an experimental feature. Use it with caution. Specifically, it might not work well with conflict resolutions since the commit timestamp of the streaming might not be available. (The transaction might not yet have committed on the origin.)

Configuration

Configure transaction streaming in two locations:

Node configuration using bdr.default_streaming_mode

Permitted values are:

  • off
  • writer
  • file
  • auto

Default value is auto.

Changing this setting requires a restart of the pglogical receiver process for each subscription for the setting to take effect. You can achieve this with a server restart.

If bdr.default_streaming_mode is set any value other than off, the subscriber requests transaction streaming from the publisher. How this is provided can also depend on the group configuration setting. See Node configuration using bdr.default_streaming_mode for details.

Group configuration using bdr.alter_node_group_config()

You can use the parameter streaming_mode in the function bdr.alter_node_group_config() to set the group transaction streaming configuration.

Permitted values are:

  • off
  • writer
  • file
  • auto
  • default

The default value is default.

The value of the current setting is contained in the column node_group_streaming_mode from the view bdr.node_group. The value returned is a single char type, and the possible values are D (default), W (writer), F (file), A (auto), and O (off).

Configuration setting effects

Transaction streaming is controlled at the subscriber level by the GUC bdr.default_streaming_mode. Unless set to off (which disables transaction streaming), the subscriber requests transaction streaming.

If the publisher can provide transaction streaming, it streams transactions whenever the transaction size exceeds the threshold set in logical_decoding_work_mem. The publisher usually has no control over whether the transactions is streamed to a file or to a writer. Except for some situations (such as COPY), it might hint for the subscriber to stream the transaction to a writer (if possible).

The subscriber can stream transactions received from the publisher to either a writer or a file. The decision is based on several factors:

  • If parallel apply is off (num_writers = 1), then it's streamed to a file. (writer 0 is always reserved for non-streamed transactions.)
  • If parallel apply is on but all writers are already busy handling streamed transactions, then the new transaction is streamed to a file. See bdr.writers to check BDR writer status.

If streaming to a writer is possible (that is, a free writer is available), then the decision whether to stream the transaction to a writer or a file is based on the combination of group and node settings as per the following table:

GroupNodeStreamed to
off(any)(none)
(any)off(none)
writerfilefile
filewriterfile
defaultwriterwriter
defaultfilefile
defaultautowriter
auto(any)writer

If the group configuration is set to auto, or the group configuration is default and the node configuration is auto, then the transaction is streamed to a writer only if the publisher hinted to do this.

Currently the publisher hints for the subscriber to stream to the writer for the following transaction types. These are known to be conflict free and can be safely handled by the writer.

  • COPY
  • CREATE INDEX CONCURRENTLY

Monitoring

You can monitor the use of transaction streaming using the bdr.stat_subscription function on the subscriber node.

  • nstream_writer Number of transactions streamed to a writer.
  • nstream_file Number of transactions streamed to file.
  • nstream_commit Number of committed streamed transactions.
  • nstream_abort Number of aborted streamed transactions.
  • nstream_start Number of streamed transactions that were started.
  • nstream_stop Number of streamed transactions that were fully received.