Transaction streaming v4
With logical replication, transactions are decoded concurrently on the publisher
but aren't sent to subscribers until the transaction is committed. If the
changes exceed logical_decoding_work_mem (PostgreSQL 13 and later), they're
spilled to disk. This means that, particularly with large transactions, there's
some delay before changes reach subscribers, and spilling might entail additional I/O
on the publisher.
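As a minimal sketch of inspecting and adjusting the spill threshold described above, the following uses standard PostgreSQL commands. The value 128MB is only an illustrative assumption, not a recommendation.

```sql
-- Show the current per-transaction decoding memory limit on the publisher.
SHOW logical_decoding_work_mem;

-- Raise the limit cluster-wide. '128MB' is an illustrative value only.
ALTER SYSTEM SET logical_decoding_work_mem = '128MB';

-- Reload the configuration so the new value is picked up.
SELECT pg_reload_conf();
```

A larger value keeps more decoded changes in memory before they spill to disk, at the cost of higher memory use per decoding session.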
Beginning with PostgreSQL 14, transactions can optionally be decoded and sent to subscribers before they're committed on the publisher. The subscribers save the incoming changes to a staging file (or set of files) and apply them when the transaction commits (or discard them if the transaction aborts). This makes it possible to apply transactions on subscribers as soon as the transaction commits.
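For comparison, here is a sketch of how streaming of in-progress transactions is typically requested with PostgreSQL's built-in logical replication (not BDR). The subscription name, publication name, and connection string are hypothetical placeholders.

```sql
-- Run on the subscriber (PostgreSQL 14 or later).
-- example_sub, example_pub, and the connection values are placeholders.
CREATE SUBSCRIPTION example_sub
    CONNECTION 'host=publisher.example.com dbname=appdb user=replicator'
    PUBLICATION example_pub
    WITH (streaming = on);

-- Alternatively, enable streaming on an existing subscription.
ALTER SUBSCRIPTION example_sub SET (streaming = on);
```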
PostgreSQL's built-in transaction streaming has the following limitations:
- While you no longer need to spill changes to disk on the publisher, you must write changes to disk on each subscriber.
- If the transaction aborts, the work (changes received by each subscriber and the associated storage I/O) is wasted.
However, starting with version 3.7, BDR supports parallel apply, enabling multiple writer processes on each subscriber. This capability is leveraged to provide the following enhancements:
- Decoded transactions can be streamed directly to a writer on the subscriber.
- Decoded transactions don't need to be stored on disk on subscribers.
- You don't need to wait for the transaction to commit before starting to apply the transaction on the subscriber.
Streaming directly to a writer has the following caveats:
- You must enable parallel apply.
- Workloads consisting of many small and conflicting transactions can lead to frequent deadlocks between writers.
Direct streaming to a writer is still an experimental feature. Use it with caution. Specifically, it might not work well with conflict resolution, since the commit timestamp of the streamed transaction might not be available. (The transaction might not yet have committed on the origin.)
Configure transaction streaming in two locations:
- At node level, using the GUC bdr.default_streaming_mode
- At group level, using the function bdr.alter_node_group_config()
Permitted values are:
Default value is
Changing this setting requires a restart of the pglogical receiver process for each subscription for the setting to take effect. You can achieve this with a server restart.
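A possible way to set the node-level GUC is sketched below. It assumes the BDR extension is loaded so that the setting is recognized, and it uses auto, one of the values this page mentions. Adapt the value to your needs.

```sql
-- Persist the node-level streaming mode (assumes BDR is loaded on this node).
ALTER SYSTEM SET bdr.default_streaming_mode = 'auto';

-- Restart the server so the receiver processes pick up the change,
-- then confirm the value in a new session.
SHOW bdr.default_streaming_mode;
```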
If bdr.default_streaming_mode is set to any value other than off, the
subscriber requests transaction streaming from the publisher. How this is
provided can also depend on the group configuration setting. See
Node configuration using bdr.default_streaming_mode for details.
You can use the parameter
streaming_mode in the function bdr.alter_node_group_config()
to set the group transaction streaming configuration.
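The call might look like the following sketch. The group name example_group is hypothetical, and passing the group name as the first positional argument is an assumption about the function's signature. Only the streaming_mode parameter is taken from this page.

```sql
-- 'example_group' is a placeholder. Passing the group name as the first
-- positional argument is an assumption about the function's signature.
SELECT bdr.alter_node_group_config(
    'example_group',
    streaming_mode := 'auto'
);
```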
Permitted values are:
The default value is
The value of the current setting is contained in the column
from the view bdr.node_group. The value returned is
a single char type, and the possible values are
Transaction streaming is controlled at the subscriber level by the GUC
bdr.default_streaming_mode. Unless set to
off (which disables transaction
streaming), the subscriber requests transaction streaming.
If the publisher can provide transaction streaming, it
streams transactions whenever the transaction size exceeds the threshold set in
logical_decoding_work_mem. The publisher usually has no control over whether
the transaction is streamed to a file or to a writer. In some
situations (such as COPY), however, it might hint for the subscriber to stream the
transaction to a writer (if possible).
The subscriber can stream transactions received from the publisher to either a writer or a file. The decision is based on several factors:
- If parallel apply is off (num_writers = 1), then it's streamed to a file. (writer 0 is always reserved for non-streamed transactions.)
- If parallel apply is on but all writers are already busy handling streamed transactions, then the new transaction is streamed to a file. See bdr.writers to check BDR writer status.
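To see whether a free writer is likely to be available, you can inspect the bdr.writers view mentioned above. This sketch selects all columns rather than assuming specific column names.

```sql
-- List the BDR writer processes and their current state on this subscriber.
SELECT * FROM bdr.writers;
```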
If streaming to a writer is possible (that is, a free writer is available), then the decision whether to stream the transaction to a writer or a file is based on the combination of group and node settings as per the following table:
If the group configuration is set to
auto, or the group
default and the node configuration is
then the transaction is streamed to a writer only if the
publisher hinted to do this.
Currently the publisher hints for the subscriber to stream to the writer for the following transaction types. These are known to be conflict free and can be safely handled by the writer.
CREATE INDEX CONCURRENTLY
You can monitor the use of transaction streaming using the bdr.stat_subscription view on the subscriber node. The relevant columns are:
- nstream_writer: Number of transactions streamed to a writer.
- nstream_file: Number of transactions streamed to file.
- nstream_commit: Number of committed streamed transactions.
- nstream_abort: Number of aborted streamed transactions.
- nstream_start: Number of streamed transactions that were started.
- nstream_stop: Number of streamed transactions that were fully received.
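A simple monitoring query over these counters could look like the sketch below. It assumes bdr.stat_subscription can be queried as a relation and uses only the column names listed above. The derived file_share column is an illustrative addition.

```sql
-- Streaming counters per subscription on the subscriber node.
SELECT nstream_writer,
       nstream_file,
       nstream_commit,
       nstream_abort,
       nstream_start,
       nstream_stop,
       -- Fraction of streamed transactions that went to a file
       -- (NULL when nothing has been streamed yet).
       nstream_file::numeric
           / NULLIF(nstream_writer + nstream_file, 0) AS file_share
FROM bdr.stat_subscription;
```

A high file_share alongside parallel apply being enabled can suggest that writers are frequently busy when streamed transactions arrive.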