BDR is both a patch to PostgreSQL core and an extension on top of PostgreSQL core. How did that come about, and what’s its future?
Development of BDR began around the time PostgreSQL 9.2 was in development, or arguably earlier if you count groundwork like the extension mechanism. The goal of BDR is, and has always been, to add the features core PostgreSQL needs to perform asynchronous, loosely coupled multi-master logical replication.
BDR improvements to core PostgreSQL
Since it’s such a large set of changes, it was necessary to structure development as a series of discrete features. A natural dividing line was “things that require changes to the core PostgreSQL code” vs “things that can be done in an extension”. So the code was structured accordingly, making BDR a set of patches to core plus an extension that uses the added features to implement a multi-master node. The required features were developed in a branch of PostgreSQL and then extracted and submitted one by one to PostgreSQL core. BDR is, and has always been, much too big to simply commit to PostgreSQL in one go.
Getting the BDR extension running on unmodified PostgreSQL
BDR 1.0 is still in two parts – a set of patches to add necessary features to core PostgreSQL and an extension that uses the features to implement multi-master. The original goal was to have BDR running on stock 9.4 without patches, but it just took too long to get all the required patches through the community process and into core. This isn’t entirely a bad thing, since the features were of higher quality by the time they were committed to 9.4, 9.5 and 9.6.
Now, as of PostgreSQL 9.6, all of the patches required to make it possible to implement BDR-style replication have been included in PostgreSQL core. As mentioned in an earlier post, BDR on 9.6 should run as an extension on an unmodified PostgreSQL. The implementation of BDR itself is still an outside extension, so this doesn’t mean “BDR is in 9.6”, but it’s big progress.
Other enhancements related to BDR
Meanwhile, the “pglogical” single-master logical replication tool was extracted from BDR, enhanced, simplified, and submitted to core PostgreSQL as a further building block toward getting full multi-master logical replication into core. The submission was not successful in its original form, but the 2ndQuadrant team is now working on a replication feature for PostgreSQL 10 that is based on that work.
Once logical replication (not just logical decoding) is incorporated into PostgreSQL core, the BDR extension will be adapted to use the in-core logical replication features. Over time, more parts of the functionality of the BDR extension will be submitted to PostgreSQL until all BDR functionality is in community PostgreSQL.
This process has taken years and still has a ways to go. PostgreSQL’s development is careful and prioritizes stability. Submissions often need multiple rounds of revisions before they are accepted for community distribution. Sometimes they’re completely rejected and a different design must be pursued.
In the meantime, the BDR extension – and pglogical – will continue to meet user needs.
Features added to core PostgreSQL for BDR
Along the way, we’ve delivered some great features to PostgreSQL that other projects can use too:
- Background workers let you run tasks, such as schedulers, inside PostgreSQL. They’ve already been used to help implement features like parallel query.
- Event triggers and DDL deparse hooks let you set up user-defined triggers on database change events like table creation. They’re the foundation of DDL replication in BDR and are usable by other products that are interested in tracking, auditing, and replicating changes to database structure.
- Replication slots simplify WAL retention management for physical standbys and provide a foundation for logical decoding of changes.
- Replication origins are the other side of replication slots. They let a downstream server efficiently keep track of how much it has replayed from a replication slot. Unlike most of these features, replication origins are not useful for much except logical replication.
- Logical decoding reads a stream of low-level binary changes from WAL on a server and turns it into a stream of logical row changes in a format suitable for replication across PostgreSQL versions or even to different products and applications. It’s the foundation on which BDR, pglogical, and in-core logical replication are built.
- Logical WAL messages let extensions, user-defined functions, etc. write special messages into WAL that are not associated with any particular table. They may optionally bypass normal transactional rules, like a very limited version of an autonomous transaction. These are used when different nodes signal each other about changes; for example, BDR uses them for global DDL locking. These messages will also be useful to applications that adopt the logical decoding interface for change streaming, auditing, etc – for example, to record the identity of a user who performs an action when streaming audit history to Kafka.
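To make the relationship between replication slots and replication origins a little more concrete, here is a toy in-memory model in Python. It is only a sketch of the idea, not how PostgreSQL or BDR implement it: the `Upstream` and `Downstream` classes, their method names, and the integer "LSNs" are all invented for illustration. The upstream keeps changes around until its slot position is confirmed, and the downstream remembers the last upstream position it applied, so it knows where to resume.

```python
from dataclasses import dataclass, field


@dataclass
class Upstream:
    """Toy upstream node: an append-only change log plus one slot position."""
    wal: list = field(default_factory=list)  # retained (lsn, change) pairs
    next_lsn: int = 1
    confirmed_lsn: int = 0  # slot-like marker: last position the consumer confirmed

    def write(self, change):
        self.wal.append((self.next_lsn, change))
        self.next_lsn += 1

    def read_after(self, lsn):
        """Return retained changes whose position is greater than `lsn`."""
        return [(pos, change) for pos, change in self.wal if pos > lsn]

    def confirm(self, lsn):
        """Advance the slot marker and drop entries that are no longer needed."""
        self.confirmed_lsn = max(self.confirmed_lsn, lsn)
        self.wal = [(pos, c) for pos, c in self.wal if pos > self.confirmed_lsn]


@dataclass
class Downstream:
    """Toy downstream node: applied changes plus an origin-like progress marker."""
    rows: list = field(default_factory=list)
    origin_lsn: int = 0  # last upstream position applied on this node

    def replay(self, upstream):
        # Resume from the recorded progress, apply, then report back upstream.
        for pos, change in upstream.read_after(self.origin_lsn):
            self.rows.append(change)
            self.origin_lsn = pos
        upstream.confirm(self.origin_lsn)


up = Upstream()
down = Downstream()
up.write("INSERT a")
up.write("INSERT b")
down.replay(up)
up.write("INSERT c")
retained_before = len(up.wal)  # the new change is held until it is confirmed
down.replay(up)
```

In this model the upstream never discards a change the downstream has not confirmed, and the downstream never re-applies a change it has already recorded. That is the division of labour the two features provide in the real system, in a much more robust and crash-safe form.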