Creating Raft subgroups using TPA v5

The tpaexec configure command enables Raft subgroups if the --enable_proxy_routing local option is set. TPA uses the term locations to reflect the common use case of subgroups that map to physical or regional domains. When the configuration is generated, each location name is stored under a generated subgroup name that's based on the location name. For example, the location us_east is stored under the subgroup us_east_subgroup.

Creating Raft subgroups using TPA

This example creates a two-location cluster with three data nodes in each location. The nodes in each location are part of a PGD Raft subgroup for the location.

The top-level group's name is pgdgroup.

The top-level group has two locations: us_east and us_west. These locations are mapped to two subgroups: us_east_subgroup and us_west_subgroup.

Each location has four nodes: three data nodes and a barman backup node. The three data nodes also cohost PGD Proxy. The configuration can be visualized like so:

[Figure: 6 Node Cluster with 2 Raft Subgroups]

The barman nodes don't participate in the subgroup and, by extension, the Raft group. They're therefore not shown. This diagram is a snapshot of one potential state of the cluster, in which the West Raft group has selected west_1 as write leader and west_2 as its own Raft leader. In the East group, east_1 is write leader while east_3 is Raft leader. The entire cluster is contained within the top-level Raft group, where west_3 is currently Raft leader.

To create this configuration, you run:

tpaexec configure pgdgroup --architecture PGD-Always-ON --location-names us_east us_west --data-nodes-per-location 3 --epas 16 --no-redwood --enable_proxy_routing local --hostnames-from hostnames.txt 

Where hostnames.txt contains:

east1
east2
east3
eastbarman
west1
west2
west3
westbarman
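
As a convenience, a file like this can be generated rather than typed by hand. The following is a minimal sketch that reproduces the list above for two locations, each with three data nodes and one barman node. The east/west prefixes and the file name are this example's; the loop itself is only an illustration and isn't part of TPA.

```shell
# Generate hostnames.txt: for each location prefix, three data nodes
# followed by one barman node, in the same order as the list above.
: > hostnames.txt            # create or truncate the file
for loc in east west; do
  for n in 1 2 3; do
    echo "${loc}${n}" >> hostnames.txt
  done
  echo "${loc}barman" >> hostnames.txt
done
```

The result has eight lines, one per instance that the configure command creates in this example.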

The configuration file

The generated config.yml file has a bdr_node_groups section that contains the top-level group pgdgroup and the two subgroups us_east_subgroup and us_west_subgroup. Each of those subgroups has a location set (us_east and us_west) and two other options that are set to true:

  • enable_raft, which activates Raft in the subgroup
  • enable_proxy_routing, which enables the pgd_proxy routers to route traffic to the subgroup’s write leader

Here's an example generated by the sample tpaexec command:

cluster_vars:
  apt_repository_list: []
  bdr_database: bdrdb
  bdr_node_group: pgdgroup
  bdr_node_groups:
  - name: pgdgroup
  - name: us_east_subgroup
    options:
      enable_proxy_routing: true
      enable_raft: true
      location: us_east
    parent_group_name: pgdgroup
  - name: us_west_subgroup
    options:
      enable_proxy_routing: true
      enable_raft: true
      location: us_west
    parent_group_name: pgdgroup
  bdr_version: '5'
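
To see at a glance which subgroups a generated config.yml defines, a quick text scan can be enough. The sketch below defines a hypothetical helper, list_subgroups, that prints each subgroup with its parent group. It assumes the file uses the same indentation as the excerpt above and isn't a YAML parser, so treat it as a convenience check only.

```shell
# list_subgroups FILE: print "<subgroup> <parent>" for every group entry
# that has a parent_group_name. Hypothetical helper, not part of TPA.
# Relies on the indentation used in the generated excerpt above.
# Usage: list_subgroups config.yml
list_subgroups() {
  awk '
    /^  - name:/              { name = $3 }        # remember the current group
    /^    parent_group_name:/ { print name, $2 }   # emit group and its parent
  ' "$1"
}
```

Run against the excerpt above, this lists us_east_subgroup and us_west_subgroup, each with pgdgroup as parent. The top-level group has no parent_group_name and so isn't listed.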

Every node instance has an entry in the instances list. For each data node, the vars section of that entry sets bdr_child_group to the subgroup the node belongs to. Here's an excerpt for the us_east location, generated by the sample tpaexec command:

instances:
- Name: east1
  backup: eastbarman
  location: us_east
  node: 1
  role:
  - bdr
  - pgd-proxy
  vars:
    bdr_child_group: us_east_subgroup
    bdr_node_options:
      route_priority: 100
- Name: east2
  location: us_east
  node: 2
  role:
  - bdr
  - pgd-proxy
  vars:
    bdr_child_group: us_east_subgroup
    bdr_node_options:
      route_priority: 100
- Name: east3
  location: us_east
  node: 3
  role:
  - bdr
  - pgd-proxy
  vars:
    bdr_child_group: us_east_subgroup
    bdr_node_options:
      route_priority: 100
- Name: eastbarman
  location: us_east
  node: 4
  role:
  - barman

The one node in this location that doesn't have a bdr_child_group setting is the barman node, because it doesn't participate in the Raft decision-making process.
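
The subgroup membership described above can be checked with a similar text scan. The sketch below defines a hypothetical helper, list_child_groups, that prints each instance name with its bdr_child_group, or (none) when the setting is absent. It assumes the layout and indentation of the generated instances excerpt and doesn't parse YAML in general.

```shell
# list_child_groups FILE: print "<instance> <bdr_child_group>" for every
# entry in the instances list, or "<instance> (none)" when unset.
# Hypothetical helper, not part of TPA; relies on the excerpt's indentation.
# Usage: list_child_groups config.yml
list_child_groups() {
  awk '
    function emit() {
      if (name != "") print name, (group != "" ? group : "(none)")
    }
    /^- Name:/              { emit(); name = $3; group = "" }  # new instance
    /^    bdr_child_group:/ { group = $2 }                     # its subgroup
    END                     { emit() }                         # last instance
  ' "$1"
}
```

For the us_east excerpt, the three data nodes report us_east_subgroup and eastbarman reports (none).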