Cluster bootstrapping

To use HARP, a minimum amount of metadata must exist in the DCS. The process of "bootstrapping" a cluster essentially means initializing node, location, and other runtime configuration either all at once or on a per-resource basis.

This process is governed through the harpctl apply command. For more information, see harpctl command-line tool.

Set up the DCS and make sure it is functional before bootstrapping.
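For example, if etcd is serving as the DCS, you can confirm it's reachable before applying any bootstrap files. The endpoint shown here is an assumption; substitute the addresses of your own etcd cluster.

etcdctl endpoint health --endpoints=http://etcd1:2379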

Important

You can combine any or all of these examples into a single YAML document and apply it all at once.
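For instance, the cluster-wide and location examples that follow could be merged into one file. This is only a sketch; each block is explained in its own section below.

cluster:
  name: mycluster
  event_sync_interval: 100

locations:
  - location: dc1
  - location: dc2

You then apply the combined document with harpctl apply, using whatever file name you gave it.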

Cluster-wide bootstrapping

Some settings are applied cluster-wide and you can specify them during bootstrapping. Currently this applies only to the event_sync_interval runtime directive, but others might be added later.

The format is as follows:

cluster: 
  name: mycluster
  event_sync_interval: 100

Assuming that file was named cluster.yml, you then apply it with the following:

harpctl apply cluster.yml

If the cluster name isn't already defined in the DCS, this also initializes that value.

Important

The cluster name parameter specified here always overrides the cluster name supplied in config.yml. The assumption is that the bootstrap file supplies all necessary elements to bootstrap a cluster or some portion of its larger configuration. A config.yml file is primarily meant to control the execution of HARP Manager, HARP Proxy, or harpctl specifically.
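For reference, the cluster name that config.yml supplies appears under the same kind of cluster block. The snippet below is only a sketch of that one setting, not a complete config.yml.

cluster:
  name: mycluster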

Location bootstrapping

Every HARP node is associated with at most one location. This location can be a single data center, a grouped region consisting of multiple underlying servers, an Amazon availability zone, and so on. This is a logical structure that allows HARP to group nodes together so that only one node acts as the lead master for that location.

Thus it is necessary to initialize one or more locations. The format for this is as follows:

cluster: 
  name: mycluster

locations:
  - location: dc1
  - location: dc2

Assuming that file was named locations.yml, you then apply it with the following:

harpctl apply locations.yml

When performing any manipulation of the cluster, include the cluster name as a preamble so that the changes are directed to the right cluster.

Once locations are bootstrapped, they show up with a quick examination:

> harpctl get locations

Cluster   Location Leader Previous Leader Target Leader Lease Renewals 
-------   -------- ------ --------------- ------------- -------------- 
mycluster dc1                                           <nil>          
mycluster dc2                                           <nil>          

Both locations are recognized by HARP and available for node and proxy assignment.

Node bootstrapping

HARP nodes exist in a named cluster and must have a designated name. Beyond this, all other settings are retained in the DCS, as they are dynamic and can affect how HARP interacts with them. To this end, bootstrap each node using one or more of the runtime directives discussed in Configuration.

While bootstrapping a node, there are a few required fields:

  • name
  • location
  • dsn
  • pg_data_dir

Everything else is optional and can depend on the cluster. Because you can bootstrap multiple nodes at once, the format generally fits this structure:

cluster: 
  name: mycluster

nodes:
  - name: node1
    location: dc1
    dsn: host=node1 dbname=bdrdb user=postgres 
    pg_data_dir: /db/pgdata
    leader_lease_duration: 10
    priority: 500

Assuming that file was named node1.yml, you then apply it with the following:

harpctl apply node1.yml
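Because you can bootstrap multiple nodes from a single document, a file covering two nodes might look like the following. This is a sketch: the node names, DSNs, and locations are illustrative and depend on your cluster.

cluster:
  name: mycluster

nodes:
  - name: node1
    location: dc1
    dsn: host=node1 dbname=bdrdb user=postgres
    pg_data_dir: /db/pgdata
  - name: node2
    location: dc2
    dsn: host=node2 dbname=bdrdb user=postgres
    pg_data_dir: /db/pgdata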

Once nodes are bootstrapped, they show up with a quick examination:

> harpctl get nodes

Cluster     Name  Location Ready Fenced Allow Routing Routing Status Role    Type Lock Duration 
-------     ----  -------- ----- ------ ------------- -------------- ----    ---- -------------
mycluster   node1 dc1      true  false  true          ok             primary bdr  30

Proxy bootstrapping

Unlike locations or nodes, proxies can also supply a configuration template that is applied to all proxies in a location. These are stored in the DCS under the global designation. Each proxy also requires a name to exist as an instance, but no further customization is needed unless some setting needs a specific override.

This is because there are likely to be multiple proxies that have the same default configuration settings for the cluster, and repeating these values for every single proxy isn't necessary.

Additionally, when bootstrapping the proxy template, define at least one database for connection assignments. With these notes in mind, the format for this is as follows:

cluster: 
  name: mycluster

proxies:
  monitor_interval: 5
  default_pool_size: 20
  max_client_conn: 1000
  database_name: bdrdb
  instances:
    - name: proxy1
    - name: proxy2
      default_pool_size: 50

This configures HARP for two proxies: proxy1 and proxy2. Only proxy2 overrides default_pool_size; both proxies otherwise use the global settings.

Assuming that file was named proxy.yml, you then apply it with the following:

harpctl apply proxy.yml

Once the proxy template is bootstrapped, it shows up with a quick examination:

> harpctl get proxies

Cluster   Name   Pool Mode Auth Type Max Client Conn Default Pool Size 
-------   ----   --------- --------- --------------- ----------------- 
mycluster global session   md5       1000            20                
mycluster proxy1 session   md5       1000            20                
mycluster proxy2 session   md5       1000            50