PGDGroup parting v2.0.0

Parting nodes using annotations

The operator supports parting all nodes in a PGDGroup using the k8s.pgd.enterprisedb.io/part annotation. This provides a way to gracefully remove a PGDGroup from the PGD cluster without deleting the Kubernetes resource itself.

This is particularly useful in multi-region deployments where you need to decommission one region's nodes from the PGD cluster while keeping the Kubernetes resources intact for cleanup operations.

How to part a group

To part all nodes in a PGDGroup, add the k8s.pgd.enterprisedb.io/part annotation with the value on:

kubectl annotate pgdgroup region-b k8s.pgd.enterprisedb.io/part=on

Once the annotation is applied, the operator will:

  1. Detect the part annotation and enter the PGD - Parting node from group phase.
  2. Connect to an active PGD node (preferring the write leader) and execute bdr.part_node for every data node and witness node in the group.
  3. Mark each successfully parted node's CNP Cluster with the annotation k8s.pgd.enterprisedb.io/part=parted.
  4. Transition to the PGDGroup - Parted phase once all nodes are parted.
  5. Delete the parted CNP Cluster resources from Kubernetes.

You can monitor the progress by checking the PGDGroup status:

kubectl get pgdgroups

NAME       DATA INSTANCES   WITNESS INSTANCES   PHASE                  AGE
region-a   2                1                   PGDGroup - Healthy     25m
region-b   2                1                   PGDGroup - Parted      25m
region-c   0                1                   PGDGroup - Healthy     25m
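
For scripting, the same check can be reduced to a single value. The sketch below is a minimal example that assumes the PHASE column is backed by a .status.phase field on the PGDGroup resource; the text above does not state the field path, so verify it against your CRD. The helper names are hypothetical.

```shell
# Print the current phase of a PGDGroup.
# Assumption: the phase shown by `kubectl get pgdgroups` is stored in .status.phase.
pgdgroup_phase() {
  kubectl get pgdgroup "$1" -o jsonpath='{.status.phase}'
}

# Succeed only when the group reports the "PGDGroup - Parted" phase.
is_parted() {
  [ "$(pgdgroup_phase "$1")" = "PGDGroup - Parted" ]
}
```

For example, `is_parted region-b && echo "region-b is parted"` prints the message only after the operator has finished parting the group.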

Running cleanup after parting

After all nodes in the PGDGroup have been parted, the group metadata (for example, entries in bdr.node_group) remains in the global PGD catalog. Create a PGDGroupCleanup resource to remove this metadata. See PGDGroup cleanup below for details.

A typical workflow for decommissioning a region is:

  1. Part the group using the annotation.
  2. Wait for the group to reach the PGDGroup - Parted phase.
  3. Create a PGDGroupCleanup to clean up the parted nodes and drop the group from the PGD catalog.
  4. Delete the PGDGroup Kubernetes resource.
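
The first three steps of this workflow can be sketched as a shell function. This is an illustration, not an official tool: decommission_group and POLL_INTERVAL are hypothetical names, the .status.phase field path is an assumption, and the manifest argument is a file containing a PGDGroupCleanup resource. Because the text does not name the phase that signals a finished cleanup, the sketch stops before step 4 and prints the final command instead of running it.

```shell
# Hypothetical helper: decommission_group <group> <cleanup-manifest.yaml> [timeout-seconds]
decommission_group() {
  local group="$1" manifest="$2" timeout="${3:-600}"
  local interval="${POLL_INTERVAL:-10}" waited=0

  # Step 1: part the group (irreversible).
  kubectl annotate pgdgroup "$group" k8s.pgd.enterprisedb.io/part=on || return 1

  # Step 2: poll until the group reports the parted phase.
  # Assumption: the phase is exposed as .status.phase.
  until [ "$(kubectl get pgdgroup "$group" -o jsonpath='{.status.phase}')" = "PGDGroup - Parted" ]; do
    if [ "$waited" -ge "$timeout" ]; then
      echo "timed out waiting for $group to be parted" >&2
      return 1
    fi
    sleep "$interval"
    waited=$((waited + interval))
  done

  # Step 3: create the PGDGroupCleanup resource.
  kubectl apply -f "$manifest" || return 1

  # Step 4 is left to the operator, after confirming the cleanup has finished.
  echo "Cleanup submitted. When it completes, run: kubectl delete pgdgroup $group"
}
```

Keeping step 4 manual avoids deleting the PGDGroup resource while the cleanup is still running.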

Important considerations

Warning

The part annotation is irreversible. Once the annotation is set to on, it cannot be removed. The webhook validation rejects any update that attempts to remove the part annotation from a parted group.

Warning

While the part annotation is active, no changes to the PGDGroup .spec are allowed. The webhook rejects any spec modifications on a parted group.
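
Because parting can't be undone, a script may want a guard in front of the annotate command. The sketch below is one possible approach under stated assumptions: the function names are hypothetical, and requiring the caller to repeat the group name is an illustrative safety convention, not an operator feature. The dots in the annotation key are escaped so that kubectl's JSONPath parser treats the key as a single field.

```shell
# Print the current value of the part annotation (empty if unset).
part_annotation() {
  kubectl get pgdgroup "$1" \
    -o jsonpath='{.metadata.annotations.k8s\.pgd\.enterprisedb\.io/part}'
}

# Annotate only if the group is not already marked and the caller
# confirms by passing the group name a second time.
part_group_guarded() {
  local group="$1" confirm="$2"
  if [ -n "$(part_annotation "$group")" ]; then
    echo "$group already carries the part annotation" >&2
    return 1
  fi
  if [ "$confirm" != "$group" ]; then
    echo "refusing to part $group: confirmation does not match" >&2
    return 1
  fi
  kubectl annotate pgdgroup "$group" k8s.pgd.enterprisedb.io/part=on
}
```

A call such as `part_group_guarded region-b region-b` proceeds, while a mismatched second argument leaves the group untouched.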

Deletion and finalizers

When you delete a PGDGroup, the operator first connects to an active instance and parts every node in the target group. Once a node is parted, it no longer participates in replication or consensus operations. To make sure each node is correctly parted before it is deleted, the operator uses the k8s.pgd.enterprisedb.io/partNodes finalizer. See the Kubernetes documentation on finalizers for context.
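
To check whether the finalizer is present, you can read the standard metadata.finalizers list. The sketch below assumes the finalizer is attached to the PGDGroup resource itself, which the text implies but does not state outright; has_part_finalizer is a hypothetical helper name.

```shell
# Succeed if the named PGDGroup lists the partNodes finalizer.
has_part_finalizer() {
  # {.metadata.finalizers[*]} prints the entries separated by spaces;
  # split them onto lines and look for an exact, literal match.
  kubectl get pgdgroup "$1" -o jsonpath='{.metadata.finalizers[*]}' \
    | tr ' ' '\n' \
    | grep -Fqx 'k8s.pgd.enterprisedb.io/partNodes'
}
```

A PGDGroup that stays in a terminating state while this check succeeds is most likely still waiting for its nodes to be parted.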

Note

If a namespace holding a PGDGroup is deleted directly, the operator cannot ensure that the parting and deletion sequence is carried out correctly. Before deleting a namespace, delete all the PGDGroups it contains.
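
That ordering can be scripted as a short sketch. delete_namespace_safely is a hypothetical helper; it relies only on standard kubectl delete flags.

```shell
# Delete every PGDGroup in a namespace, then the namespace itself.
delete_namespace_safely() {
  local ns="$1"
  # --wait=true blocks until the PGDGroups are fully removed,
  # which gives their finalizers time to run.
  kubectl delete pgdgroups --all -n "$ns" --wait=true || return 1
  kubectl delete namespace "$ns"
}
```

If deleting the PGDGroups fails, the function returns early and leaves the namespace in place.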

Time limit

When parting a node, the operator needs to connect to an active instance to execute the bdr.part_node function. To keep this operation from hanging, the finalizer has a time limit, which is 300 seconds by default. When the time limit expires, the finalizer is removed and the node is deleted anyway, potentially leaving stale metadata in the global PGD catalog. You can configure the time limit, in seconds, through spec.failingFinalizerTimeLimitSeconds.
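
As an illustration, the limit can be changed with a merge patch. This is a sketch: set_finalizer_time_limit is a hypothetical helper and 600 is an arbitrary example value. Since spec changes are rejected once the part annotation is active, apply any change before parting the group.

```shell
# Set spec.failingFinalizerTimeLimitSeconds on a PGDGroup.
set_finalizer_time_limit() {
  local group="$1" seconds="$2"
  # Accept only non-negative integers so the JSON patch stays valid.
  case "$seconds" in
    ''|*[!0-9]*) echo "seconds must be a non-negative integer" >&2; return 1 ;;
  esac
  kubectl patch pgdgroup "$group" --type merge \
    -p "{\"spec\":{\"failingFinalizerTimeLimitSeconds\":$seconds}}"
}
```

For example, `set_finalizer_time_limit region-b 600` doubles the default limit.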

Skip finalizer

For testing purposes only, the operator also provides an annotation to skip the finalizer: k8s.pgd.enterprisedb.io/noFinalizers. When this annotation is added to a PGDGroup, the finalizer will be skipped when the PGDGroup is being deleted, and the nodes will not be parted from the PGD cluster.

PGDGroup cleanup

Clean up parted nodes

Once all the nodes belonging to a PGDGroup are parted (either via the part annotation or after deleting the PGDGroup), the group information remains in PGD metadata such as bdr.node_group. The PGD4K operator defines a CRD named PGDGroupCleanup that drops the PGDGroup from the PGD catalog and cleans up any parted nodes belonging to the group.

In the example below, the PGDGroupCleanup executes locally from region-a, and will clean up all parted nodes of region-b, with the prerequisite that all the nodes must be in the PARTED state. It will then drop the PGDGroup region-b.

apiVersion: pgd.k8s.enterprisedb.io/v1beta1
kind: PGDGroupCleanup
metadata:
  name: region-b-cleanup
spec:
  executor: region-a
  target: region-b
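
When several groups need cleaning up, the manifest above can be generated rather than written by hand. The sketch below is a hypothetical helper (render_cleanup) that reproduces the same fields; the <target>-cleanup naming simply mirrors the example and is not a requirement stated in the text.

```shell
# Print a PGDGroupCleanup manifest for the given executor and target groups.
render_cleanup() {
  local executor="$1" target="$2"
  cat <<EOF
apiVersion: pgd.k8s.enterprisedb.io/v1beta1
kind: PGDGroupCleanup
metadata:
  name: ${target}-cleanup
spec:
  executor: ${executor}
  target: ${target}
EOF
}
```

The output can be piped straight to the API server, for example `render_cleanup region-a region-b | kubectl apply -f -`.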

If the target group (region-b in the example) contains nodes that are not in the PARTED state, the cleanup stops in the phase PGDGroupCleanup - Waiting for nodes in target PGDGroup to be parted. In cases of extreme need, you can add the force option to force-part those nodes.

Warning

Using force can leave the PGD cluster in an inconsistent state. Use it only to recover from failures in which you can't part the group nodes any other way.

apiVersion: pgd.k8s.enterprisedb.io/v1beta1
kind: PGDGroupCleanup
metadata:
  name: region-b-cleanup
spec:
  force: true
  executor: region-a
  target: region-b