Supported failover and failure scenarios v4

Failover Manager monitors a cluster for failures that might result in failover.

Failover Manager supports a very specific and limited set of failover scenarios. Failover can occur:

  • If the primary database crashes or is shut down.
  • If the node hosting the primary database crashes or becomes unreachable.

Failover Manager makes every attempt to verify the accuracy of these conditions. If agents can't confirm that the primary database or node has failed, Failover Manager doesn't perform any failover actions on the cluster.

Failover Manager also supports a no-auto-failover mode for situations in which you want Failover Manager to monitor and detect failover conditions but not perform an automatic failover to a standby. In this mode, Failover Manager notifies the administrator when failover conditions are met. To disable automatic failover, modify the cluster properties file, setting the auto.failover parameter to false.
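For example, the setting looks like this in the cluster properties file. The path and file name shown here are assumptions and vary with your EFM version and cluster name:

    # In the cluster properties file, for example /etc/edb/efm-4.<x>/efm.properties
    # (path and file name are assumptions; adjust for your installation):
    auto.failover=false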

Failover Manager alerts an administrator to situations that require administrator intervention but that don't merit promoting a standby database to primary.

Primary database is down

If the agent running on the primary database node detects a failure of the primary database, Failover Manager begins the process of confirming the failure.

Figure: Confirming the failure of the primary database

If the agent on the primary node detects that the primary database has failed, all agents attempt to connect directly to the primary database. If an agent can connect to the database, Failover Manager sends a notification about the state of the primary node. If no agent can connect, the primary agent declares database failure and releases the virtual IP address (if applicable).

If no agent can reach the virtual IP address or the database server, Failover Manager starts the failover process. The standby agent on the most up-to-date node runs a fencing script (if applicable), promotes the standby database to primary database, and assigns the virtual IP address to the standby node. Any additional standby nodes are configured to replicate from the new primary unless auto.reconfigure is set to false. If applicable, the agent runs a post-promotion script.
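The promotion path described above is governed by settings in the cluster properties file. The sketch below shows the kind of properties involved, with illustrative values; the exact property names and defaults can vary between EFM versions, so confirm them against the properties file template shipped with your installation:

    # Promotion-related settings in the cluster properties file (illustrative values):
    virtual.ip=203.0.113.50                               # optional VIP reassigned to the new primary
    script.fence=/usr/local/bin/fence.sh                  # optional fencing script run before promotion
    script.post.promotion=/usr/local/bin/post_promote.sh  # optional script run after promotion
    auto.reconfigure=true                                 # repoint remaining standbys at the new primary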

Return the node to the cluster

To recover from this scenario without restarting the entire cluster (see the sketch after these steps):

  1. Restart the database on the original primary node as a standby database.
  2. Invoke the efm resume command on the original primary node.
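A minimal sketch of those steps, assuming PostgreSQL 12 or later, a cluster named efm, and a data directory that's already valid for use as a standby (see the note later in this section about rebuilding first); all names and paths are placeholders:

    # On the original primary node. Assumes primary_conninfo already points
    # at the new primary (not shown).
    touch "$PGDATA/standby.signal"   # 1. mark the data directory as a standby
    pg_ctl -D "$PGDATA" start        #    start the database as a standby
    efm resume efm                   # 2. let the local agent resume monitoring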

Return the node to the role of primary

After returning the node to the cluster as a standby, you can return the node to the role of primary (see the sketch after these steps):

  1. If the cluster has more than one standby node, use the efm set-priority command to set the node's failover priority to 1.
  2. Invoke the efm promote -switchover command to promote the node to its original role of primary node.
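A sketch of the switchover, run from any node in the cluster, with efm as a placeholder cluster name and 192.0.2.10 as a placeholder address for the original primary node:

    efm set-priority efm 192.0.2.10 1   # 1. make the node the highest-priority standby
    efm promote efm -switchover         # 2. switch roles back to the original primary
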
Note

Failover Manager doesn't rebuild a failed primary database to become a standby. Before rebuilding, it's important to determine why the primary failed and to ensure that all the data is available on the new primary. Once the old primary is ready to return as a standby, you can remove its old data directory and set the server up as a standby. For more information, refer to the PostgreSQL documentation on Setting up a standby server. In some cases, you can also reinstate the server using pg_rewind.
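For example, a hypothetical pg_rewind invocation on the old primary might look like the following; the data directory and connection string are placeholders, and pg_rewind requires that the old primary ran with wal_log_hints=on or with data checksums enabled and is now cleanly shut down:

    pg_rewind --target-pgdata="$PGDATA" \
              --source-server="host=new-primary port=5444 user=enterprisedb dbname=edb"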

Standby database is down

If a standby agent detects a failure of its database, the agent notifies the other agents. The other agents confirm the state of the database.

Figure: Confirming the failure of a standby database

After returning the standby database to a healthy state, invoke the efm resume command to return the standby to the cluster.
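For example, on the recovered standby node, with efm as a placeholder cluster name:

    efm resume efm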

Primary agent exits or node fails

If the Failover Manager primary agent crashes or the node fails, a standby agent detects the failure and, if appropriate, initiates a failover.

Figure: Confirming the failure of the primary agent

If an agent detects that the primary agent has left, all agents attempt to connect directly to the primary database. If any agent can connect to the database, an agent sends a notification about the failure of the primary agent. If no agent can connect, the agents attempt to ping the virtual IP address (if applicable) to determine if it was released.

If no agent can reach the virtual IP address or the database server, Failover Manager starts the failover process. The standby agent on the most up-to-date node runs a fencing script (if applicable), promotes the standby database to primary database, and assigns the virtual IP address to the standby node. If applicable, the agent runs a post-promotion script. Any additional standby nodes are configured to replicate from the new primary unless auto.reconfigure is set to false.

If this scenario occurred because the primary was isolated from the network, the primary agent detects the isolation, releases the virtual IP address, and creates the recovery.conf file. Failover Manager performs the same failover steps on the remaining nodes of the cluster.
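After a failover like this, you can check the cluster's view of node roles from any node with Failover Manager's cluster-status command, shown here as a sketch with efm as a placeholder cluster name:

    efm cluster-status efm   # lists each node's role and status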

To recover from this scenario without restarting the entire cluster (see the sketch after these steps):

  1. Restart the original primary node.
  2. Bring the original primary database up as a standby database.
  3. Start the service on the original primary node.
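A minimal sketch of steps 2 and 3, assuming PostgreSQL 12 or later, a data directory in $PGDATA, and a version-specific Failover Manager service unit name (all placeholders):

    # On the original primary node, after the node itself is back up:
    touch "$PGDATA/standby.signal"      # 2. bring the database up as a standby
    pg_ctl -D "$PGDATA" start
    sudo systemctl start edb-efm-4.10   # 3. start the agent service (unit name is an assumption)
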
Note

Stopping an agent doesn't signal the cluster that the agent failed.

If the Failover Manager agent on the primary node fails, there's no failover protection until the agent is restarted. To avoid this case, you can set up the primary node through systemd to cause a failover when the primary agent exits. For more information, see Configuring for Eager Failover.

Standby agent exits or node fails

If a standby agent exits or a standby node fails, the other agents detect that it's no longer connected to the cluster.

Figure: Failure of standby agent

When the failure is detected, the agents attempt to contact the database that resides on the node. If the agents confirm that there's a problem, Failover Manager sends the appropriate notification to the administrator.

If there is only one primary and one standby remaining, there's no failover protection in the case of a primary node failure. In the case of a primary database failure, the primary and standby agents can agree that the database failed and proceed with failover.

Dedicated witness agent exits or node fails

This scenario details the actions taken if a dedicated witness (a node that is not hosting a database) fails.

Figure: Confirming the failure of a dedicated witness

When an agent detects that the witness node can't be reached, Failover Manager notifies the administrator of the state of the witness.

Note

If the witness fails and only two nodes remain, there's no failover protection for a primary node failure, because the standby can't tell whether the primary failed or was disconnected. However, if the primary database fails while the two nodes are still connected, failover still occurs, because the standby can confirm the condition of the primary database.

Nodes become isolated from the cluster

This scenario details the actions taken if one or more nodes (a minority of the cluster) become isolated from the majority of the cluster.

Figure: If members of the cluster become isolated

If one or more nodes (but fewer than half of the cluster) become isolated from the rest of the cluster, the remaining cluster behaves as if those nodes have failed. The agents attempt to discern whether the primary node is among the isolated nodes. If it is, the primary fences itself off from the cluster, while a standby node from within the cluster majority is promoted to replace it. Other standby nodes are configured to replicate from the new primary unless auto.reconfigure is set to false.

Failover Manager then notifies an administrator, and the isolated nodes rejoin the cluster when they can. When the nodes rejoin the cluster, the failover priority might change.