Using Failover Manager with virtual IP addresses v4
Failover Manager can be used along with a virtual IP address (VIP) for routing requests to the current primary node.
Cloud provider support and alternatives
Virtual IP addresses aren't supported by many cloud providers. In those environments, use another mechanism, such as an elastic IP address on AWS, that can be changed when needed by a fencing or post-promotion script.
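As a rough illustration of the elastic IP alternative, the following sketch shows a helper that a post-promotion script might call on AWS. It's a minimal example under stated assumptions, not part of Failover Manager. The function name move_elastic_ip, the DRY_RUN switch, and the allocation and instance IDs are hypothetical. The AWS CLI must be installed and authorized separately, and how the script is wired into Failover Manager isn't shown here.

```shell
#!/bin/sh
# Hypothetical helper for a post-promotion script: reassociate an AWS
# Elastic IP with the instance that was just promoted.
# Usage: move_elastic_ip ALLOCATION_ID INSTANCE_ID
# Both IDs are placeholders that you supply. They aren't values from this guide.
move_elastic_ip() {
    allocation_id=$1
    instance_id=$2

    # With DRY_RUN=1, print the command instead of running it.
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "aws ec2 associate-address --allocation-id $allocation_id --instance-id $instance_id --allow-reassociation"
        return 0
    fi

    # --allow-reassociation lets the address move even if it's still
    # associated with the old primary's instance.
    aws ec2 associate-address \
        --allocation-id "$allocation_id" \
        --instance-id "$instance_id" \
        --allow-reassociation
}
```

Running the helper with DRY_RUN=1 first lets you check the generated command before it changes any address association.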
Failover Manager will not drop the virtual IP address from the primary node when the agent for that node shuts down. As a convenience for testing, the primary node's agent will acquire the VIP during startup if the node does not already have it, but otherwise starting and stopping Failover Manager has no effect on whether the node holds the virtual IP address.
This allows you to upgrade and perform maintenance on EFM services without interrupting access to the database.
The VIP should be initially assigned to the primary node. When EFM detects failure of the primary node's database, it will release the VIP and then assign it to a standby node as that node is promoted to be the new primary.
During promotion of a standby, EFM uses the command configured in the ping.server.command cluster property to verify that the VIP isn't currently in use. It doesn't promote a new primary node unless the ping indicates that the VIP is unreachable. You can disable this behavior with the check.vip.before.promotion cluster property.
Meaning of the ping command exit code
Failover Manager uses the exit code of the ping command to determine whether an address is reachable. A zero exit code indicates the address is reachable (in this context, this means the VIP is assigned). A non-zero exit code indicates the address isn't reachable (in this context, this means the VIP is unassigned).
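The exit-code convention described in this note can be sketched as a small shell function. This is an illustration only, not Failover Manager's own code. The function name vip_state is hypothetical, and the address in the usage comment is a placeholder from a documentation range.

```shell
# vip_state CMD [ARGS...]
# Runs a reachability command and reports how its exit code would be read:
# zero means reachable (VIP assigned), non-zero means unreachable (VIP unassigned).
vip_state() {
    if "$@" >/dev/null 2>&1; then
        echo "assigned"
    else
        echo "unassigned"
    fi
}

# Example with placeholder values (replace with your own ping command and VIP):
#   vip_state ping -q -c 3 -w 5 192.0.2.10
```

Because the function looks only at the exit code, any replacement for ping works as long as it follows the same zero and non-zero convention.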
If a VIP address or any address other than the bind.address is assigned to a node, the operating system can choose the source address used when contacting the database. Be sure to modify the pg_hba.conf file on all monitored databases to allow contact from all addresses within your replication scenario.
The network interface used for the VIP doesn't have to be the same interface used for the Failover Manager agent's bind.address value. The primary agent drops the VIP as needed during a failover, and Failover Manager verifies that the VIP is no longer available before promoting a standby. A failure of the bind address network leads to primary isolation and failover.
If the VIP uses a different interface from the bind.address, you might encounter a timing condition in which the rest of the cluster checks for a reachable VIP before the primary agent drops it. In this case, Failover Manager retries the VIP check for the number of seconds specified in the node.timeout property to help ensure that a failover happens as expected.
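The retry behavior can be pictured with the following sketch. It isn't Failover Manager's implementation, only a minimal illustration of repeating a reachability check until the VIP stops responding or a time limit passes. The function name wait_for_vip_release and the one-second polling interval are assumptions, and the elapsed time counts only the pauses between checks.

```shell
# wait_for_vip_release TIMEOUT_SECONDS CMD [ARGS...]
# Repeats CMD (a reachability check) about once per second.
# Returns 0 as soon as CMD fails, meaning the VIP is no longer reachable.
# Returns 1 if CMD still succeeds after TIMEOUT_SECONDS of waiting.
wait_for_vip_release() {
    timeout_seconds=$1
    shift
    elapsed=0
    while "$@" >/dev/null 2>&1; do
        if [ "$elapsed" -ge "$timeout_seconds" ]; then
            return 1
        fi
        sleep 1
        elapsed=$((elapsed + 1))
    done
    return 0
}

# Example with placeholder values (timeout, ping options, and address are yours to set):
#   wait_for_vip_release 50 ping -q -c 1 -w 1 192.0.2.10
```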
Failover Manager uses the efm_address script to assign or release a virtual IP address.
The script resides in:
Failover Manager uses the following command variations to assign or release an IPv4 or IPv6 IP address.
To assign a virtual IPv4 IP address:
To assign a virtual IPv6 IP address:
To release a virtual address:
<interface_name> matches the name specified in the virtual.ip.interface property in the cluster properties file.
<IPv6_addr> matches the value specified in the virtual.ip property in the cluster properties file.
prefix matches the value specified in the virtual.ip.prefix property in the cluster properties file.
For more information about properties that describe a virtual IP address, see The cluster properties file.
You must run the efm_address script as the root user. The efm user is created during installation and is granted privileges in the sudoers file to run the efm_address script. For more information about the sudoers file, see Extending Failover Manager permissions.
When using a virtual IP (VIP) address with Failover Manager, it's important to test the VIP functionality manually before starting Failover Manager. This catches any network-related issues before they cause a problem during an actual failover.
While testing the VIP, make sure that Failover Manager isn't running.
The following steps test the actions that Failover Manager takes. The example uses the following property values:
virtual.ip.prefix specifies the number of significant bits in the virtual IP address.
When instructed to ping the VIP from a node, use the command defined by the ping.server.command property, and run it from the machine configured in EFM for the appropriate role (primary, standby, or witness).
Ping the VIP from all nodes to confirm that the address isn't already in use:
You will see 100% packet loss when the address is unused.
Meaning of the ping command exit code for unreachable addresses
Failover Manager uses the exit code of the ping command to determine whether the address was reachable. In this case, the exit code isn't zero. If you're using a command other than ping, it must return a non-zero exit code when the address isn't reachable.
Use the efm_address add4 command on the machine configured as the primary node to assign the VIP, and then confirm with ip address:
Ping the VIP from the other nodes to verify that they can reach the VIP:
You will see 0% packet loss, indicating the IP now reaches the machine configured as the primary node.
Meaning of the ping command exit code for reachable addresses
Failover Manager uses the exit code of the ping command to determine whether the address was reachable. In this case, the exit code is zero. If you're using a command other than ping, it must return a zero exit code when the address is reachable.
Use the efm_address del command to release the address on the primary node, and confirm that the VIP was released with the ip address command:
The output from this step will no longer show the VIP address on the eth0 interface.
Repeat step 3, this time verifying that the standby and witness don't see the VIP in use:
You will see 100% packet loss, as in the first ping test. Repeat this step on all nodes.
Repeat steps 2, 3 and 4 on all standby nodes to verify that the VIP can be successfully assigned to and released from every node. You can ping the VIP from any node to verify that it's in use.
After these test steps, release the VIP from any nonprimary node before attempting to start Failover Manager.
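To double-check which nodes still hold the VIP after testing, you can inspect the output of the ip command on each node. The following sketch is one way to do that, assuming the one-line output format of ip -o address show from iproute2. The function name holds_address is hypothetical, and the address and interface in the usage comment are placeholders.

```shell
# holds_address ADDRESS/PREFIX
# Reads `ip -o address show` output on stdin and succeeds (exit code 0)
# only if the given address/prefix appears as an inet or inet6 entry.
holds_address() {
    awk -v want="$1" '
        ($3 == "inet" || $3 == "inet6") && $4 == want { found = 1 }
        END { exit (found ? 0 : 1) }
    '
}

# Example with placeholder values:
#   if ip -o address show dev eth0 | holds_address 192.0.2.10/24; then
#       echo "this node still holds the VIP"
#   fi
```

The function reads from standard input rather than calling ip itself, so you can also run it against saved output from another node.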