3.2 Configuring Failover Manager


The efm.properties file contains the properties of the individual node on which it resides, while the efm.nodes file contains a list of the current Failover Manager cluster members.
The Failover Manager installer creates a file template for the cluster properties file named efm.properties.in in the /etc/efm-2.1 directory. After completing the Failover Manager installation, you must make a working copy of the template before modifying the file contents.
The following command copies the efm.properties.in file, creating a properties file named efm.properties:
# cp /etc/efm-2.1/efm.properties.in /etc/efm-2.1/efm.properties
Please note: By default, Failover Manager expects the cluster properties file to be named efm.properties. If you name the properties file something other than efm.properties, you must modify the service script to instruct Failover Manager to use a different name.
The property files are owned by root. The Failover Manager service script expects to find the files in the /etc/efm-2.1 directory. If you move the property file to another location, you must create a symbolic link that specifies the new location.
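The copy and symbolic-link steps described above can be sketched as follows. This is a minimal illustration, not the documented procedure: it runs in a scratch directory rather than the real /etc/efm-2.1, the one-line template is a stand-in, and the opt/efm-config location is a hypothetical name.

```shell
# Scratch directory standing in for the filesystem root (assumption for the demo).
demo_root=$(mktemp -d)
efm_dir="$demo_root/etc/efm-2.1"
new_dir="$demo_root/opt/efm-config"    # hypothetical new location
mkdir -p "$efm_dir" "$new_dir"

# Stand-in for the template that the installer creates.
printf 'auto.failover=true\n' > "$efm_dir/efm.properties.in"

# Make a working copy of the template.
cp "$efm_dir/efm.properties.in" "$efm_dir/efm.properties"

# Move the working copy elsewhere, then link it back to the directory
# in which the service script expects to find it.
mv "$efm_dir/efm.properties" "$new_dir/efm.properties"
ln -s "$new_dir/efm.properties" "$efm_dir/efm.properties"

ls -l "$efm_dir"
```

On a real node the same commands would be run as root against /etc/efm-2.1.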
Note that you must use the efm encrypt command to encrypt the value supplied in the db.password.encrypted parameter. For more information about encrypting a password, see Section 3.2.1.2.
Use the parameters in the efm.properties file to specify connection, administrative, and operational details for Failover Manager.
Use the efm.license parameter to provide the Failover Manager product key.
The db.user specified must have sufficient privileges to invoke selected PostgreSQL commands on behalf of Failover Manager. For more information, please see Section 2.2.
Use the db.service.owner parameter to specify the name of the operating system user that owns the cluster that is being managed by Failover Manager. This property is not required on a dedicated witness node.
Specify the name of the database server in the db.service.name parameter if you use the service or systemctl command when starting or stopping the service.
You should use the same service control mechanism (pg_ctl, service, or systemctl) each time you start or stop the database service. If you use the pg_ctl program to control the service, specify the location of the pg_ctl program in the db.bin parameter.
# Specify the directory containing the pg_ctl command, for
# example: /usr/pgsql-9.3/bin. The pg_ctl command is used to
# restart standby databases after a failover so that they are
# streaming from the new master node. Unless the db.service.name
# property is used, the pg_ctl command is used to
# start/stop/restart databases as needed after a
# failover or switchover. This property is required unless
# db.service.name is set.
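The rule that db.bin is required unless db.service.name is set can be checked before starting an agent. The sketch below is an assumption-laden illustration: get_prop and check_service_control are hypothetical helper names, and the properties file is a temporary stand-in rather than a real efm.properties.

```shell
# Print the value of a property from a properties file (last definition wins).
# Prints nothing if the property is absent or commented out.
get_prop() {
    prop_pattern=$(printf '%s' "$1" | sed 's/\./\\./g')   # escape dots for sed
    sed -n "s/^${prop_pattern}=//p" "$2" | tail -n 1
}

# Report which service control mechanism the file selects, or fail if neither
# db.service.name nor db.bin has a value.
check_service_control() {
    service_name=$(get_prop db.service.name "$1")
    db_bin=$(get_prop db.bin "$1")
    if [ -n "$service_name" ]; then
        echo "service:$service_name"
    elif [ -n "$db_bin" ]; then
        echo "pg_ctl:$db_bin/pg_ctl"
    else
        echo "error: set db.service.name or db.bin" >&2
        return 1
    fi
}

# Stand-in properties file: no service name, so db.bin must be present.
props=$(mktemp)
printf 'db.service.name=\ndb.bin=/usr/pgsql-9.3/bin\n' > "$props"
control=$(check_service_control "$props")
echo "$control"
```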
Use the db.recovery.conf.dir parameter to specify the location to which a recovery file is written on the Master node of the cluster and a trigger file is written on a Standby. This property is not required on a dedicated witness node.
Use the jdbc.ssl parameter to instruct Failover Manager to use SSL connections. If you have enabled SSL, use the jdbc.ssl.mode parameter to specify behaviors related to server certificates.
# Use the jdbc.ssl property to enable ssl for EFM connections.
# Setting this property to true will force the agents to use
# 'ssl=true' for all JDBC database connections (to both local
# and remote databases).
#
# When jdbc.ssl is true (and ssl is enabled), the jdbc.ssl.mode
# property will determine how server certificates are handled.
# Valid values are:
#
# verify-ca - EFM will perform CA verification before allowing
#             the certificate. This is the default value in case
#             ssl is used and the mode property is not set.
# require   - Verification will not be performed on the server
#             certificate.
jdbc.ssl=false
jdbc.ssl.mode=verify-ca
Use the user.email parameter to specify an email address (or multiple email addresses) that will receive any notifications sent by Failover Manager.
Use the script.notification parameter to specify the path to a user-supplied script that acts as a notification service; the script will be passed a message subject and a message body. The script will be invoked each time Failover Manager generates a user notification.
# Absolute path to script run for user notifications.
#
# This is an optional user-supplied script that can be used for
# notifications instead of email. This is required if not using
# email notifications. Either/both can be used. The script will
# be passed two parameters: the message subject and the message
# body.
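A notification script only needs to accept the two parameters described above. The following is a minimal sketch under stated assumptions: the script name efm_notify.sh, the EFM_NOTIFY_LOG variable, and the log file locations are all hypothetical, and the script simply appends each notification to a log file.

```shell
work_dir=$(mktemp -d)
notify_script="$work_dir/efm_notify.sh"    # hypothetical script name

# Write the example script. It receives the subject as $1 and the body as $2.
cat > "$notify_script" <<'EOF'
#!/bin/sh
subject="$1"
body="$2"
# EFM_NOTIFY_LOG is a hypothetical override used only by this demo.
log_file="${EFM_NOTIFY_LOG:-/tmp/efm_notifications.log}"
printf '%s\t%s\n' "$subject" "$body" >> "$log_file"
EOF
chmod +x "$notify_script"

# Simulate the call that would be made for each user notification.
EFM_NOTIFY_LOG="$work_dir/notifications.log" sh "$notify_script" "Example subject" "Example body"
cat "$work_dir/notifications.log"
```

To use a script like this, script.notification would be set to its absolute path.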
The bind.address parameter specifies the IP address and port number of the agent on the current node of the Failover Manager cluster.
Use the admin.port parameter to specify a port on which Failover Manager listens for administrative commands.
Set the is.witness parameter to true to indicate that the current node is a witness node. If is.witness is true, the local agent will not check to see if a local database is running.
The Postgres pg_is_in_recovery() function is a boolean function that reports the recovery state of a database. The function returns true if the database is in recovery, or false if the database is not in recovery. When an agent starts, it connects to the local database and invokes the pg_is_in_recovery() function. If the server responds true, the agent assumes the role of standby; if the server responds false, the agent assumes the role of master. If there is no local database, the agent will assume an IDLE state.
If is.witness is true, Failover Manager will not check the recovery state.
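The role selection described above can be sketched as a small function. This is an illustration of the decision only, not the agent's actual code: agent_role is a hypothetical name, and the "witness" label for the is.witness case is an assumption. The first argument is the text that a query such as psql -At -c "SELECT pg_is_in_recovery()" would print (t, f, or nothing if no local database answers).

```shell
# $1 = is.witness value (true/false)
# $2 = result of pg_is_in_recovery(): "t", "f", or empty if no database replied
agent_role() {
    if [ "$1" = "true" ]; then
        echo "witness"      # recovery state is not checked on a witness
        return 0
    fi
    case "$2" in
        t) echo "standby" ;;   # database is in recovery
        f) echo "master" ;;    # database is not in recovery
        *) echo "idle" ;;      # no local database
    esac
}

agent_role false t
agent_role false f
agent_role false ""
agent_role true ""
```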
The local.period parameter specifies the number of seconds between attempts to contact the database server.
The local.timeout parameter specifies how long an agent will wait for a positive response from the local database server.
The local.timeout.final parameter specifies how long an agent will wait after the final attempt to contact the database server on the current node. If a response is not received from the database within the number of seconds specified by the local.timeout.final parameter, the database is assumed to have failed.
Use the remote.timeout parameter to specify how many seconds an agent waits for a response from a remote database server (i.e., how long a standby agent waits to verify that the master database is actually down before performing failover).
Use the node.timeout parameter to specify the number of seconds that an agent will wait for a response from a node when determining if a node has failed. The node.timeout parameter value specifies a timeout value for agent-to-agent communication; other timeout parameters in the cluster properties file specify values for agent-to-database communication.
Use the pingServer parameter to specify the IP address of a server that Failover Manager can use to confirm that network connectivity is not a problem.
Use the pingServerCommand parameter to specify the command used to test network connectivity.
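The way the two ping parameters combine can be sketched as running the configured command with the configured address appended. The function name check_connectivity is hypothetical, the address is a documentation-range placeholder, and the demo substitutes the true and false commands for a real ping so that it runs without network access.

```shell
# $1 = command and options (the role played by pingServerCommand)
# $2 = address to test (the role played by pingServer)
check_connectivity() {
    ping_command="$1"
    ping_server="$2"
    # Word splitting of $ping_command is intentional: it holds a command plus options.
    if $ping_command "$ping_server" >/dev/null 2>&1; then
        echo "reachable"
    else
        echo "unreachable"
    fi
}

# Stub commands stand in for ping in this demo.
check_connectivity "true" "192.0.2.10"
check_connectivity "false" "192.0.2.10"

# On a real system the command might look like this (an assumption, not a documented value):
#   check_connectivity "ping -q -c3 -w5" "192.0.2.10"
```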
Use the auto.allow.hosts parameter to instruct the server to use the addresses specified in the .nodes file of the first node started to update the allowed host list. Enabling this parameter (setting auto.allow.hosts to true) can simplify cluster start-up.
The db.reuse.connection.count parameter allows the administrator to specify the number of times Failover Manager reuses the same database connection to check the database health. The default value is 0, indicating that Failover Manager will create a fresh connection each time. This property is not required on a dedicated witness node.
The auto.failover parameter enables automatic failover. By default, auto.failover is set to true.
# Whether or not failover will happen automatically when the master
# fails. Set to false if you want to receive the failover notifications
# but not have EFM actually perform the failover steps.
# The value of this property must be the same across all agents.
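Because auto.failover must match on every agent, it can help to compare copies of each node's properties file before starting the cluster. The sketch below is one possible check, not a Failover Manager feature: count_distinct_values is a hypothetical helper, the node files are temporary stand-ins, and a file that omits the property entirely is not detected.

```shell
# Count how many distinct "property=value" lines appear across the given files.
# $1 = property name; remaining arguments = properties files to compare.
count_distinct_values() {
    prop_pattern=$(printf '%s' "$1" | sed 's/\./\\./g')   # escape dots for grep
    shift
    grep -h "^${prop_pattern}=" "$@" | sort -u | wc -l | tr -d ' '
}

# Stand-in copies of the properties file from three nodes.
work_dir=$(mktemp -d)
printf 'auto.failover=true\n'  > "$work_dir/node1.properties"
printf 'auto.failover=true\n'  > "$work_dir/node2.properties"
printf 'auto.failover=false\n' > "$work_dir/node3.properties"

distinct=$(count_distinct_values auto.failover "$work_dir"/node*.properties)
if [ "$distinct" -eq 1 ]; then
    echo "auto.failover is consistent"
else
    echo "auto.failover differs between agents"
fi
```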
Use the auto.reconfigure parameter to instruct Failover Manager to enable or disable automatic reconfiguration of remaining Standby servers after the primary standby is promoted to Master. Set the parameter to true to enable automatic reconfiguration (the default) or false to disable automatic reconfiguration. This property is not required on a dedicated witness node.
Please note: primary_conninfo is a space-delimited list of keyword=value pairs.
Please note: If you are using replication slots to manage your WAL segments, automatic reconfiguration is not supported; you should set auto.reconfigure to false. For more information, see Section 2.2.
Set the promotable parameter to false to indicate that a node should not be promoted. To override the setting, use the efm set-priority command at runtime; for more information about the efm set-priority command, see Section 5.3.
Use the minimum.standbys parameter to specify the minimum number of standby nodes that will be retained on a cluster; if the standby count drops to the specified minimum, a replica node will not be promoted in the event of a failure of the master node.
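The minimum.standbys rule reduces to a single comparison: promotion is possible only while the standby count is above the configured minimum. A minimal sketch, with can_promote as a hypothetical function name and a minimum of 1 chosen purely for illustration:

```shell
# $1 = current number of standby nodes, $2 = minimum.standbys value.
# Succeeds (exit 0) only if the count is still above the minimum.
can_promote() {
    [ "$1" -gt "$2" ]
}

minimum_standbys=1    # illustrative value
for standby_count in 3 2 1; do
    if can_promote "$standby_count" "$minimum_standbys"; then
        echo "$standby_count standbys: promotion allowed"
    else
        echo "$standby_count standbys: promotion blocked"
    fi
done
```

With a minimum of 1, the last remaining standby is retained rather than promoted.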
Use the recovery.check.period parameter to specify the number of seconds that Failover Manager will wait between checks to see if a database is out of recovery.
Use the auto.resume.period parameter to specify the number of seconds (after a monitored database fails, and an agent has assumed an IDLE state) that an agent will attempt to resume monitoring that database.
Use the virtualIp parameter to specify virtual IP address information for the Failover Manager cluster.
Use the virtualIp.interface parameter to specify an alias for your network adaptor (for example, eth0:1 specifies an alias for the adaptor, eth0). You might create multiple aliases for each adaptor on a given host; for more information about running multiple agents on a single node, please see Section 4.3.
Use the virtualIp.netmask parameter to specify which bits in the virtual IP address refer to the network address (as opposed to the host address).
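To see how a netmask separates the network bits from the host bits, each octet of the address can be ANDed with the matching octet of the mask. The sketch below is a worked example only; network_address is a hypothetical helper and the addresses are arbitrary sample values.

```shell
# Print the network address for a dotted-quad IP ($1) and netmask ($2).
network_address() {
    IFS=. read -r ip1 ip2 ip3 ip4 <<EOF
$1
EOF
    IFS=. read -r m1 m2 m3 m4 <<EOF
$2
EOF
    # Bits set in the mask are kept (network part); the rest are cleared (host part).
    echo "$(( $ip1 & $m1 )).$(( $ip2 & $m2 )).$(( $ip3 & $m3 )).$(( $ip4 & $m4 ))"
}

network_address 172.24.38.239 255.255.255.0    # prints 172.24.38.0
network_address 10.20.30.40 255.255.240.0      # prints 10.20.16.0
```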
Use the script.fence parameter to specify the path to an optional user-supplied script that will be invoked during the promotion of a standby node to master node.
# Absolute path to fencing script run during promotion
#
# This is an optional user-supplied script that will be run
# during failover on the standby database node. If left blank,
# no action will be taken. If specified, EFM will execute this
# script before promoting the standby. The script is run as the
# efm user.
#
# Parameters can be passed into this script for the failed master
# and new primary node addresses. Use %p for new primary and %f
# for failed master. On a node that has just been promoted, %p
# should be the same as the node's efm binding address.
#
# Example:
# script.fence=/somepath/myscript %p %f
#
# NOTE: FAILOVER WILL NOT OCCUR IF THIS SCRIPT RETURNS A NON-ZERO EXIT CODE.
Please note that the fencing script runs as the efm user; you must ensure that the efm user has sufficient privileges to invoke any commands included in the fencing script. For more information about Failover Manager permissions, please see Section 3.1.
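A fencing script might be structured as in the sketch below, assuming it is configured with the argument order shown in the example above (%p first, then %f). The script name fence.sh, the addresses, and the echo placeholder for the site-specific fencing action are all hypothetical; the point of the sketch is the exit code, since a non-zero value prevents failover.

```shell
work_dir=$(mktemp -d)
fence_script="$work_dir/fence.sh"    # hypothetical script name

cat > "$fence_script" <<'EOF'
#!/bin/sh
# Invoked as: fence.sh <new-primary-address> <failed-master-address>
new_primary="$1"
failed_master="$2"
if [ -z "$new_primary" ] || [ -z "$failed_master" ]; then
    echo "usage: $0 <new-primary> <failed-master>" >&2
    exit 1    # non-zero exit: failover would not occur
fi
# Site-specific fencing commands would go here.
echo "fencing $failed_master; new primary is $new_primary"
exit 0
EOF
chmod +x "$fence_script"

# Simulate a successful call, then a call with missing arguments.
fence_output=$(sh "$fence_script" 192.0.2.12 192.0.2.11)
echo "$fence_output"
fence_status=0
sh "$fence_script" 2>/dev/null || fence_status=$?
echo "exit code without arguments: $fence_status"
```

Any command placed in the script must be one that the efm user is permitted to run.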
Use the script.post.promotion parameter to specify the path to an optional user-supplied script that will be invoked after a standby node has been promoted to master.
# Absolute path to script run after promotion
#
# This is an optional user-supplied script that will be run after
# failover on the standby node after it has been promoted and
# is no longer in recovery. The exit code from this script has
# no effect on failover manager, but will be included in a
# notification sent after the script executes. The script is run
# as the efm user.
#
# Parameters can be passed into this script for the failed master
# and new primary node addresses. Use %p for new primary and %f
# for failed master. On a node that has just been promoted, %p
# should be the same as the node's efm binding address.
#
# Example:
# script.post.promotion=/somepath/myscript %f %p
Use the script.resumed parameter to specify an optional path to a user-supplied script that will be invoked when an agent resumes monitoring of a database.
Use the script.db.failure parameter to specify the complete path to an optional user-supplied script that Failover Manager will invoke if an agent detects that the database that it monitors has failed.
Use the script.master.isolated parameter to specify the complete path to an optional user-supplied script that Failover Manager will invoke if the agent monitoring the master database detects that the master is isolated from the majority of the Failover Manager cluster. This script is called immediately after the VIP is released (if a VIP is in use).
Use the sudo.command parameter to specify a command that will be invoked by Failover Manager when performing tasks that require extended permissions. Use this option to include command options that might be specific to your system authentication.
Use the jgroups.loglevel and efm.loglevel parameters to specify the level of detail logged by Failover Manager. The default value is INFO. For more information about logging, see Section 6, Controlling Logging.
Use the jvm.options parameter to pass JVM-related configuration information. The default setting specifies the amount of memory that the Failover Manager agent will be allowed to use.
