3.2 Configuring Failover Manager
The efm.properties file contains the properties of the individual node on which it resides, while the efm.nodes file contains a list of the current Failover Manager cluster members. By default, the installer places the files in the /etc/efm-2.0 directory.

The Failover Manager installer creates a file template for the cluster properties file named efm.properties.in in the /etc/efm-2.0 directory. After completing the Failover Manager installation, you must make a working copy of the template before modifying the file contents.

The following command copies the efm.properties.in file, creating a properties file named efm.properties:

# cp /etc/efm-2.0/efm.properties.in /etc/efm-2.0/efm.properties

Please note: By default, Failover Manager expects the cluster properties file to be named efm.properties. If you name the properties file something other than efm.properties, you must modify the service script to instruct Failover Manager to use a different name.

After creating the cluster properties file, add (or modify) configuration parameter values as required. For detailed information about each parameter, see Specifying Cluster Properties.

The property files are owned by root. The Failover Manager service script expects to find the files in the /etc/efm-2.0 directory. If you move the property file to another location, you must create a symbolic link that specifies the new location.

Note that you must use the efm encrypt command to encrypt the value supplied in the db.password.encrypted parameter. For more information about encrypting a password, see Encrypting Your Database Password.

Specifying Cluster Properties

You can use the parameters listed in the cluster properties file to specify connection properties and behaviors for your Failover Manager cluster. Modifications to configuration parameter settings will be applied when Failover Manager starts.
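Because the file is read when Failover Manager starts, it can help to review the key/value pairs before starting an agent. The following is a minimal Python sketch, not part of Failover Manager, that assumes the file uses plain key=value lines with # introducing a comment. The function names and the sample values are hypothetical. The sketch also flags values wrapped in quotation marks, which Failover Manager does not accept.

```python
def parse_properties(text):
    """Return a dict of key -> value from simple key=value properties text."""
    props = {}
    for raw in text.splitlines():
        line = raw.strip()
        # Skip blank lines and comment lines.
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition("=")
        if not sep:
            continue  # not a key=value line
        props[key.strip()] = value.strip()
    return props


def quoted_values(props):
    """Return the keys whose values are wrapped in single or double quotes."""
    return sorted(
        key for key, value in props.items()
        if len(value) >= 2 and value[0] == value[-1] and value[0] in "'\""
    )


# Hypothetical file contents; the address and port are made-up sample values.
sample = """
# comment lines are ignored
auto.failover=true
bind.address='192.0.2.10:7800'
admin.port=7809
"""

props = parse_properties(sample)
print(quoted_values(props))
```

In this sample only bind.address is reported, because its value is enclosed in single quotes.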
If you modify a parameter value (with the exception of the efm.license parameter), you must restart Failover Manager to apply the changes.

Property values are case-sensitive. Note that while Postgres uses quoted strings in parameter values, Failover Manager does not allow quoted strings in the parameter values. For example, an IP address that you might enclose in quotes in a PostgreSQL configuration parameter must be supplied to Failover Manager without the quotes.

Use the parameters that follow to specify connection, administrative, and operational details for Failover Manager.

Use the efm.license parameter to provide the Failover Manager product key. The trial period is 60 days. When there are five (or fewer) days left in the trial period, Failover Manager will send an email warning you that it is time to provide a valid license number. If you have not provided a product key before the trial period expires, all Failover Manager agents will exit.

You do not need to restart the agents after adding the product key to the properties file. Every six hours the Failover Manager agent will attempt to locate and validate the product key.

The auto.failover parameter enables automatic failover. By default, auto.failover is set to true.

# Whether or not failover will happen automatically when the master
# fails. Set to false if you want to receive the failover notifications
# but not have EFM actually perform the failover steps.
# The value of this property must be the same across all agents.

Use the auto.reconfigure parameter to instruct Failover Manager to enable or disable automatic reconfiguration of remaining Standby servers after the primary standby is promoted to Master. Set the parameter to true to enable automatic reconfiguration (the default) or false to disable automatic reconfiguration. This property is not required on a dedicated witness node.

# After a standby is promoted, failover manager will attempt to
# update the remaining standbys to use the new master. Failover
# manager will back up recovery.conf, change the host
# parameter of the primary_conninfo entry, and restart the
# database. The restart command is contained in the efm_functions
# file; default is:
# "pg_ctl restart -m fast -w -t <timeout> -D <directory>"
# where the timeout is the local.timeout property value and the
# directory is specified by db.recovery.conf.dir. To turn off
# automatic reconfiguration, set this property to false.

Please note: primary_conninfo is a space-delimited list of keyword=value pairs.

Please note: If you are using replication slots to manage your WAL segments, automatic reconfiguration is not supported; you should set auto.reconfigure to false. For more information, see Section 2.2.

Use the following parameters to specify connection properties for each node of the Failover Manager cluster:

# The value for the password property should be the output from
# 'efm encrypt' -- do not include clear text password here. To
# prevent accidental sharing of passwords among clusters, the
# cluster name is incorporated into the encrypted password. If
# you change the cluster name (the name of this file), you must
# encrypt the password again with the new name.
# The db.port property must be the same for all nodes.

For information about encrypting the password for the database user, see Encrypting Your Database Password.

The db.reuse.connection.count parameter allows the administrator to specify the number of times Failover Manager reuses the same database connection to check the database health. The default value is 0, indicating that Failover Manager will create a fresh connection each time. This property is not required on a dedicated witness node.

# This property controls how many times a database connection is
# reused before creating a new one. If set to zero, a new
# connection will be created every time an agent pings its local
# database.

Use the admin.port parameter to specify the port on which Failover Manager listens for administrative commands.

# This property controls the port binding of the administration
# server which is used for some commands (ie cluster-status).

The local.period parameter specifies the number of seconds between attempts to contact the database server.
The local.timeout parameter specifies how long an agent will wait for a response from the local database server.
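These timing settings combine into a simple timeline. The following is a rough Python sketch of the arithmetic, assuming values of 10, 60, and 10 seconds for local.period, local.timeout, and local.timeout.final (the time allowed for one final attempt, described next). The function name is illustrative, and the result ignores scheduling and network overhead.

```python
def failure_window(local_period, local_timeout, local_timeout_final):
    """Summarize the monitoring timeline implied by the three settings (seconds)."""
    return {
        # Background checks started during one local.timeout window.
        "checks_per_window": local_timeout // local_period,
        # Longest wait after the last successful check before the database
        # is declared failed: the timeout window plus the final attempt.
        "max_seconds_to_declare_failure": local_timeout + local_timeout_final,
    }


# Assumed values: local.period=10, local.timeout=60, local.timeout.final=10.
print(failure_window(10, 60, 10))
```

Under these assumed values, six checks begin within each timeout window, and a failed database is declared down roughly 70 seconds after its last successful check.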
The local.timeout.final parameter specifies how long an agent will wait after the final attempt to contact the database server on the current node. If a response is not received from the database within the number of seconds specified by the local.timeout.final parameter, the database is assumed to have failed.

For example, given the default values of these parameters, a check of the local database happens once every 10 seconds. If an attempt to contact the local database does not come back positive within 60 seconds, Failover Manager makes a final attempt to contact the database. If a response is not received within 10 seconds, Failover Manager declares database failure and notifies the administrator listed in the user.email parameter. These properties are not required on a dedicated witness node.

# These properties apply to the connection(s) EFM uses to monitor
# the local database. Every 'local.period' seconds, a database
# check is made in a background thread. If the main monitoring
# thread does not see that any checks were successful in
# 'local.timeout' seconds, then the main thread makes a final
# check with a timeout value specified by the
# 'local.timeout.final' value. All values are in seconds.
# Whether EFM uses single or multiple connections for database
# checks is controlled by the 'db.reuse.connection.count'
# property.
local.period=10
local.timeout=60
local.timeout.final=10

Use the remote.timeout parameter to specify how many seconds an agent waits for a response from a remote database server (i.e., how long a standby agent waits to verify that the master database is actually down before performing failover).

# Timeout for a call to check if a remote database is responsive.
# For example, this is how long a standby would wait for a
# DB ping request from itself and the witness to the master DB
# before performing failover.

The jgroups.max.tries parameter specifies the number of consecutive times Failover Manager attempts to contact a node before the node is assumed to be down. jgroups.timeout specifies the number of milliseconds before the connection attempts time out.

# These properties apply to the jgroups connection between the
# nodes. Description copied from jgroups:
# Max tries: Number of times to send an are-you-alive message.
# Timeout (in ms): Timeout to suspect a node P if neither a
# heartbeat nor data were received from P.
# The value of these properties must be the same across all
# agents.

Use the user.email parameter to specify the email address of a system administrator.

# Email address of the user for notifications. The value of this
# property must be the same across all agents.

The bind.address parameter specifies the IP address and port number of the agent on the current node of the Failover Manager cluster.

# This property specifies the ip address and port that jgroups
# will bind to on this node. The value is of the form
# <ip>:<port>.
# Note that the port specified here is used for communicating
# with other nodes, and is not the same as the admin.port above,
# used only to communicate with the local agent to send control
# signals.

Set the is.witness parameter to true to indicate that the current node is a witness node. If is.witness is true, the local agent will not check to see if a local database is running.

# Specifies whether or not this is a witness node. Witness nodes
# do not have local databases running.

The Postgres pg_is_in_recovery() function is a boolean function that reports the recovery state of a database. The function returns true if the database is in recovery, or false if the database is not in recovery. When an agent starts, it connects to the local database and invokes the pg_is_in_recovery() function. If the server responds true, the agent assumes the role of standby; if the server responds false, the agent assumes the role of master. If is.witness is true, Failover Manager will not check the recovery state.

Use the db.service.owner parameter to specify the name of the operating system user that owns the cluster that is being managed by Failover Manager. This property is not required on a dedicated witness node.

# This property tells EFM which OS user owns the $PGDATA dir for
# the 'db.database'. By default, the owner is either "postgres"
# for PostgreSQL or "enterprisedb" for Postgres Plus Advanced
# Server. However, if you have configured your db to run as a
# different user, you will need to copy the /etc/sudoers.d/efm-XX
# conf file to grant the necessary permissions to your db owner.
# This username must have write permission to the
# 'db.recovery.conf.dir' specified below.

Use the db.recovery.conf.dir parameter to specify the location to which a recovery file will be written on the Master node of the cluster. This property is not required on a dedicated witness node.

# Specify the location of the db recovery.conf file on the node.
# On a standby node, the trigger file location is read from the
# file in this directory. After a failover, the recovery.conf
# files on remaining standbys are changed to point to the new
# master db (a copy of the original is made first). On a master
# node, a recovery.conf file will be written during failover and
# promotion to ensure that the master node can not be restarted
# as the master database.

Use the db.bin parameter to specify the location of the pg_ctl command for the local database server. This property is not required on a dedicated witness node.

# Specify the directory containing the pg_ctl command, for
# instance: /usr/pgsql-9.3/bin. The pg_ctl command is used to
# restart standby databases after a failover so that they are
# streaming from the new master node.

The virtualIp parameter specifies virtual IP address information for the Failover Manager cluster. Use the virtualIp.interface parameter to specify an alias for your network adaptor (for example, eth0:1 specifies an alias for the adaptor, eth0). You might create multiple aliases for each adaptor on a given host; for more information about running multiple agents on a single node, please see Section 4.9. The virtualIp.netmask parameter specifies which bits in the virtual IP address refer to the network address (as opposed to the host address).

# This is the IP and netmask that will be remapped during fail
# over. If you do not use VIPs as part of your failover
# solution, then leave these properties blank to disable EFM's
# support for VIP processing (assigning, releasing, testing
# reachability, etc).
# If you enable VIP, then all three properties are required.
# The address and netmask must be the same across all agents.
# The 'interface' value must contain the secondary virtual ip
# id (ie ":1", etc).

Use the pingServer parameter to specify the IP address of a server that Failover Manager can use to confirm that network connectivity is not a problem.

# This is the address of a well-known server that EFM can ping
# in an effort to determine network reachability issues. It
# might be the IP address of a nameserver within your corporate
# firewall or another server that *should* always be reachable
# via a 'ping' command from each of the EFM nodes.
# There are many reasons why this node might not be considered
# reachable: firewalls might be blocking the request, ICMP might
# be filtered out, etc.
# Do not use the IP address of any node in the EFM cluster
# (master, standby, or witness) because this ping server is meant
# to provide an additional layer of information should the EFM
# nodes lose sight of each other.
# The installation default is Google's DNS server.

Use the pingServerCommand parameter to specify the command used to test network connectivity.

# This command will be used to test the reachability of certain
# nodes.
# Do not include an IP address or hostname on the end of this
# command - it will be added dynamically at runtime with the
# values contained in 'virtualIp' and 'pingServer'.
# Make sure this command returns reasonably quickly - test it
# from a shell command line first to make sure it works properly.

script.fence specifies an optional path to a user-supplied script that will be invoked during the promotion of a standby node to master node.

# absolute path to fencing script run during promotion
# This is an optional user-supplied script that will be run
# during failover on the standby database node. If left blank,
# no action will be taken. If specified, EFM will execute this
# script before promoting the standby. The script is run as the
# efm user.
# NOTE: FAILOVER WILL NOT OCCUR IF THIS SCRIPT RETURNS A NON-ZERO EXIT CODE.

Please note that the fencing script runs as the efm user; you must ensure that the efm user has sufficient privileges to invoke any commands included in the fencing script. For more information about Failover Manager permissions, please see .

Use the script.post.promotion parameter to specify an optional path to a user-supplied script that will be invoked after a standby node has been promoted to master.

# Absolute path to fencing script run after promotion
# This is an optional user-supplied script that will be run after
# failover on the standby node after it has been promoted and
# is no longer in recovery. The exit code from this script has
# no effect on failover manager, but will be included in a
# notification sent after the script executes. The script is run
# as the efm user.

Use the jgroups.loglevel and efm.loglevel parameters to specify the level of detail logged by Failover Manager. The default value is INFO. For more information about logging, see Section 6, Controlling Logging.

# Logging levels for JGroups and EFM.
# Valid values are: FINEST, FINER, FINE, CONFIG, INFO, WARNING,
# SEVERE
# Default value: INFO
# It is not necessary to increase these values unless debugging a
# specific issue. If nodes are not discovering each other at
# startup, increasing the jgroups level to FINER will show
# information about the TCP connection attempts that may help
# diagnose the connection failures.

Encrypting Your Database Password

Failover Manager requires you to encrypt your database password before including it in the cluster properties file. Use the efm utility (located in the /usr/efm-2.0/bin directory) to encrypt the password; open a command line, and enter the command:

# efm encrypt cluster_name

Where cluster_name specifies the name of the Failover Manager cluster.

The Failover Manager service will prompt you to enter the database password twice before generating an encrypted password for you to place in your cluster property file. When the utility displays the encrypted password, copy and paste it into the cluster property file.

The following example demonstrates using the encrypt utility to encrypt a password for the acctg cluster:

# efm encrypt acctg
This utility will generate an encrypted password for you to place in your EFM cluster property file.
Please enter the password and hit enter:
Please enter the password again to confirm:
The encrypted password is: 835fb18954f198e94fd3d6f4b070350b
Please paste this into your cluster properties file.
db.password.encrypted=835fb18954f198e94fd3d6f4b070350b

If you receive this message when starting the Failover Manager service on RHEL 6.x or CentOS 6.x, please see the startup log (located in /var/log/efm-2.0/startup-efm.log) for more information.

If you are using RHEL 7.x or CentOS 7.x, startup information is available via the following command: