The Cluster Properties File v3
Each node in a Failover Manager cluster has a properties file (by default, named
efm.properties) that contains the properties of the individual node on which it resides. The Failover Manager installer creates a file template for the properties file named
efm.properties.in in the
After completing the Failover Manager installation, you must make a working copy of the template before modifying the file contents:
After copying the template file, change the owner of the file to
By default, Failover Manager expects the cluster properties file to be named
efm.properties. If you name the properties file something other than
efm.properties, you must modify the service script or unit file to instruct Failover Manager to use a different name.
After creating the cluster properties file, add (or modify) configuration parameter values as required. For detailed information about each property, see Specifying Cluster Properties.
The property files are owned by
root. The Failover Manager service script expects to find the files in the
/etc/edb/efm-3.10 directory. If you move the property file to another location, you must create a symbolic link that specifies the new location.
All user scripts referenced in the properties file will be invoked as the Failover Manager user.
You can use the properties listed in the cluster properties file to specify connection properties and behaviors for your Failover Manager cluster. Modifications to property settings are applied when Failover Manager starts; if you modify a property value, you must restart Failover Manager to apply the change.
Property values are case-sensitive. Note that while Postgres uses quoted strings in parameter values, Failover Manager does not allow quoted strings in property values. For example, while you might specify an IP address in a Postgres configuration parameter as:
Failover Manager requires that the value not be enclosed in quotes:
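As a minimal sketch of how a quoted value could be caught before an agent is started, the following Python function scans properties text and reports any property whose value is wrapped in quotes. The function name and the sample values are hypothetical, and the parsing is deliberately simple (one name=value pair per line, with # comment lines); it is not part of Failover Manager.

```python
def find_quoted_values(properties_text):
    """Return the names of properties whose values are wrapped in quotes."""
    quoted = []
    for raw_line in properties_text.splitlines():
        line = raw_line.strip()
        # Skip blank lines and comment lines.
        if not line or line.startswith("#"):
            continue
        name, sep, value = line.partition("=")
        if not sep:
            continue
        value = value.strip()
        # A value counts as quoted when it starts and ends with the same quote character.
        if len(value) >= 2 and value[0] == value[-1] and value[0] in "'\"":
            quoted.append(name.strip())
    return quoted


# Hypothetical sample content, for illustration only.
sample = """
db.user=efm_user
db.bin='/usr/edb/as12/bin'
admin.port=7809
"""
print(find_quoted_values(sample))  # prints ['db.bin']
```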
Use the properties in the
efm.properties file to specify connection, administrative, and operational details for Failover Manager.
Legend: In the following table:
A: Required on Primary or Standby node
W: Required on Witness node
|Property Name||A||W||Default Value||Comments|
|db.user||Y||Y|| ||Username for the database.|
|db.password.encrypted||Y||Y|| ||Password encrypted using 'efm encrypt'.|
|db.port||Y||Y|| ||This value must be the same for all the agents.|
|db.service.owner||Y|| || ||Owner of $PGDATA dir for db.database.|
|db.service.name|| || || ||Required if running the database as a service.|
|db.bin||Y|| || ||Directory containing the pg_controldata/pg_ctl commands such as '/usr/edb/as12/bin'.|
|db.data.dir||Y|| || ||Same as the output of query 'show data_directory;'. Name changed from db.recovery.dir in EFM 3.9.|
|db.config.dir|| || || ||Same as the output of query 'show config_file;'. Should be specified if it is not the same as db.data.dir. Text changed in EFM Version 3.8 and 3.9.|
|jdbc.sslmode||Y||Y||disable||See the note.|
|user.email|| || || ||This value must be the same for all the agents; can be left blank if using a notification script.|
|from.email|| || ||efm@localhost||Leave blank to use the default efm@localhost.|
|notification.level||Y||Y||INFO||See the list of notifications.|
|script.notification|| || || ||Required if user.email property is not used; both parameters can be used together.|
|external.address|| || || ||Example: <ip_address/hostname>. Available in EFM 3.10 and later.|
|admin.port||Y||Y||7809||Modify if the default port is already in use.|
|node.timeout||Y||Y||50||This value must be the same for all the agents.|
|update.physical.slots.period||Y|| ||0||Available in EFM 3.10 and later.|
|ping.server.ip||Y||Y||188.8.131.52||Name changed from pingServerIp in EFM 3.9.|
|ping.server.command||Y||Y||/bin/ping -q -c3 -w5||Name changed from pingServerCommand in EFM 3.9.|
|auto.reconfigure||Y|| ||true||This value must be the same for all the agents.|
|use.replay.tiebreaker||Y||Y||true||This value must be the same for all the agents. Available in EFM 3.9 and later.|
|application.name|| || || ||Set to replace the application_name portion of the primary_conninfo entry with this property value before starting the original primary database as a standby.|
|restore.command|| || || ||Example: restore.command=scp <db_service_owner>@%h:<archive_path>/%f %p|
|reconfigure.num.sync||Y|| ||false||Available in EFM 3.9 and later.|
|reconfigure.sync.primary||Y|| ||false||Text changed in EFM 3.9.|
|minimum.standbys||Y||Y||0||This value must be the same for all the nodes.|
|virtual.ip|| || ||(see virtual.ip.single)||Leave blank if you do not specify a VIP. Name changed from virtualIp in EFM 3.9.|
|virtual.ip.interface|| || || ||Required if you specify a VIP. Name changed from virtualIp.interface in EFM 3.9.|
|virtual.ip.prefix|| || || ||Required if you specify a VIP. Name changed from virtualIp.prefix in EFM 3.9.|
|virtual.ip.single||Y||Y||Yes||This value must be the same for all the nodes. Name changed from virtualIp.single in EFM 3.9.|
|script.load.balancer.attach|| || || ||Example: script.load.balancer.attach= /<path>/<attach_script> %h %t|
|script.load.balancer.detach|| || || ||Example: script.load.balancer.detach= /<path>/<detach_script> %h %t|
|script.fence|| || || ||Example: script.fence= /<path>/<script_name> %p %f|
|script.post.promotion|| || || ||Example: script.post.promotion= /<path>/<script_name> %f %p|
|script.resumed|| || || ||Example: script.resumed= /<path>/<script_name>|
|script.db.failure|| || || ||Example: script.db.failure= /<path>/<script_name>|
|script.primary.isolated|| || || ||Example: script.primary.isolated= /<path>/<script_name>|
|script.remote.pre.promotion|| || || ||Example: script.remote.pre.promotion= /<path>/<script_name> %p|
|script.remote.post.promotion|| || || ||Example: script.remote.post.promotion= /<path>/<script_name> %p|
|script.custom.monitor|| || || ||Example: script.custom.monitor= /<path>/<script_name>|
|custom.monitor.interval|| || || ||Required if a custom monitoring script is specified.|
|custom.monitor.timeout|| || || ||Required if a custom monitoring script is specified.|
|custom.monitor.safe.mode|| || || ||Required if a custom monitoring script is specified.|
|sudo.user.command||Y||Y||sudo -u %u|| |
|lock.dir|| || || ||If not specified, defaults to '/var/lock/efm-<version>'.|
|log.dir|| || || ||If not specified, defaults to '/var/log/efm-<version>'.|
Use the following properties to specify connection details for the Failover Manager cluster:
The db.user specified must have sufficient privileges to invoke selected PostgreSQL commands on behalf of Failover Manager. For more information, see Prerequisites.
For information about encrypting the password for the database user, see Encrypting Your Database Password.
Use the db.service.owner property to specify the name of the operating system user that owns the cluster that is being managed by Failover Manager. This property is not required on a dedicated witness node.
Specify the name of the database service in the
db.service.name property if you use the service or systemctl command when starting or stopping the service.
You should use the same service control mechanism (pg_ctl, service, or systemctl) each time you start or stop the database service. If you use the
pg_ctl program to control the service, specify the location of the
pg_ctl program in the
Use the db.data.dir property to specify the location to which a recovery file will be written on the Primary node of the cluster during promotion. This property is required on primary and standby nodes; it is not required on a dedicated witness node.
Use the db.config.dir property to specify the location of database configuration files if they are not stored in the same directory as the
standby.signal file. This should be the value specified by the
config_file parameter directory of your Advanced Server or PostgreSQL installation. This value will be used as the location of the Postgres
data directory when stopping, starting, or restarting the database.
For more information about database configuration files, visit the PostgreSQL website.
Use the jdbc.sslmode property to instruct Failover Manager to use SSL connections; by default, SSL is disabled.
If you set the value of jdbc.sslmode to
verify-ca and you want to use the Java trust store for certificate validation, you need to set the following value:
For information about configuring and using SSL, please see:
Use the user.email property to specify an email address (or multiple email addresses) that will receive any notifications sent by Failover Manager.
The from.email property specifies the value that will be used as the sender's address on any email notifications from Failover Manager. You can:
- leave from.email blank to use the default value (efm@localhost).
- specify a custom value for the email address.
- specify a custom email address, using the %h placeholder to represent the name of the node host (e.g., example@%h). The placeholder will be replaced with the name of the host as returned by the Linux hostname utility.
For more information about notifications, see Notifications.
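The from.email options above can be sketched in a few lines of Python. This is an illustration only, not Failover Manager code: the function name is hypothetical, and socket.gethostname() stands in for the Linux hostname utility.

```python
import socket


def resolve_from_email(from_email, hostname=None):
    """Return the sender address that a given from.email value would produce."""
    # A blank value falls back to the documented default.
    if not from_email.strip():
        return "efm@localhost"
    # Assumption: the local host name is a reasonable stand-in for `hostname` output.
    if hostname is None:
        hostname = socket.gethostname()
    # Replace every %h placeholder with the host name.
    return from_email.replace("%h", hostname)


print(resolve_from_email("example@%h", hostname="node1"))  # prints example@node1
print(resolve_from_email(""))                              # prints efm@localhost
```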
Use the notification.level property to specify the minimum severity level at which Failover Manager will send user notifications or when a notification script is called. For a complete list of notifications, see Notifications.
Use the script.notification property to specify the path to a user-supplied script that acts as a notification service; the script will be passed a message subject and a message body. The script will be invoked each time Failover Manager generates a user notification.
The bind.address property specifies the IP address and port number of the agent on the current node of the Failover Manager cluster.
Use the external.address property to specify the IP address or hostname that should be used for communication with all other Failover Manager agents in a NAT environment.
Use the admin.port property to specify a port on which Failover Manager listens for administrative commands.
Set the is.witness property to true to indicate that the current node is a witness node. If is.witness is true, the local agent will not check to see if a local database is running.
The pg_is_in_recovery() function is a boolean function that reports the recovery state of a database. The function returns
true if the database is in recovery, or false if the database is not in recovery. When an agent starts, it connects to the local database and invokes the
pg_is_in_recovery() function. If the server responds true, the agent assumes the role of standby; if the server responds false, the agent assumes the role of primary. If there is no local database, the agent will assume an idle state.
If is.witness is true, Failover Manager will not check the recovery state.
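The role selection described above can be summarized as a small decision function. The sketch below is illustrative only; the function name and the use of None to mean "no local database" are assumptions made for the example, not Failover Manager internals.

```python
def initial_agent_role(is_witness, in_recovery):
    """Return the role an agent would assume at startup.

    in_recovery is the result of pg_is_in_recovery() on the local database,
    or None when there is no local database to query.
    """
    # A witness never checks the recovery state of a local database.
    if is_witness:
        return "witness"
    # No local database: the agent assumes an idle state.
    if in_recovery is None:
        return "idle"
    # In recovery means standby; otherwise the database is a primary.
    return "standby" if in_recovery else "primary"


print(initial_agent_role(False, True))   # prints standby
print(initial_agent_role(False, False))  # prints primary
print(initial_agent_role(False, None))   # prints idle
```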
The following properties apply to the local server:
- The local.period property specifies how many seconds elapse between attempts to contact the database server.
- The local.timeout property specifies how long an agent will wait for a positive response from the local database server.
- The local.timeout.final property specifies how long an agent will wait after the previous checks have failed to contact the database server on the current node. If a response is not received from the database within the number of seconds specified by the local.timeout.final property, the database is assumed to have failed.
For example, given the default values of these properties, a check of the local database happens once every 10 seconds. If an attempt to contact the local database does not come back positive within 60 seconds, Failover Manager makes a final attempt to contact the database. If a response is not received within 10 seconds, Failover Manager declares database failure and notifies the administrator listed in the user.email property. These properties are not required on a dedicated witness node.
If necessary, you should modify these values to suit your business model.
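When tuning these values, it can help to estimate how long a database failure might go undeclared. The following sketch adds the three intervals together, assuming (this is an assumption, not documented behavior) that the waits run back to back: up to one local.period before the next check, then local.timeout, then local.timeout.final. Treat the result as a rough upper bound rather than a guarantee.

```python
def worst_case_detection_seconds(local_period, local_timeout, local_timeout_final):
    """Rough upper bound, in seconds, between a failure and its declaration."""
    # Assumption: wait for the next check, then the normal timeout, then the final timeout.
    return local_period + local_timeout + local_timeout_final


# Default values taken from the example above: 10, 60, and 10 seconds.
print(worst_case_detection_seconds(10, 60, 10))  # prints 80
```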
Use the remote.timeout property to specify how many seconds an agent waits for a response from a remote database server (i.e., how long a standby agent waits to verify that the primary database is actually down before performing failover). The
remote.timeout property value specifies a timeout value for agent-to-agent communication; other timeout properties in the cluster properties file specify values for agent-to-database communication.
Use the node.timeout property to specify the number of seconds that an agent will wait for a response from a node when determining if a node has failed.
Use the stop.isolated.primary property to instruct Failover Manager to shut down the database if a primary agent detects that it is isolated. When true (the default), Failover Manager will stop the database before invoking the script specified in the
Use the stop.failed.primary property to instruct Failover Manager to attempt to shut down a primary database if it cannot reach the database. If
true, Failover Manager will run the script specified in the
script.db.failure property after attempting to shut down the database.
Use the primary.shutdown.as.failure parameter to indicate that any shutdown of the Failover Manager agent on the primary node should be treated as a failure. If this parameter is set to
true and the primary agent stops (for any reason), the cluster will attempt to confirm if the database on the primary node is running:
- If the database is reached, a notification will be sent informing you of the agent status.
- If the database is not reached, a failover will occur.
The primary.shutdown.as.failure property is meant to catch user error, rather than failures, such as the accidental shutdown of a primary node. The proper shutdown of a node can appear to the rest of the cluster like a user has stopped the primary Failover Manager agent (for example, to perform maintenance on the primary database). If you set the
primary.shutdown.as.failure property to
true, care must be taken when performing maintenance.
To perform maintenance on the primary database when primary.shutdown.as.failure is
true, you should stop the primary agent and wait to receive a notification that the primary agent has failed but the database is still running. Then it is safe to stop the primary database. Alternatively, you can use the
efm stop-cluster command to stop all of the agents without failure checks being performed.
Use the update.physical.slots.period property to define the slot advance frequency for database version 12 and above. When
update.physical.slots.period is set to a non-zero value, the primary agent will read the current
restart_lsn of the physical replication slots after every
update.physical.slots.period seconds, and send this information with its
primary_slot_name (if it is set in the postgresql.conf file) to the standbys. If physical slots do not already exist, setting this parameter to a non-zero value will create the slots and then update the
restart_lsn parameter for these slots. A non-promotable standby will not create new slots but will update them if they exist.
Use the ping.server.ip property to specify the IP address of a server that Failover Manager can use to confirm that network connectivity is not a problem.
Use the ping.server.command property to specify the command used to test network connectivity.
Use the auto.allow.hosts property to instruct the server to use the addresses specified in the .nodes file of the first node started to update the allowed host list. Enabling this property (setting
auto.allow.hosts to true) can simplify cluster start-up.
Use the stable.nodes.file property to instruct the server to not rewrite the nodes file when a node joins or leaves the cluster. This property is most useful in clusters with unchanging IP addresses.
The db.reuse.connection.count property allows the administrator to specify the number of times Failover Manager reuses the same database connection to check the database health. The default value is 0, indicating that Failover Manager will create a fresh connection each time. This property is not required on a dedicated witness node.
The auto.failover property enables automatic failover. By default, auto.failover is set to true.
Use the auto.reconfigure property to instruct Failover Manager to enable or disable automatic reconfiguration of remaining Standby servers after the primary standby is promoted to Primary. Set the property to
true to enable automatic reconfiguration (the default) or
false to disable automatic reconfiguration. This property is not required on a dedicated witness node. If you are using Advanced Server or PostgreSQL version 11 or earlier, the
recovery.conf file will be backed up during the reconfiguration process.
primary_conninfo is a space-delimited list of keyword=value pairs.
Use the promotable property to indicate that a node should not be promoted. The
promotable property is ignored when a primary agent is started. This simplifies switching back to the original primary after a switchover or failover. To override the setting, use the efm set-priority command at runtime; for more information about the efm set-priority command, see Using the efm Utility.
If the same amount of data has been written to more than one standby node, and a failover occurs, the
use.replay.tiebreaker value will determine how Failover Manager selects a replacement primary. Set the
use.replay.tiebreaker property to
true to instruct Failover Manager to fail over to the node that will come out of recovery faster, as determined by the log sequence number. To ignore the log sequence number and promote a node based on user preference, set use.replay.tiebreaker to false.
You can use the
application.name property to provide the name of an application that will be copied to the
primary_conninfo parameter before restarting an old primary node as a standby.
You should set the
application.name property on the primary and any promotable standby; in the event of a failover/switchover, the primary node could potentially become a standby node again.
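To illustrate what replacing the application_name portion of primary_conninfo means, here is a minimal Python sketch. It assumes the simple case of space-delimited keyword=value pairs with no quoted values containing spaces; the function name and the sample connection string are hypothetical, and this is not how Failover Manager itself is implemented.

```python
def set_application_name(primary_conninfo, application_name):
    """Return primary_conninfo with its application_name value replaced."""
    pairs = []
    for item in primary_conninfo.split():
        key, sep, _ = item.partition("=")
        # Swap in the new value only for an existing application_name keyword.
        if sep and key == "application_name":
            pairs.append("application_name=" + application_name)
        else:
            pairs.append(item)
    return " ".join(pairs)


# Hypothetical connection string, for illustration only.
conninfo = "host=192.0.2.10 port=5444 application_name=old_name"
print(set_application_name(conninfo, "node2"))
# prints host=192.0.2.10 port=5444 application_name=node2
```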
Use the restore.command property to instruct Failover Manager to update the
restore_command when a new primary is promoted.
%h represents the address of the new primary; Failover Manager will replace
%h with the address of the new primary.
%f and %p are placeholders used by the server. If the property is left blank, Failover Manager will not update the
restore_command values on the standbys after a promotion.
See the PostgreSQL documentation for more information about using a restore_command.
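The placeholder handling described for restore.command can be pictured as follows: only %h is expanded, while %f and %p are left for the database server to fill in. This is a hedged sketch; the function name, user name, address, and archive path are hypothetical examples.

```python
def expand_restore_command(restore_command, new_primary_address):
    """Replace %h with the new primary's address; leave %f and %p untouched."""
    return restore_command.replace("%h", new_primary_address)


# Hypothetical property value and address, for illustration only.
template = "scp postgres@%h:/var/lib/archive/%f %p"
print(expand_restore_command(template, "192.0.2.10"))
# prints scp postgres@192.0.2.10:/var/lib/archive/%f %p
```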
The database parameter
synchronous_standby_names on the primary node specifies the names and count of the synchronous standby servers that will confirm receipt of data, to ensure that the primary nodes can accept write transactions. When the
reconfigure.num.sync property is set to true, Failover Manager will reduce the number of synchronous standby servers and reload the configuration of the primary node to reflect the current value.
If you are using the
reconfigure.num.sync property, ensure that the
wal_sender_timeout in the primary database is set to at least ten seconds less than the
Set the reconfigure.sync.primary property to
true to take the primary database out of synchronous replication mode if the number of standby nodes drops below the level required. Set reconfigure.sync.primary to
false to send a notification if the standby count drops, but not interrupt synchronous replication.
If you are using the
reconfigure.sync.primary property, ensure that the
wal_sender_timeout in the primary database is set to at least ten seconds less than the
Use the minimum.standbys property to specify the minimum number of standby nodes that will be retained on a cluster; if the standby count drops to the specified minimum, a replica node will not be promoted in the event of a failure of the primary node.
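Read literally, the minimum.standbys rule is a simple comparison. The sketch below encodes that reading as an assumption: promotion is permitted only while the number of standbys is above the configured minimum. The function name is hypothetical.

```python
def promotion_allowed(standby_count, minimum_standbys):
    """Return True if a standby may be promoted under the minimum.standbys rule."""
    # Once the count has dropped to the minimum, no further standby is promoted.
    return standby_count > minimum_standbys


print(promotion_allowed(2, 1))  # prints True
print(promotion_allowed(1, 1))  # prints False
```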
Use the recovery.check.period property to specify the number of seconds that Failover Manager will wait before it checks to see if a database is out of recovery.
Use the restart.connection.timeout property to specify the number of seconds that Failover Manager will attempt to connect to a newly reconfigured primary or standby node while the database on that node prepares to accept connections.
Use the auto.resume.period property to specify the number of seconds (after a monitored database fails and an agent has assumed an idle state, or when starting in IDLE mode) during which an agent will attempt to resume monitoring that database.
Failover Manager provides support for clusters that use a virtual IP. If your cluster uses a virtual IP, provide the host name or IP address in the
virtual.ip property; specify the corresponding prefix in the
virtual.ip.prefix property. If
virtual.ip is left blank, virtual IP support is disabled.
Use the virtual.ip.interface property to provide the network interface used by the VIP.
The specified virtual IP address is assigned only to the primary node of the cluster. If you specify
virtual.ip.single=true, the same VIP address will be used on the new primary in the event of a failover. Specify a value of false to provide a unique IP address for each node of the cluster.
For information about using a virtual IP address, see Using Failover Manager with Virtual IP Addresses.
If a primary agent is started and the node does not currently have the VIP, the EFM agent will acquire it. Stopping a primary agent does not drop the VIP from the node.
Set the check.vip.before.promotion property to false to indicate that Failover Manager will not check to see if a VIP is in use before assigning it to a new primary in the event of a failure. Note that this could result in multiple nodes broadcasting on the same VIP address; unless the primary node is isolated or can be shut down via another process, you should set this property to true.
Use the following properties to provide paths to scripts that reconfigure your load balancer in the event of a switchover or primary failure scenario. The scripts will also be invoked in the event of a standby failure. If you are using these properties, they should be provided on every node of the cluster (primary, standby, and witness) to ensure that if a database node fails, another node will call the detach script with the failed node's address.
You do not need to set the properties below if you are using Pgpool as the load balancer solution and have set the Pgpool integration properties.
Provide a script name after the
script.load.balancer.attach property to identify a script that will be invoked when a node should be attached to the load balancer. Use the
script.load.balancer.detach property to specify the name of a script that will be invoked when a node should be detached from the load balancer. Include the
%h placeholder to represent the IP address of the node that is being attached or removed from the cluster. Include the
%t placeholder to instruct Failover Manager to include a p (for a primary node) or an s (for a standby node) in the string.
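The placeholder expansion for the load balancer scripts can be sketched as shown below. The script path and address are hypothetical, and the function is an illustration of the substitution only; it does not reproduce how Failover Manager actually launches the script.

```python
import shlex


def build_load_balancer_command(script_property, node_address, is_primary):
    """Expand %h and %t in a script property and return the argument list."""
    node_type = "p" if is_primary else "s"
    # Split the property value into arguments, then expand the placeholders.
    return [
        arg.replace("%h", node_address).replace("%t", node_type)
        for arg in shlex.split(script_property)
    ]


# Hypothetical property value and node address, for illustration only.
print(build_load_balancer_command("/opt/lb/attach.sh %h %t", "192.0.2.11", False))
# prints ['/opt/lb/attach.sh', '192.0.2.11', 's']
```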
script.fence specifies the path to an optional user-supplied script that will be invoked during the promotion of a standby node to primary node.
Use the script.post.promotion property to specify the path to an optional user-supplied script that will be invoked after a standby node has been promoted to primary.
Use the script.resumed property to specify an optional path to a user-supplied script that will be invoked when an agent resumes monitoring of a database.
Use the script.db.failure property to specify the complete path to an optional user-supplied script that Failover Manager will invoke if an agent detects that the database that it monitors has failed.
Use the script.primary.isolated property to specify the complete path to an optional user-supplied script that Failover Manager will invoke if the agent monitoring the primary database detects that the primary is isolated from the majority of the Failover Manager cluster. This script is called immediately after the VIP is released (if a VIP is in use).
Use the script.remote.pre.promotion property to specify the path and name of a script that will be invoked on any agent nodes not involved in the promotion when a node is about to promote its database to primary.
Include the %p placeholder to identify the address of the new primary node.
Use the script.remote.post.promotion property to specify the path and name of a script that will be invoked on any non-primary nodes after a promotion occurs.
Include the %p placeholder to identify the address of the new primary node.
Use the script.custom.monitor property to provide the name and location of an optional script that will be invoked at regular intervals (specified in seconds by the custom.monitor.interval property).
Use custom.monitor.timeout to specify the maximum time that the script will be allowed to run; if script execution does not complete within the time specified, Failover Manager will send a notification.
Set custom.monitor.safe.mode to true to instruct Failover Manager to report non-zero exit codes from the script, but not promote a standby as a result of an exit code.
Use the sudo.command property to specify a command that will be invoked by Failover Manager when performing tasks that require extended permissions. Use this option to include command options that might be specific to your system authentication.
Use the sudo.user.command property to specify a command that will be invoked by Failover Manager when executing commands that will be performed by the database owner.
Use the lock.dir property to specify an alternate location for the Failover Manager lock file; the file prevents Failover Manager from starting multiple (potentially orphaned) agents for a single cluster on the node.
Use the log.dir property to specify the location to which agent log files will be written; Failover Manager will attempt to create the directory if the directory does not exist.
After enabling the UDP or TCP protocol on a Failover Manager host, you can enable logging to syslog. Use the
syslog.protocol parameter to specify the protocol type (UDP or TCP) and the
syslog.port parameter to specify the listener port of the syslog host. The
syslog.facility value may be used as an identifier for the process that created the entry; the value must be between LOCAL0 and LOCAL7.
Use the logging properties, including syslog.enabled, to specify the type of logging that you wish to implement. Set the file-logging property to
true to enable logging to a file; enable the UDP protocol or TCP protocol and set syslog.enabled to
true to enable logging to syslog. You can enable logging to both a file and syslog.
For more information about configuring syslog logging, see Enabling syslog Log File Entries.
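Because the syslog.facility value must fall between LOCAL0 and LOCAL7, a quick pre-flight check is easy to write. The sketch below is a hypothetical helper, not part of Failover Manager, and it assumes an exact upper-case match since property values are case-sensitive.

```python
# The eight permitted facility names, LOCAL0 through LOCAL7.
VALID_SYSLOG_FACILITIES = {"LOCAL" + str(n) for n in range(8)}


def is_valid_syslog_facility(value):
    """Return True if value is one of LOCAL0 through LOCAL7."""
    # Assumption: the comparison is exact and case-sensitive.
    return value in VALID_SYSLOG_FACILITIES


print(is_valid_syslog_facility("LOCAL1"))  # prints True
print(is_valid_syslog_facility("LOCAL8"))  # prints False
```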
Use the efm.loglevel parameter to specify the level of detail logged by Failover Manager. The default value is INFO. For more information about logging, see Controlling Logging.
Use the jvm.options property to pass JVM-related configuration information. The default setting specifies the amount of memory that the Failover Manager agent will be allowed to use.