Using Failover Manager for high availability v9
This document walks you through setting up Failover Manager for a new PEM server installation, not an existing one.
Postgres Enterprise Manager (PEM) helps database administrators, system architects, and performance analysts to administer, monitor, and tune Postgres database servers.
Failover Manager is a high-availability tool from EDB that enables a Postgres primary node to failover to a standby node during a software or hardware failure on the primary.
The examples in the following sections use these IP addresses:
- 172.16.161.200 - PEM Primary
- 172.16.161.201 - PEM Standby 1
- 172.16.161.202 - PEM Standby 2
- 172.16.161.203 - EFM Witness Node
- 172.16.161.245 - PEM VIP (used by agents and users to connect)
The following must use the VIP address:
- The PEM agent binding of the monitored database servers
- Accessing the PEM web client
- Accessing the webserver services
Initial product installation and configuration
Install the following on the primary and one or more standbys:
- EDB Postgres Advanced Server (backend database for PEM Server)
- PEM server
- EDB Failover Manager 4.1
Refer to the installation instructions in the product documentation using these links, or see the instructions on the EDB repos website. Replace `USERNAME:PASSWORD` with your username and password in the instructions to access the EDB repositories.
Make sure that the database server is configured to use the scram-sha-256 authentication method, as the PEM server configuration script doesn't work with trust authentication.
You must install the `java-1.8.0-openjdk` package to install EFM.
Configure the PEM server on the primary server as well as on all the standby servers with an initial configuration of type 1 (web services and database):
For more details on configuration types, see Configuring the PEM server on Linux.
Add the following ports to the firewall on the primary and all the standby servers to allow access:
- `8443` for the PEM server (HTTPS)
- `5444` for EPAS 13
- `7908` for EFM administration
Set up the primary node for streaming replication
Create the replication role:
Give the password of your choice.
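A minimal sketch, run as the enterprisedb user; the role name `repuser` and the password are placeholders:

```sql
CREATE ROLE repuser WITH LOGIN REPLICATION PASSWORD 'CHANGE_ME';
```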
Configure the following parameters in the `postgresql.conf` file:
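The exact values depend on your environment; a typical set of streaming replication settings for EPAS 13 might be:

```ini
listen_addresses = '*'
wal_level = replica
max_wal_senders = 10
wal_keep_size = 512MB
hot_standby = on
```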
For more information on configuring parameters for streaming replication, see the PostgreSQL documentation.
The configuration parameters might differ for different versions of the database server. You can contact EDB Support for help with setting up these parameters.
Add the following entry in the host-based authentication file (`/var/lib/edb/as13/data/pg_hba.conf`) to allow the replication user to connect from all the standbys:
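For example, using the example subnet and a hypothetical replication role named `repuser`:

```
host    replication    repuser    172.16.161.0/24    scram-sha-256
```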
You can change the CIDR range of the IP address, if needed.
Modify the host-based authentication file (`/var/lib/edb/as13/data/pg_hba.conf`) so that the pem_user role can connect to all databases using the scram-sha-256 authentication method:
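The entry might look like this, using the example subnet:

```
host    all    pem_user    172.16.161.0/24    scram-sha-256
```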
Restart the EPAS 13 server.
Set up the standby nodes for streaming replication
Stop the service for EPAS 13 on all the standby nodes:
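On systemd-based systems, assuming the default EPAS 13 service name:

```shell
systemctl stop edb-as-13
```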
This example uses the pg_basebackup utility to create the replicas of the PEM backend database server on the standby servers. When using pg_basebackup, you need to stop the existing database server and remove the existing data directories.
Remove the data directory of the database server on all the standby nodes:
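For example, using the data directory path shown earlier:

```shell
rm -rf /var/lib/edb/as13/data/*
```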
Create a `.pgpass` file in the home directory of the enterprisedb user on all the standby nodes:
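A sketch of the file content, using the example primary address and a hypothetical `repuser` replication role (the format is `host:port:database:user:password`):

```
172.16.161.200:5444:*:repuser:CHANGE_ME
```

Restrict its permissions with `chmod 600`, or the client ignores the file.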
Take the backup of the primary node on each of the standby nodes using pg_basebackup:
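For example, run as the enterprisedb user on each standby; `repuser` is the hypothetical replication role, and the `-R` option writes the recovery settings for you:

```shell
pg_basebackup -h 172.16.161.200 -p 5444 -U repuser \
    -D /var/lib/edb/as13/data -X stream -R -P
```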
The backup command creates the `standby.signal` files on the standby nodes. The `postgresql.auto.conf` file has the following content:
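When the backup is taken with pg_basebackup's `-R` option, the generated settings look roughly like this (the details vary with the connection options used):

```ini
primary_conninfo = 'user=repuser passfile=''/var/lib/edb/.pgpass'' host=172.16.161.200 port=5444 sslmode=prefer'
```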
In the `postgresql.conf` file on each of the standby nodes, edit the following parameter:
Start the EPAS 13 database server on each of the standby nodes:
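Again assuming the default service name:

```shell
systemctl start edb-as-13
```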
Copy the following files from the primary node to the standby nodes at the same location, overwriting any existing files. Set the permissions on the files:
This step ensures that the webserver is configured on the standby and is disabled by default. Switchover by EFM enables the webserver.
Whenever the certificates are updated, manually keep them in sync on the primary and the standbys.
- Run the `configure-selinux.sh` script to configure the SELinux policy for PEM:
- Disable and stop HTTPD and PEM agent services if they're running on all replica nodes:
At this point, a PEM primary server and two standbys are ready to take over from the primary whenever needed.
Set up EFM to manage failover on all hosts
Prepare the primary node to support EFM:
- Create a database user efm to connect to the database servers.
- Grant the execute privileges on the functions related to WAL logs and the monitoring privileges to the user.
- Add entries in
pg_hba.confto allow the efm database user to connect to the database server from all nodes on all the hosts.
- Reload the configurations on all the database servers.
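The steps above can be sketched as follows, run on the primary; the password is a placeholder, and the `pg_monitor` role is used here for the monitoring privileges:

```sql
CREATE ROLE efm WITH LOGIN PASSWORD 'CHANGE_ME';

-- Monitoring privileges
GRANT pg_monitor TO efm;

-- Execute privileges on the WAL-related functions
GRANT EXECUTE ON FUNCTION pg_current_wal_lsn() TO efm;
GRANT EXECUTE ON FUNCTION pg_last_wal_replay_lsn() TO efm;
GRANT EXECUTE ON FUNCTION pg_wal_replay_pause() TO efm;
GRANT EXECUTE ON FUNCTION pg_wal_replay_resume() TO efm;
GRANT EXECUTE ON FUNCTION pg_reload_conf() TO efm;
```

A matching `pg_hba.conf` entry, such as `host all efm 172.16.161.0/24 scram-sha-256`, then lets the efm user connect from every node; reload the configuration afterward.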
Create the scripts on each node to start/stop the PEM agent:
Create a sudoers file (`/etc/sudoers.d/efm-pem`) on each node to allow the efm user to start/stop the pemagent:
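A sketch of the sudoers entry, assuming the PEM agent runs as the `pemagent` systemd service:

```
efm ALL=(ALL) NOPASSWD: /usr/bin/systemctl start pemagent, /usr/bin/systemctl stop pemagent
```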
Create the `efm.nodes` file on all nodes using the sample file (`/etc/edb/efm-4.1/efm.nodes.in`), and give read-write access to the efm OS user:
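For example:

```shell
cp /etc/edb/efm-4.1/efm.nodes.in /etc/edb/efm-4.1/efm.nodes
chown efm:efm /etc/edb/efm-4.1/efm.nodes
chmod 600 /etc/edb/efm-4.1/efm.nodes
```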
Add the IP address and EFM port of the primary node in the `/etc/edb/efm-4.1/efm.nodes` file on the standby nodes:
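With the example addresses and the EFM admin port listed above, the file on each standby contains:

```
172.16.161.200:7908
```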
Create the `efm.properties` file on all the nodes using the sample file (`/etc/edb/efm-4.1/efm.properties.in`). Grant read access to all the users:
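For example:

```shell
cp /etc/edb/efm-4.1/efm.properties.in /etc/edb/efm-4.1/efm.properties
chown efm:efm /etc/edb/efm-4.1/efm.properties
chmod a+r /etc/edb/efm-4.1/efm.properties
```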
Encrypt the efm user's password using the efm utility:
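For example, assuming the EFM cluster is named `efm` (the cluster name must match the properties file name); the utility prompts for the password and prints the encrypted value:

```shell
/usr/edb/efm-4.1/bin/efm encrypt efm
```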
Edit the following parameters in the properties file:
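A sketch of the key settings on the primary node; the values follow the example topology, the network interface name is an assumption, and the encrypted password comes from the `efm encrypt` step:

```ini
db.user=efm
db.password.encrypted=<encrypted password>
db.port=5444
db.database=pem
db.service.owner=enterprisedb
db.service.name=edb-as-13
db.bin=/usr/edb/as13/bin
db.data.dir=/var/lib/edb/as13/data
# This node's own address and EFM port; it differs on each node
bind.address=172.16.161.200:7908
# Set to true only on the witness node
is.witness=false
virtual.ip=172.16.161.245
# The interface name is an assumption; match your system
virtual.ip.interface=eth0
virtual.ip.prefix=24
virtual.ip.single=true
```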
Set the value of the `is.witness` configuration parameter to `true` on the witness node:
Enable and start the EFM service on the primary node:
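Assuming the default EFM 4.1 service name:

```shell
systemctl enable edb-efm-4.1
systemctl start edb-efm-4.1
```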
Allow the standbys to join the cluster started on the primary node:
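Run on the primary once its EFM service is up, assuming the cluster name `efm`; one `allow-node` call per joining node:

```shell
/usr/edb/efm-4.1/bin/efm allow-node efm 172.16.161.201
/usr/edb/efm-4.1/bin/efm allow-node efm 172.16.161.202
/usr/edb/efm-4.1/bin/efm allow-node efm 172.16.161.203
```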
Enable and start the EFM service on the standby nodes and the EFM witness node:
Check the EFM cluster status from any node:
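Assuming the cluster name `efm`:

```shell
/usr/edb/efm-4.1/bin/efm cluster-status efm
```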
This status confirms that EFM is set up successfully and managing the failover for the PEM server.
In case of failover, one of the standbys is promoted to primary, and the PEM agents connect to the new primary node. You can replace the failed primary node with a new standby using this procedure.
The current limitations include:
- Web console sessions for the users are lost during the switchover.
- Per-user settings set from the Preferences dialog box are lost, as they’re stored in local configuration files on the file system.
- Background processes started by the Backup, Restore, and Maintenance dialog boxes, and their logs, aren't shared between the systems. They are lost during switchover.