When Failure is Not an Option – The New EDB Failover Manager

December 20, 2013

EnterpriseDB has just released a critical component for meeting stringent, enterprise-level High Availability demands (the "high nines" of uptime) with Postgres. As the newest member of our family of integrated tools for Postgres Plus Advanced Server and community PostgreSQL deployments, EDB Failover Manager addresses a long-standing gap in Postgres HA solutions. Postgres users can now rely on EnterpriseDB as a single source of development, distribution and support for all the database components of their High Availability systems, without having to develop workarounds or integrate third-party tools.

Let’s examine how it works. EDB Failover Manager creates fault-tolerant database clusters in multiple HA configurations, minimizing downtime and ensuring that data remains available in the event of a failure. With EDB Failover Manager, the cluster consists of a Master agent, a Standby agent and a Witness agent that reside on separate servers in a cloud or on a traditional network and communicate using the JGroups toolkit. Each agent is a process that runs on one of the following hosts on the network:

  • Master - The Master node is the primary database server that is servicing database clients.
  • Standby - The Standby node is a streaming replication server being synchronized with the Master.
  • Witness - The Witness node confirms assertions of either the Master or the Standby in a failover scenario.

JGroups provides technology that allows EDB Failover Manager to create clusters whose member nodes can communicate with each other and detect node failures. For more information about JGroups, visit the official project site at http://www.jgroups.org

In this design, the Witness node is a specialized node that acts as a safeguard against “split brain” scenarios. When the replica ‘thinks’ the master node/database has failed, it asks the Witness to confirm. If the Witness confirms, failover is triggered. If there’s no confirmation, failover is not triggered, which prevents situations where both the master and replica ‘think’ they are the master node. This is particularly useful if, for instance, the network connection between the master and replica is down or temporarily delayed. In that case the replica's heartbeat check of the master may appear to have failed, which on its own would lead to a false failover. Having the Witness node also check adds a layer of confirmation that a serious problem exists and failover operations should commence.
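The confirmation rule described above can be sketched in a few lines of Python. This is only an illustration of the decision logic, not EDB Failover Manager code: the function name `should_fail_over` and its two boolean parameters are hypothetical, and the real agents exchange this information over JGroups rather than through a function call.

```python
def should_fail_over(standby_sees_master: bool, witness_sees_master: bool) -> bool:
    """Illustrative rule only (hypothetical names, not the product's API).

    Failover is triggered only when the standby's heartbeat check fails
    AND the witness confirms that it cannot reach the master either.
    """
    standby_suspects_failure = not standby_sees_master
    witness_confirms_failure = not witness_sees_master
    return standby_suspects_failure and witness_confirms_failure


# The link between standby and master is down, but the witness can still
# reach the master: no failover, so a split brain is avoided.
print(should_fail_over(standby_sees_master=False, witness_sees_master=True))

# Neither the standby nor the witness can reach the master: fail over.
print(should_fail_over(standby_sees_master=False, witness_sees_master=False))
```

The point of requiring both conditions is that a single failed heartbeat from the replica's vantage point is never enough to promote it.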

Your database needs to be online and available around the clock to serve your business, your customers and your partners. Hardware failures, network glitches, or a server crash can cost you money and opportunities. Systems need to remain accessible during planned maintenance as well. A cornerstone for ensuring ongoing data access during scheduled downtime or unexpected failures is a High Availability design.

End users must weigh their own needs and tolerances for outages or data loss, as well as their sensitivity to cost and complexity, when designing a system for High Availability. The good news is that with EDB Failover Manager, Postgres users now have a fully integrated and configurable HA tool based on industry-proven, stable technology that is fully supported by EnterpriseDB. The technology, in fact, has been in use for several years in our Postgres Plus Cloud Database.

EnterpriseDB is staging a webinar on January 16, 2014 about EDB Failover Manager. For details and to register, please visit: http://www.enterprisedb.com/products/edb-failover-manager

Gary Carter is Director of Product Marketing at EnterpriseDB.
