Exploring failover handling with PGD v5
With a high-availability cluster, the ability to failover is crucial to the overall resilience of the cluster. When the lead data nodes stops working for whatever reason, applications need to be able to continue working with the database with little or no interruption. For PGD, that means directing applications to the new lead data node, which takes over automatically. This is where PGD Proxy is useful. It works with the cluster and directs traffic to the lead data node automatically.
In this exercise, you'll create an application that sends data to the database regularly. Then you'll first softly switch lead data node by requesting a change through the PGD CLI. And then you'll forcibly shut down a database instance and see how PGD handles that.
At the end of each quick starts, you'll have a cluster with four nodes and these roles:
|Host name||Host role|
|kaboom||PGD data node and pgd-proxy co-host|
|kaftan||PGD data node and pgd-proxy co-host|
|kaolin||PGD data node and pgd-proxy co-host|
|kapok||Barman backup node|
You'll use these hostnames throughout this exercise.
You'll be using
xpanes, a utility that allows you to quickly create multiple terminal sessions that you can easily switch between. It isn't installed by default, so you'll have to install it. For this exercise, you'll be launching xpanes from the system where you ran tpaexec to configure your quick-start cluster.
If the system is running Ubuntu, run this:
These are the installation instructions from the xpanes repository. If you aren't on Ubuntu, the repository also contains installation instructions for other systems.
With xpanes installed, you can create an SSH session with all four servers by running:
After running these commands, there are four panes. The four panes are connected to kaboom, kaolin, kaftan, and kapok and logged in as the root user on each. You need this privilege so you can easily stop and start services later in the exercise.
Press Control-b followed by q to briefly display the numeric values for each pane.
To switch the focus between the panes, you can use Control-b and the cursor keys to navigate between them. Or you can use Control-b followed by q and the number of the pane you want to focus on. We'll show both ways.
Use Control-b ↓ Control-b → or Control-b q 3 to move the focus to the bottom-right pane, which is the kapok host. This server is responsible for performing backups. You'll use this as the base of operations for your demo application. You can use Barman credentials to connect to the database servers and proxies:
This code connects to the proxy on the kaboom host, which also runs a Postgres instance as part of the cluster.
The next step is to create the table your application will write to:
This code first drop the
ping table. Then it recreates the
ping table with an id primary key and two text fields for a node and timestamp. The table should now be ready. To check, use Control-b ← Control-b ↑ or Control-b q 0 to move to the top left pane, which puts you on the kaboom server. In this pane, become the enterprisedb user so you can easily connect to the database:
You can now connect to the local database by running:
This command connects you directly to the local database instance on kaboom. Use
\dt to view the available tables:
\d ping shows that the DDL to create ping is on the kaboom server:
If you want to be sure that this table is replicated, you can connect to another node in the cluster and look. The
\c command in psql lets you connect to another server. To connect to the kaftan node, run:
You'll see a login message similar to this:
\d ping, and you'll see the same results on the kaftan node.
To reconnect to the kaboom node, run:
Next, you want to monitor the activity of the ping table. Enter this SQL to display the 10 most recent entries:
To run this command more than once, use the
\watch command in the shell, which executes the last query at regular intervals. To update every second, enter:
So far, there's nothing to see. You'll add activity soon.
Return to the Barman host kapok by using Control-b ↓ Control-b → or Control-b q 3.
This session is still logged into the psql session. Since you next want to run a shell script, you need to exit psql. Press Control-d.
The shell prompt now reads:
If it says
sudo -iu barman to become the Barman user again.
The application you'll create is simple. It gets the node to write to and a timestamp for the ping. Then, as quickly as it can, it writes a new ping to the ping table.
In the shell, enter:
In a more readable form, that is:
In a constant loop, you call the psql command, telling it to connect to any of the three proxies as hosts, giving the proxy port and selecting the bdrdb database. You also pass a command that inserts two values into the ping table. One of the values comes from
bdr.local_node_summary, which contains the name of the node you're actually connected to. The other value is the current time.
Once the loop is running, new entries appear in the table. You'll see them in the top-left pane where you set up the monitor.
You can now start testing failover.
For this part of the process, switch to the host kaftan, which is in the lower-left corner. Use Control-b ← or Control-b q 2 to switch focus to it.
To gain appropriate privileges to run
pgd, at the PGD command line interface, run:
To see the state of the cluster, run:
You'll see output like this:
The global group
democluster includes all the subgroups. The
dc1_subgroup is the data cluster you're working with. That group name value is derived from the location given in the quick start when you configured this cluster. Each location gets its own subgroup so you can manage it independently of other locations, or clusters.
If you skip to the right of the table, you can see that the current write leader for the group—the server where all the proxies send their updates—is kaboom.
switchover command to the cluster group to change leader. Run this command:
The node name is the host name for another data node in the dc1_subgroup group.
You'll see one of two responses. When you ran the
show-groups command, if it showed kaolin as the write leader, you'll see:
This means that kaolin was already elected write leader, so switching has no effect. For this exercise, retry the switchover to another host, substituting kaboom or kaftan as the node name.
When you select a host that wasn't the current write leader, you'll see the other response:
If you look in the top-left pane, you'll see the inserts from the script switching and being written to the node you just switched to.
Observe the id number
Notice that the id number being generated is from a completely different range of values, too. That's because the system transparently made the sequence generating the ID a global sequence. Read more about global sequences and how they work in Sequences.
You might also notice an error in the lower-right pane, as an inflight update is canceled by the switch. The script then continues writing.
Being able to switch leader is useful for planned maintenance; you tell the cluster to change configuration. What if unexpected changes happen? You'll create that scenario now.
In the lower-left pane, set the leader to kaolin.
Then change focus to the top-right pane using Control-b ↑ Control-b → or Control-b q 1, which is the session on the kaolin host.
Turn off the Postgres server by running:
In the top-left pane, you'll see the monitored table switch from kaolin to another node as the cluster subgroup picks a new leader. The script in the lower-right pane might show some errors as updates are canceled. However, as soon as a new leader is elected, it starts routing traffic to that leader.
Switch to the lower-left pane using Control-b ↓ Control-b ← or Control-b q 2, and run:
You'll see something like:
The kaolin node is down, and updates are going to a different write leader.
While kaolin is down, the logical replication at the heart of PGD is tracking how far out of sync kaolin is with the cluster. To see that for yourself, run:
This command displays the current replication rates between servers:
Looking at this output, you can see kaolin has a three-minute replay lag and around 292KB of data to catch up on if it came back now. The longer kaolin is down, the larger the replay lag gets. If you rerun the monitoring command, you'll see the numbers went up:
Another 46 seconds have passed, and the lag has grown by 74KB. Next, bring back the node, and see how the system recovers.
You can bring back the Postgres service on kaolin. Switch back to the top-right pane using Control-b ↑ Control-b → or Control-b q 1, and run:
You won't see any change. Although the database service is back up and running, the cluster isn't holding an election, and so the leader remains in place. Switch to the lower-left pane using Control-b ↓ Control-b ← or Control-b q 2, and run:
Now you'll see:
As soon as kaolin is back in the cluster, it begins synchronizing with the cluster. It does that by catching up on that replay data. Run:
The output looks like this:
As you can see, there's no replay lag now, as kaolin has completely caught up automatically.
With kaolin fully back in service, you can leave everything as it is. There's no need to change the server that's write leader. The failover mechanism is always ready to bring another server up to write leader when needed.
If you want, you can make kaolin leader again by running:
This command returns kaolin to write lead. The application's updates will follow, as the proxies track the write leader.
Proxies can also failover. To experience this, make sure your focus is still on the lower-left pane, and run:
exit to exit the enterprisedb user and return to the admin/root shell. You can now stop the proxy service on this node by running:
A brief error appears in the lower-right window as the script switches to another proxy. The write leader doesn't change, though, so the switch of proxy doesn't show in the top-left pane where the monitor query is running.
Bring the proxy service on kaftan back by running:
You can quickly exit Tmux and all the associated sessions. First terminate any running processes, as they otherwise continue running after the session is killed. Press Control-B and then enter
:kill-session. This approach is simpler than quitting each pane's session one at a time using Control-D or
This example uses the quick-start configuration of three data nodes and one backup node. You can configure a cluster to have two data nodes and a witness node, which is less resilient to a node failing. Or you can configure five data nodes, which is much more resilient to a node failing. With this configuration, you can explore how failover works for your applications. For clusters with multiple locations, the same basic rules apply: taking a server down elects a new write leader that proxies now point to.