Can PgBouncer survive transient network events?

February 14, 2020

 

This is the final part of my technical responses to a series of questions about the use of PgBouncer. To read the other posts in the series, use the links below. 

Part 1: Installing, configuring and testing persistent PostgreSQL connection pooling

Part 2: PgBouncer Connection Pooling: What to do when persistent connectivity is lost

Part 3: Can PgBouncer handle failover to a new machine?

Part 4: Can PgBouncer session survive everything we throw at it?

 

It’s been a fun ride, but it’s time to wrap up our series on PgBouncer. If we go back to transaction mode (discussed in Part 2 of this series) and run a large number of users at once, can we now survive a transient network event? Let’s find out. 

$ /usr/pgsql-11/bin/pgbench --client=200 --transactions=100 -U pgb -h 127.0.0.1 -p 6543 pgb

Password: 

starting vacuum...end.

transaction type: <builtin: TPC-B (sort of)>

scaling factor: 1000

query mode: simple

number of clients: 200

number of threads: 1

number of transactions per client: 100

number of transactions actually processed: 20000/20000

latency average = 365.642 ms

tps = 546.982806 (including connections establishing)

tps = 546.987326 (excluding connections establishing)

 

Now we have our baseline. A very decent throughput and all transactions were processed.

Taking the Network Out at the Database

So what happens if we take the network out (twice) at the database end while the transactions are being processed?

$ /usr/pgsql-11/bin/pgbench --client=200 --transactions=100 -U pgb -h 127.0.0.1 -p 6543 pgb

Password: 

starting vacuum...end.

 

Then, on the database (twice).

# ifdown eth0 && ifup eth0

 

We get:

transaction type: <builtin: TPC-B (sort of)>

scaling factor: 1000

query mode: simple

number of clients: 200

number of threads: 1

number of transactions per client: 100

number of transactions actually processed: 20000/20000

latency average = 1132.345 ms

tps = 176.624646 (including connections establishing)

tps = 176.625162 (excluding connections establishing)

 

There’s a drop in throughput (however, that is to be expected) BUT! All transactions were processed, even while these transactions would have certainly been in flight while the database connection was down - not once, but twice. And PgBouncer even held its own connections above the maximum number of connections for the database while all this was going on.

Key Takeaways from the PgBouncer Series

I hope you’ve found this series of blog posts on PgBouncer useful and informative. As a reminder, they were designed to respond to a series of questions I received during a customer training session. These questions included:

  • When PgBouncer is external to a database or databases in replication, can it handle persistent connections during events like network outage, database restarts and even automated failover or controlled switchover between entirely different machines?
  • Making the minimal configuration changes possible, how does PgBouncer handle something as drastic as loss of connection to a database - and even a failover event - while still providing persistent connections (and as a bonus, still providing pooling on reconnection) when the connection to the database comes back up again?

Throughout the series I have attempted to address these questions, and would like to offer these conclusions:

  • The PgBouncer tool provides lightweight connection pooling for Postgres databases on:
    • both Community Edition and EDB Postgres Advanced Server,
    • both Linux and Windows, and
    • can be built from source or downloaded from a repository.
  • The PgBouncer tool can also be used externally to the database box to provide persistent connections. These connections can be persistent over network outages and during a database restart. 
  • Using EDB Failover Manager with VIP capability, connections can be persistent from manual switchover from Master to Standby (e.g. during maintenance) and during automated failover from a Master to Standby using EDB EFM. Also, by connecting to the VIP, connections are held “in-flight” while the VIP is transferred and the Standby is promoted to Master.
  • In transaction mode, the PgBouncer tool can be used to transparently navigate failure events for multiple users - even when not directly connected to the target database!
  • HOWEVER - in PgBouncer statement pool_mode:
    • Use of transactions is prohibited in any code using PgBouncer
    • Failure can lead to partial completion of processes in-flight
    • STATEMENT MODE SHOULD BE USED WITH EXTREME CAUTION!

 

Share this