PG Phriday: Terrific Throughput Tracking

January 25, 2019

Postgres has a lot of built-in information functions that most people don’t know about. Some of these are critical components to identifying replication lag, but what if we could use them for something else, like throughput?

This man’s one simple trick can track actual database throughput; DBAs hate him!

Everybody Knows

Let’s take a look at a common query we might use to track replication lag between a Postgres 11 Primary and one or more Replica nodes.


SELECT client_addr,
       pg_wal_lsn_diff(
           pg_current_wal_lsn(),
           sent_lsn
       ) AS sent_lag,
       pg_wal_lsn_diff(
           pg_current_wal_lsn(),
           write_lsn
       ) AS write_lag,
       pg_wal_lsn_diff(
           pg_current_wal_lsn(),
           flush_lsn
       ) AS flush_lag,
       pg_wal_lsn_diff(
           pg_current_wal_lsn(),
           replay_lsn
       ) AS replay_lag
  FROM pg_stat_replication;

This gives us the number of bytes the Primary has transmitted to remote servers, given a number of different metrics. How much have we sent? Has that data been written to disk? How much has been synced and is thus safe from crashes? How much has actually been replayed on the Postgres data files? For more information on the pg_stat_replication view, check out the documentation.

If we were using slots instead, we might do this:


SELECT slot_name,
       pg_wal_lsn_diff(
         pg_current_wal_lsn(),
         restart_lsn
       ) as restart_lag,
       pg_wal_lsn_diff(
         pg_current_wal_lsn(),
         confirmed_flush_lsn
       ) as flush_lag
  FROM pg_replication_slots;

There is somewhat less information in pg_replication_slots, as it only really tells us the minimum LSN position the Primary needs to retain for that replica. Of course, if that stops advancing while the Primary keeps working, that’s lag we can see even while the replica is disconnected.

This is especially important with replication slots, since they necessarily mean the Primary is retaining WAL files a replica may need. We don’t want the replay lag value to get too high, or the primary node could run out of disk space. That is definitely something we want to monitor.

That’s How it Goes

It’s also a hint as to our database throughput. If we "pin" the value by disconnecting a replica, the value will continue to increase at the same rate the Primary node produces WAL traffic.

But wait a minute; what good is database throughput? It must be pretty useful, as developers have included several indicators in the pg_stat_database view. That view will tell us how many inserts, updates, and deletes a database may have handled, as well as commit versus rollback activity.

Despite all that (and much more), this statistical view won’t tell us anything about data volume, plus it isn’t persistent, and replicas maintain their own stats independently. What if there was a way to determine how active our Primary database is forever, from any node, any time we can take two readings separated by time?

Ah Give or Take a Night or Two

Consider a brand new fresh instance that has just been bootstrapped with initdb. It has processed essentially no transactions, contains no tables, and isn’t currently doing anything. Had we known we’d want to track overall instance activity, we may have called pg_current_wal_lsn() to retrieve the starting LSN.

For example:


SELECT pg_current_wal_lsn();

 pg_current_wal_lsn 
--------------------
 0/1652EC0

In theory, we could use this value on any future call to pg_wal_lsn_diff to obtain the amount of bytes produced by the database instance since its inception. Unfortunately, it’s hardly ever the case we have such an opportunity unless we created the Postgres instance ourselves.

But what is really the difference between 0/1652EC0, and the absolute zero value of 0/0?


SELECT pg_wal_lsn_diff('0/1652EC0', '0/0');

 pg_wal_lsn_diff 
-----------------
        23408320

That’s not even 24MB, which is less than two 16MB WAL files. In that case, we can safely substitute 0/0 and suddenly any execution of pg_current_wal_lsn is trivially transformed into an absolute amount of data produced by our Postgres instance.

And a long stem rose

Let’s initialize a medium 100-scale pgbench instance and check the amount of bytes reported at the end.


SELECT a.total_bytes, 
       (a.total_bytes/1024/1024)::BIGINT AS mbytes
  FROM pg_wal_lsn_diff(pg_current_wal_lsn(), '0/0') a (total_bytes);

 total_bytes | mbytes 
-------------+--------
  1328940592 |   1267

So Postgres has handled about 1267 MB of data since we started. How about after processing a one-minute pgbench test?


 total_bytes | mbytes 
-------------+--------
  1422082032 |   1356

What if we add another index to the pgbench_accounts table?


CREATE INDEX idx_account_balance ON pgbench_accounts (abalance);

 total_bytes | mbytes 
-------------+--------
  1626342784 |   1551

We included that last example to illustrate that this really measures WAL traffic throughput, thus anything that produces WAL traffic will drive the value higher. It’s not just writes to the instance, but anything that modifies its contents. Even read-only transactions will slowly drive this value forward, as these require various database resources including transaction identifiers.

Is there a better cumulative indicator for how much real actual work a Postgres instance is performing?

Before it Blows

Real work is the key to this metric. We could send the query to any number of monitoring tools, and so long as they can compare previous values across time, we now have a total activity chart for our Postgres instance.

Did something perform a large bulk insert? Is client activity producing a steady stream of data and transaction traffic that’s threatening to overwhelm our hardware? Do we need to reevaluate resource allocation to prevent such throughput spikes from negatively impacting overall performance?

These are all questions the knowledge of Postgres throughput can provide. If the graphs show an upward trend, we can even plan for future expansions. It’s a cool piece of data that few, if any, existing monitoring scripts such as check_postgres bother to integrate.

Being proactive is always preferable to the alternative, so adding yet another query to our monitoring toolkit is always a step forward. Don’t forget to watch those charts.

And everybody knows that it’s now or never.

Share this

More Blogs