You Can Now Pick Your Favorite Compression Algorithm For Your WALs!

October 24, 2022

Each new major PostgreSQL release brings exciting new features and enhancements in various areas. In this blog post, we discuss one of the changes in Postgres 15 that the Postgres project classifies as a performance improvement.


New WAL compression algorithms

Since version 9.5, PostgreSQL has been able to compress the full-page images written to WAL (when full-page writes are enabled) via the wal_compression parameter. From the documentation:

wal_compression (boolean)

When this parameter is on, the PostgreSQL server compresses full page images written to WAL when full_page_writes is on or during a base backup. A compressed page image will be decompressed during WAL replay. The default value is off. Only superusers can change this setting.

Turning this parameter on can reduce the WAL volume without increasing the risk of unrecoverable data corruption, but at the cost of some extra CPU spent on the compression during WAL logging and on the decompression during WAL replay.

Before PostgreSQL 15, only one compression algorithm was available when WAL compression was enabled: pglz, the built-in compression method.

With the following two commits, the external compression methods lz4 and zstd are supported as well: "Add support for LZ4 with compression of full-page writes in WAL" and "Add support for zstd with compression of full-page writes in WAL".
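As a minimal sketch of how one of the new algorithms could be selected on a PostgreSQL 15 server: the statements below assume a superuser session and a server built with lz4 support (--with-lz4); zstd likewise requires a build with --with-zstd. The choice of lz4 here is only an illustration, not a recommendation.

```sql
-- Check the current setting (the default is off).
SHOW wal_compression;

-- Pick an algorithm: pglz, lz4 or zstd (on is equivalent to pglz).
-- Assumption: this server was built with lz4 support.
ALTER SYSTEM SET wal_compression = 'lz4';

-- The parameter does not need a restart; reloading the configuration is enough.
SELECT pg_reload_conf();

-- Confirm the new value is active.
SHOW wal_compression;
```

If the server was built without the requested library, the ALTER SYSTEM statement should be rejected because the value is not valid for that build.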


Benchmarking

Let’s take a look at the impact of the different algorithms on write throughput and transactions per second (TPS). pgbench is used to generate database activity.


Inputs/Outputs

The following chart shows the write throughput captured on the storage device hosting the WAL files during pgbench execution when using different wal_compression values.

Chart 1


Unsurprisingly, enabling wal_compression, whatever the algorithm being used, has a significant positive impact on IOs. So, if your system is struggling with IO on WALs, you might consider compressing them. (On top of that, you might want to do that to also reduce bandwidth consumption for your backups and streaming replication systems.)
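The charts in this post come from write throughput measured on the storage device. As a complementary, hedged sketch, the amount of WAL generated during a test run can also be read from inside the database with the pg_stat_wal view (available since PostgreSQL 14). This is not the method used for the charts, just one way to compare settings on your own system.

```sql
-- Reset the WAL statistics before starting the workload.
SELECT pg_stat_reset_shared('wal');

-- ... run the workload (for example a pgbench run) ...

-- Read back how much WAL was generated since the reset.
SELECT wal_records,                               -- number of WAL records
       wal_fpi,                                   -- number of full-page images
       pg_size_pretty(wal_bytes) AS wal_volume    -- total WAL generated
FROM pg_stat_wal;
```

Repeating the same workload with each wal_compression value and comparing wal_volume gives a rough idea of how much each algorithm shrinks the WAL stream.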

The following chart focuses on comparing the different compression algorithms:

Chart 2


According to this last chart, we can conclude:

  • pglz provides slightly better compression than lz4
  • zstd provides better compression than pglz

Of course, as always when we explore performance, these results may not carry over to your specific use case. You will need to test on your own hardware and software, with your own workload, to find out which compression algorithm is best for you.


TPS

Chart 3

The number of transactions per second confirms that enabling WAL compression is better than leaving it off. It also shows that the zstd compression algorithm performs slightly better than the other two.

 

Conclusion

As always with performance, depending on your workload, code, network, hardware and software stack, changes can have little to no effect or make a massive difference. You need to benchmark to find out if it's worth activating it in your particular use case or not. 

If you're interested in the other performance improvements in Postgres 15, please look at the release notes and learn about hash lookup for NOT IN clauses with many constants, SELECT DISTINCT parallelization, faster UTF-8 encoding validation, and performance improvements for sorting operations. It is quite impressive how each new version of Postgres brings more performance improvements!
