CloudNativePG (CNPG) has by now been adopted by many users and companies, and thanks to their feedback it has been improved constantly. A few features were still missing, though, and one of the most requested was the ability to upgrade PostgreSQL to a higher major version declaratively. Not anymore.
CloudNativePG 1.26 introduces declarative offline in-place major upgrades of PostgreSQL, made possible by the native Postgres tool pg_upgrade.
In this blog post, we will see how easy it is to upgrade a CNPG cluster, with different PostgreSQL configurations, to a higher major version with just a tag replacement in the YAML manifest.
What is an Offline In-place Major Upgrade?
In PostgreSQL, major upgrades are the kind of operation that gives most SysAdmins/DBAs a hard time sleeping at night. Not because they are impossible, but because they require time to prepare and test before actually being performed, and they can sometimes fail, requiring manual intervention from the SysAdmin/DBA to perform a rollback.
Why? While a minor version upgrade is usually as easy as replacing the PostgreSQL binaries, this is not true for a major version upgrade. The reason is the binary incompatibility between major versions, which results in a different format of the data stored on disk. For example, a PostgreSQL 17 server can't read a data directory created by a PostgreSQL 16 binary without the proper transformation. One solution is to let PostgreSQL 17 ingest all the data at the logical level, by performing a dump of the PostgreSQL 16 database and restoring it into the new server. Another is to use pg_upgrade to directly convert the existing PostgreSQL files in the data directory, at the binary level, into a format compatible with the target server version.
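As a minimal sketch of the logical, dump-and-restore path (the hostnames below are placeholders, and the exact flags depend on your setup), the entire content of the old server can be piped into the new one:

# Dump every database from the old PostgreSQL 16 server and replay it
# on the new PostgreSQL 17 server. "pg16-host" and "pg17-host" are
# hypothetical hostnames used only for illustration.
pg_dumpall -h pg16-host -U postgres | psql -h pg17-host -U postgres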
The above methods are considered offline upgrades, to distinguish them from the ones based on logical streaming replication. Online upgrades allow both servers and the applications to stay online while the data is migrated: in this scenario, the switchover to the new server version takes just the time required to point the application at the new PostgreSQL server. This is also known as a “blue/green deployment” upgrade. With an offline upgrade, instead, the applications must be shut down for the entire period required to migrate the data.
What's new then?
While logical streaming replication is a relatively new technology that can help mitigate the burden of a PostgreSQL major version migration, it has its own limitations. For example, it cannot automatically synchronize sequences on the target server. It also requires a new cluster to be up and running to receive the data from the original one, doubling the number of nodes and the amount of data. For completeness: CNPG provides support for logical replication as well.
The new “offline in-place major upgrade” feature of CNPG, as the name suggests, upgrades the current cluster to the new version “in place”. This is faster than running a pg_dump and a pg_restore, and it saves disk space by replacing only the required files while reusing most of the content of the data directory. More importantly, it just requires increasing the major version of the Postgres image in the YAML definition, making the entire procedure transparent and achieving a declarative PostgreSQL major upgrade.
How?
As mentioned, CNPG takes advantage of the pg_upgrade binary to convert the data directory between the two PostgreSQL versions. To speed up the procedure, the binary is executed with the --link option, which creates hard links to the files of the source data directory instead of copying them into the target one. This way, only the files that actually need to change are rewritten, saving time and space for the entire process.
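Under the hood, the upgrade job runs something roughly equivalent to the following pg_upgrade invocation; the paths and binary locations below are purely illustrative and not necessarily the ones used inside the CNPG image:

# Hypothetical paths, shown only to illustrate how --link is used.
pg_upgrade \
  --old-bindir=/usr/lib/postgresql/13/bin \
  --new-bindir=/usr/lib/postgresql/17/bin \
  --old-datadir=/var/lib/postgresql/data/pgdata \
  --new-datadir=/var/lib/postgresql/data/pgdata-new \
  --link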
There are a few ways to manage PostgreSQL versions in CNPG: one is setting the imageName field in the YAML manifest of the Cluster CRD; another is using an ImageCatalog CRD referenced from the Cluster manifest. Any change to the PostgreSQL version on either the Cluster or the ImageCatalog resource will trigger the Operator to start the upgrade procedure.
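For reference, the catalog-based approach looks roughly like the following sketch; the catalog name and image tags are just examples, and the exact schema should be checked against the CNPG documentation for your operator version:

apiVersion: postgresql.cnpg.io/v1
kind: ImageCatalog
metadata:
  name: postgresql
spec:
  images:
    - major: 13
      image: ghcr.io/cloudnative-pg/postgresql:13.21
    - major: 17
      image: ghcr.io/cloudnative-pg/postgresql:17.5
---
apiVersion: postgresql.cnpg.io/v1
kind: Cluster
metadata:
  name: cluster-example
spec:
  instances: 3
  storage:
    size: 1Gi
  imageCatalogRef:
    apiGroup: postgresql.cnpg.io
    kind: ImageCatalog
    name: postgresql
    major: 17   # bumping this value triggers the declarative major upgrade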
The procedure will be as follows:
- Shuts down all cluster pods to ensure data consistency.
- Initiates a new upgrade job, which:
  - Verifies that the binaries in the image and the data files align with a major upgrade request.
  - Creates new directories for PGDATA and, where applicable, WAL files and tablespaces.
  - Performs the upgrade using pg_upgrade with the --link option.
  - Upon successful completion, replaces the original directories with their upgraded counterparts.
Let's try it
Assuming we already have a 3-instance CNPG cluster with synchronous replication and PostgreSQL 13 up and running:
cat cluster-example.yaml
apiVersion: postgresql.cnpg.io/v1
kind: Cluster
metadata:
  name: cluster-example
spec:
  imageName: ghcr.io/cloudnative-pg/postgresql:13.21
  instances: 3
  storage:
    size: 1Gi
  postgresql:
    synchronous:
      method: any
      number: 1
kubectl get cluster
NAME AGE INSTANCES READY STATUS PRIMARY
cluster-example 15m 3 3 Cluster in healthy state cluster-example-1
kubectl get cluster cluster-example -o jsonpath='{.spec.imageName}'
ghcr.io/cloudnative-pg/postgresql:13.21
Let’s create a table and insert some data into it:
kubectl cnpg psql cluster-example -- app
psql (13.21 (Debian 13.21-1.pgdg110+1))
Type "help" for help.
app=# create table numbers(x int);
CREATE TABLE
app=# insert into numbers (select generate_series(1,10000));
INSERT 0 10000
app=# select count(*) from numbers ;
count
-------
10000
(1 row)
We will now change the imageName to PostgreSQL 17.5 in the cluster, while watching the resources from another terminal:
kubectl patch cluster cluster-example \
--type='json' \
-p='[{"op":"replace", "path":"/spec/imageName", "value":"ghcr.io/cloudnative-pg/postgresql:17.5"}]'
kubectl get pods -w
NAME READY STATUS RESTARTS AGE
[...]
cluster-example-1 1/1 Terminating 0 34m
cluster-example-2 1/1 Terminating 0 33m
cluster-example-3 1/1 Terminating 0 33m
cluster-example-1-major-upgrade-vg9hc 0/1 Pending 0 0s
cluster-example-1-major-upgrade-vg9hc 0/1 Init:1/2 0 1s
cluster-example-1-major-upgrade-vg9hc 0/1 PodInitializing 0 2s
cluster-example-1-major-upgrade-vg9hc 1/1 Running 0 99s
cluster-example-1-major-upgrade-vg9hc 0/1 Completed 0 2m6s
cluster-example-1 0/1 Pending 0 0s
cluster-example-1 0/1 Init:0/1 0 0s
cluster-example-1 0/1 PodInitializing 0 1s
cluster-example-1 0/1 Running 0 2s
cluster-example-1 1/1 Running 0 11s
[...]
Only the primary has been upgraded in place, while new replicas have been created from it (note the increased numbers in their names):
kubectl get pods
NAME READY STATUS RESTARTS AGE
cluster-example-1 1/1 Running 0 5m25s
cluster-example-4 1/1 Running 0 4m56s
cluster-example-5 1/1 Running 0 4m24s
kubectl cnpg status cluster-example
Cluster Summary
Name default/cluster-example
System ID: 7512075037961891879
PostgreSQL Image: ghcr.io/cloudnative-pg/postgresql:17.5 # New image
Primary instance: cluster-example-1
Primary start time: 2025-06-04 12:12:02 +0000 UTC (uptime 23h50m54s)
Status: Cluster in healthy state
Instances: 3
Ready instances: 3
[...]
Let’s check the data as well:
kubectl cnpg psql cluster-example -- app
psql (17.5 (Debian 17.5-1.pgdg110+1))
Type "help" for help.
app=# select count(*) from numbers ;
count
-------
10000
(1 row)
Now let’s recreate the statistics information in the database:
kubectl cnpg psql cluster-example -- app
psql (17.5 (Debian 17.5-1.pgdg110+1))
Type "help" for help.
app=# VACUUM ANALYZE;
VACUUM
Caveats
- Replicas are destroyed and re-created after the upgrade completes successfully (re-creating them can take time for very large databases)
- There is downtime (of course), although smaller than with other offline methods
- Manual intervention is required if the procedure fails
Important notes
PostgreSQL Statistics
Remember to execute a VACUUM ANALYZE command once the upgrade is complete, in order to recreate the database statistics that PostgreSQL's planner needs to choose the most efficient way to run a query. At the time of writing, CNPG does not run these commands autonomously; this feature could be available in the future.
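If you prefer to rebuild the statistics gradually, the vacuumdb tool suggested by pg_upgrade can also be run inside the primary pod. A possible invocation, assuming the default container name and the pod layout from this post, would be:

# Analyze all databases in three incremental passes, as recommended by pg_upgrade.
# The pod and container names are the ones from this example and may differ in your cluster.
kubectl exec -it cluster-example-1 -c postgres -- \
  vacuumdb --all --analyze-in-stages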
PostgreSQL Base Image
In order to upgrade to a PostgreSQL image based on a different distribution version, you first need to move the current PostgreSQL version to the new base distribution image. For example: if you run a PostgreSQL 17 image based on Debian Bullseye and want to upgrade to the PostgreSQL 18 Alpha image, which is currently based on Debian Bookworm, you first need to replace the PostgreSQL 17 image with one based on Debian Bookworm. This is due to dependencies missing from the Bookworm-based image that are required for the transition from PostgreSQL versions older than 18.
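In practice this means two separate image changes instead of one. The tags below are purely illustrative (check the published CloudNativePG image tags for the exact Bookworm-based names):

# Step 1: stay on PostgreSQL 17, but switch to a Bookworm-based image (hypothetical tag).
kubectl patch cluster cluster-example --type='json' \
  -p='[{"op":"replace", "path":"/spec/imageName", "value":"ghcr.io/cloudnative-pg/postgresql:17.5-bookworm"}]'

# Step 2: once the cluster is healthy again, trigger the major upgrade to PostgreSQL 18 (hypothetical tag).
kubectl patch cluster cluster-example --type='json' \
  -p='[{"op":"replace", "path":"/spec/imageName", "value":"ghcr.io/cloudnative-pg/postgresql:18beta1-bookworm"}]'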
Conclusions
CloudNativePG is already a powerful Kubernetes Operator that takes care of the whole lifecycle of PostgreSQL clusters. With the introduction of this new feature, it becomes even more powerful by performing PostgreSQL major upgrades automatically. This will further help users and DBAs transition to the latest major version of PostgreSQL more smoothly, saving time, space, and sleep at night!
Based on the practical test above, we saw that the CNPG Operator takes care of every step required to perform the upgrade. Whether your cluster runs a single-instance or a multi-instance configuration, with or without synchronous replication, the Operator knows exactly how to proceed and upgrades the cluster seamlessly.
Automation is key to empowering your development, testing, and production environments, and CNPG helps you integrate it into your continuous delivery pipelines.
CloudNativePG is a project in the Cloud Native Computing Foundation (CNCF) Sandbox. It is entirely open source and vendor neutral. EDB is the creator of CloudNativePG and serves as the primary contributing organization, with six maintainers. If you're interested in how EDB can assist you in accelerating your migration to Kubernetes, please contact us.