From time to time I see questions from otherwise well informed people about how the PostgreSQL Buildfarm tests pg_upgrade across versions, e.g. how it checks upgrading from release 9.5 to release 18. I realize that this isn't well documented anywhere, so here is a description of the process.
All of the code referenced here can be found at https://github.com/PGBuildFarm/client-code.
The principal buildfarm client script is run_build.pl, which builds and tests a single branch (e.g. REL_17_STABLE). There is another script called run_branches.pl, which calls run_build.pl for several branches, but under the hood each invocation of the main script is still for a single branch. At the end of each invocation of run_build.pl, the build artifacts are discarded by default.
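For example, a manual run might look roughly like this (the paths and flags are illustrative; check the client documentation for the exact options your version supports):

    # build and test a single branch with a given config file
    ./run_build.pl --config build-farm.conf REL_17_STABLE

    # or let run_branches.pl loop over every branch the animal is configured
    # for; each iteration still ends up calling run_build.pl for one branch
    ./run_branches.pl --config build-farm.conf --run-all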
However, buildfarm animals can enable an optional module called TestUpgradeXversion (by adding it to the modules list in the animal's build-farm.conf). This module runs after all the installcheck steps have been run in the "C" locale.
The first thing it does is save the installed software from the current run (the "bin", "lib", "include" and "share" directories), along with the data directory from the install checks it has just run. It then starts up the copied instance and repoints all the loadable libraries at the newly copied location, so that there is no longer any dependence on the build or install directories of the current run. Then the instance is stopped. The saved software and data can now be used by this or later runs as the source to be upgraded.
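In shell terms the idea looks roughly like this. This is only a sketch of the gist, not the module's actual code: all the paths here are invented, and the library repointing is really done from Perl, in every database, not just once:

    install_dir=$HOME/bf/root/REL_17_STABLE/inst   # install from the current run
    saved=$HOME/bf/upgrade/REL_17_STABLE           # where this branch gets saved
    mkdir -p "$saved"

    # save the installed software and the just-tested data directory
    for d in bin lib include share; do
        cp -Rp "$install_dir/$d" "$saved/"
    done
    cp -Rp "$install_dir/data-C" "$saved/data"

    # start the copy and repoint loadable C functions at the saved lib
    # directory, so nothing refers back to the original install tree
    "$saved/bin/pg_ctl" -D "$saved/data" -l "$saved/save.log" start
    "$saved/bin/psql" -d postgres -c "
        UPDATE pg_proc
           SET probin = replace(probin, '$install_dir/lib', '$saved/lib')
         WHERE probin LIKE '$install_dir/lib%';"   # repeated for every database
    "$saved/bin/pg_ctl" -D "$saved/data" stop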
Then it looks for all the saved versions it has. For the branch being built that will be what it just saved; for other branches it will be whatever was saved by previous runs of run_build.pl. From this list it selects those that are earlier than or equal to the branch it is currently running; e.g. if it is building release 15, it only considers branches up to and including release 15. Then it tries to upgrade each of the selected data directories to its current branch.
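The selection amounts to a version comparison on the saved branch names, something like the following rough sketch (the real code is Perl inside the module, and the directory layout shown is hypothetical):

    upgrade_root=$HOME/bf/upgrade     # hypothetical location of saved builds
    current=15                        # numeric version of the branch being built

    branch_version() {                # REL9_2_STABLE -> 9.2, REL_15_STABLE -> 15
        case "$1" in
            HEAD)         echo 9999 ;;
            REL_*_STABLE) echo "$1" | sed 's/^REL_//; s/_STABLE$//' ;;
            REL*_STABLE)  echo "$1" | sed 's/^REL//; s/_STABLE$//; s/_/./' ;;
        esac
    }

    for dir in "$upgrade_root"/*/; do
        saved=$(branch_version "$(basename "$dir")")
        # keep only sources that are not newer than the branch being built
        if awk -v a="$saved" -v b="$current" 'BEGIN { exit !(a <= b) }'; then
            echo "will test upgrade from $dir"
        fi
    done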
The upgrade procedure starts by making a copy of the saved data directory and making some minor adjustments to its configuration so the upgrade can run. It then starts the source instance and, if the source and target releases are not the same, makes some required adjustments. These adjustments are specified in a Perl module in the PostgreSQL source code (see https://github.com/postgres/postgres/blob/master/src/test/perl/PostgreSQL/Test/AdjustUpgrade.pm). Originally these adjustments were part of the buildfarm module, but they were moved several years ago so that they are now under the control of the PostgreSQL committers, and if changes become necessary the buildfarm client itself doesn't need to be modified.

Once those adjustments have been made, the module takes a logical copy of the cluster using pg_dumpall from the current version, and then stops the instance. Then it creates a new cluster directory for the current version, runs pg_upgrade from the copy of the old version into the new cluster, and starts the new cluster. On versions prior to release 18 it runs ANALYZE on all the databases; this is no longer necessary on modern versions because the statistics are now imported. If there are hash indexes that need reindexing, that is done too. Then it takes a logical copy of the upgraded cluster, checks the database integrity using pg_amcheck, makes sure it can update all the extensions, and removes the old cluster.

Finally, if the source and target versions are not the same, it uses the procedures from the AdjustUpgrade module mentioned above to adjust both the dump of the old cluster and the dump of the upgraded cluster so that they should be strictly comparable. If the dumps are not identical after these adjustments, it uses diff to show where they differ. If all these steps succeed the module reports success; otherwise it reports a failure.
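Stripped of the error handling, logging and version-specific details, the heart of the procedure corresponds to a command sequence along these lines (a heavily simplified sketch; the real work is done in TestUpgradeXversion.pm with the help of AdjustUpgrade.pm, and the paths here are invented):

    old=$HOME/bf/upgrade/REL9_5_STABLE          # previously saved install + data
    new=$HOME/bf/root/REL_18_STABLE/inst        # install from the current run
    work=$HOME/bf/tmp/xversion
    mkdir -p "$work"
    cp -Rp "$old/data" "$work/olddata"          # always work on a copy

    # dump the old cluster, using pg_dumpall from the *new* version
    "$old/bin/pg_ctl" -D "$work/olddata" -l "$work/old.log" start
    "$new/bin/pg_dumpall" -f "$work/origin.sql"
    "$old/bin/pg_ctl" -D "$work/olddata" stop

    # create a new cluster and upgrade the copy of the old one into it
    "$new/bin/initdb" -D "$work/newdata"
    ( cd "$work" && "$new/bin/pg_upgrade" \
          --old-bindir  "$old/bin"       --new-bindir  "$new/bin" \
          --old-datadir "$work/olddata"  --new-datadir "$work/newdata" )

    # start the upgraded cluster, check it, and dump it again
    "$new/bin/pg_ctl" -D "$work/newdata" -l "$work/new.log" start
    "$new/bin/pg_amcheck" --all                 # integrity check
    "$new/bin/pg_dumpall" -f "$work/converted.sql"
    "$new/bin/pg_ctl" -D "$work/newdata" stop

    # after AdjustUpgrade has massaged both dumps they should be identical;
    # any difference is reported with diff and the test fails
    diff -u "$work/origin.sql" "$work/converted.sql"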
There are some things to note. First, the module does not itself build Postgres or run installchecks. Instead it leverages work already done by this or previous invocations of run_build.pl. That's important, because it allows us to test upgrades from versions on which we are no longer running the full buildfarm client. For example, the buildfarm animal crake runs on all the maintained branches, which currently means release 14 and newer. But it tests upgrades from as far back as release 9.2, the earliest version from which we claim pg_upgrade can be run. That's because it has those versions saved using the procedure outlined above. That can be significant, because older versions are not always easy to build. Some years ago, when I was setting this up on the buildfarm animal drongo, which builds with the Microsoft compiler, it proved just about impossible to build suitable instances earlier than release 9.5, so that's the earliest it tests.
Why don't we enable this by default? Basically, time and space. On a relatively fast machine this can take more than 20 minutes; on a slow one it can take more than an hour. Ideally, most buildfarm animals would run each branch in a few minutes. It also consumes a large amount of disk space. The animal crake at rest uses about 17GB of space to support this module, and another GB or two while processing. That's a lot to ask of buildfarm owners.
I am experimenting with compressing some of that data, and I'm hopeful that we can reduce the disk requirements substantially. But the time issue remains, and is not likely to be reduced unless someone comes up with something radically different. Indeed, compressing and decompressing data will add to the time consumed, although I hope not by much.