Linking versus copying v16

When invoking pg_upgrade, you can use a command-line option to specify whether to copy or link each table and index in the old cluster to the new cluster.

Linking is much faster because pg_upgrade creates a second name (a hard link) for each file in the cluster. Linking also requires no extra workspace because pg_upgrade doesn't make a copy of the original data. When linking the old cluster and the new cluster, the old and new clusters share the data. After starting the new cluster, you can no longer use your data with the previous version of EDB Postgres Advanced Server.

If you choose to copy data from the old cluster to the new cluster, pg_upgrade still reduces the time required to perform an upgrade compared to the traditional dump/restore procedure. pg_upgrade uses a file-at-a-time mechanism to copy data files from the old cluster to the new cluster versus the row-by-row mechanism used by dump/restore. When you use pg_upgrade, you avoid building indexes in the new cluster. Each index is instead copied from the old cluster to the new cluster. Finally, using a dump/restore procedure to upgrade requires a lot of workspace to hold the intermediate text-based dump of all of your data, while pg_upgrade requires very little extra workspace.

Data that's stored in user-defined tablespaces isn't copied to the new cluster. It stays in the same location in the file system but is copied into a subdirectory whose name shows the version number of the new cluster. To manually relocate files that are stored in a tablespace after upgrading, move the files to the new location and update the symbolic links to point to the files. The symbolic links are located in the pg_tblspc directory under your cluster's data directory.