Trusted Postgres Architect 23.44.0 release notes

Released: 10 June 2026

New features, enhancements, bug fixes, and other changes in Trusted Postgres Architect 23.44.0 include the following:

Highlights

  • Support for upgrading components in a PGD-X cluster
  • Improved robustness of PGD upgrades
  • The EFM user is no longer a superuser by default

Enhancements

Description | Addresses
TPA now supports upgrade of components in the PGD-X architecture.

The PGD-X upgrade playbook now includes steps to upgrade selected components (PgBouncer, pg-backup-api, Barman, PEM server) with tpaexec upgrade (...) --components=<component>, so individual components can be upgraded without upgrading the entire cluster. Component upgrades are available on their own, as part of minor PGD-X upgrades, and as part of major upgrades to the PGD-X architecture.

TPA can now deploy the EFM database user with only the minimum required permissions.

Previously, on a regular EFM deployment, TPA always granted the EFM user SUPERUSER privileges. A new variable, efm_user_is_superuser, controls whether TPA deploys the EFM user as a Postgres SUPERUSER or as a regular login user that is a member of efm_role, a role created by TPA with only the specific catalog functions and predefined roles (pg_monitor, pg_read_all_settings, pg_read_all_stats) that EFM requires. tpaexec configure sets the variable to false by default when creating new clusters; if the variable is removed from the configuration file, efm_user_is_superuser is treated as true.

The variable acts as a switch: changing its value and running a regular deploy switches the privileges of the EFM user, including on already deployed clusters.
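
The default described above can be sketched as follows. This is a minimal illustration, assuming the cluster variables are available as a plain Python dict; the function name and the returned labels are hypothetical and are not TPA's actual code.

```python
def efm_user_is_superuser(cluster_vars):
    """Return the effective setting for the EFM user.

    A missing variable is treated as true, as the notes describe for
    configurations that do not carry the variable.
    """
    return bool(cluster_vars.get("efm_user_is_superuser", True))


def efm_user_privileges(cluster_vars):
    """Describe how the EFM user would be deployed (illustrative labels)."""
    if efm_user_is_superuser(cluster_vars):
        return "SUPERUSER"
    return "member of efm_role"
```

In this sketch, a cluster created by a recent tpaexec configure (which writes the variable as false) gets the restricted user, while a configuration file without the variable keeps the superuser behaviour.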

TPA now supports Ansible Automation Platform 2.6.

TPA is now tested with AAP 2.6. Existing execution environments are compatible with both AAP 2.4 and AAP 2.6.

Added support for proxy monitoring during PGD 5 to PGD 6 upgrades.

The enable_proxy_monitoring=yes option to tpaexec upgrade can now be used during upgrades from PGD-Always-ON (PGD 5) to PGD-X (PGD 6). Previously it would cause the upgrade to fail immediately because the monitor attempted to connect to the Connection Manager port on witness nodes, which do not run Connection Manager. Witness nodes are now excluded from the list of endpoints the monitor targets.

TPA now supports non-default hugepage sizes.

TPA's hugepages settings previously assumed the architecture's default page size (2MB on x86_64), so a configuration that wanted Postgres to use 1GB hugepages required the user to set the kernel command line, the vm.nr_hugepages sysctl, and the huge_page_size GUC by hand, and the values TPA generated worked against them. TPA now supports a huge_page_size variable (a Postgres-style memory string such as 1GB). When set, TPA reserves pages of that size on the kernel command line with hugepagesz=, omits vm.nr_hugepages from /etc/sysctl.conf (which only ever applies to the architecture's default-size pool), and sets huge_page_size in postgresql.conf so Postgres draws from the chosen pool. The page count can be overridden with a new nr_hugepages variable; for backwards compatibility, an existing sysctl_values['vm.nr_hugepages'] is still honoured when nr_hugepages is not set explicitly. Because every reservation is taken out of normal memory at boot, huge_page_size is usually best set on individual Postgres instances rather than cluster-wide.
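
The precedence rules above can be sketched roughly as follows. This is an illustration only: the function names, the shape of the returned dict, the computed_pages argument, and the exact rendering of the kernel parameters (hugepagesz= paired with the kernel's hugepages= parameter, size given in kB) are assumptions, not TPA's implementation.

```python
import re

# Multipliers to kB for the units this sketch accepts (an assumed subset).
_UNIT_KB = {"kB": 1, "MB": 1024, "GB": 1024 * 1024}


def size_to_kb(size):
    """Convert a Postgres-style memory string such as '1GB' to kB."""
    match = re.fullmatch(r"(\d+)\s*(kB|MB|GB)", size.strip())
    if match is None:
        raise ValueError(f"unsupported memory string: {size!r}")
    return int(match.group(1)) * _UNIT_KB[match.group(2)]


def plan_hugepages(instance_vars, computed_pages):
    """Decide where the hugepage reservation is expressed.

    computed_pages stands in for whatever page count would otherwise be
    calculated (hypothetical input).
    """
    sysctl_values = instance_vars.get("sysctl_values", {})
    # Precedence: nr_hugepages, then the legacy sysctl value, then the computed count.
    pages = int(instance_vars.get(
        "nr_hugepages", sysctl_values.get("vm.nr_hugepages", computed_pages)))
    size = instance_vars.get("huge_page_size")
    if size is None:
        # Default-size pool: reserved through the sysctl, as before.
        return {"kernel_cmdline": [], "sysctl": {"vm.nr_hugepages": pages},
                "postgresql_conf": {}}
    # Non-default pool: kernel command line only, no vm.nr_hugepages sysctl.
    return {
        "kernel_cmdline": [f"hugepagesz={size_to_kb(size)}K", f"hugepages={pages}"],
        "sysctl": {},
        "postgresql_conf": {"huge_page_size": size},
    }
```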

Added an environment variable option to the repmgr service unit file.

A new variable, repmgr_service_environment, has been added to the repmgr service unit template. It lets users specify custom environment variables to set for the repmgr service.

The repmgr_service_environment variable can be defined in the cluster configuration file, as a dictionary where the keys are the names of the environment variables and the values are the corresponding values for those variables.

This is useful in environments where the repmgr process requires access to specific runtime values, such as custom library paths or authentication credentials.

Addresses: 56200
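
As a rough illustration of the dictionary form described for repmgr_service_environment, the sketch below turns such a dictionary into systemd Environment= lines. The function name and the exact rendering are assumptions; the unit template that TPA ships may format these lines differently.

```python
def render_service_environment(env):
    """Render a dict of environment variables as systemd Environment= lines."""
    lines = []
    for name, value in env.items():
        # Escape backslashes and double quotes so the quoted value stays intact.
        escaped = str(value).replace("\\", "\\\\").replace('"', '\\"')
        lines.append(f'Environment="{name}={escaped}"')
    return lines
```

The variable name LD_LIBRARY_PATH used in the test is only an example of a custom library path setting.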
Added supported cluster_vars overrides for TPA's choice of Python interpreter and packages.

Two new (or newly-effective) cluster_vars now control how TPA uses Python on cluster nodes:

  1. script_python_interpreter: templates the shebang line of TPA-installed Python scripts (for example /etc/tpa/postgres-monitor). Defaults to /usr/bin/env {{ python }}, preserving the previous PATH-based behaviour. Users can set it in cluster_vars to an absolute interpreter path to pin scripts deterministically, which is useful on hosts that have more than one Python installed and where the default env lookup would find the wrong one.

  2. python_pkg_prefix: controls the name prefix TPA uses when constructing OS package names (for example python3-psycopg2 versus python311-psycopg2). It was already an internal fact, but setting it in cluster_vars previously had no effect because minimal_setup's output overwrote it. It is now passed through to minimal_setup as a module parameter, so a user-supplied value is honoured.

See docs/src/python.md for details, including the recommendation to couple the two variables on any host that has more than one Python interpreter installed.
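
The effect of the two variables can be sketched as below. This is a simplified model, assuming cluster_vars is a plain dict; the function names, the default values passed as arguments, and the example interpreter path are illustrative and not taken from TPA.

```python
def script_shebang(cluster_vars, python="python3"):
    """Shebang line for an installed script; defaults to an env lookup."""
    interpreter = cluster_vars.get(
        "script_python_interpreter", f"/usr/bin/env {python}")
    return f"#!{interpreter}"


def os_package_name(cluster_vars, module, detected_prefix="python3"):
    """OS package name for a Python module, honouring a user-supplied prefix."""
    prefix = cluster_vars.get("python_pkg_prefix", detected_prefix)
    return f"{prefix}-{module}"
```

On a host with several interpreters, setting both variables together keeps the interpreter named in the shebang and the packages installed for it in step, which is the coupling the documentation recommends.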

TPA now supports enable_proxy_monitoring during PGD 6 minor upgrades.

The enable_proxy_monitoring=yes option to tpaexec upgrade now works for PGD 6 to PGD 6 minor upgrades, in addition to the major upgrade paths where it was already supported. When enabled, TPA records Connection Manager downtime on each data node during the rolling fence/restart cycle.

Improved restart and service excluded_tasks coverage.

TPA now respects the restart and service values of excluded_tasks across all roles that perform service restarts or systemd service management, including Barman, beacon-agent, EFM, etcd, harp, Patroni, PEM (agent and server), pgbackupapi, PgBouncer, PGD Proxy, Postgres, repmgr (restart, switchover, replica final), rsyslog, and OpenVPN.

Previously, users excluding restart or service tasks could still see unintended restarts in these areas during a deploy. With this change, all restart and direct service operations are guarded consistently, so excluded_tasks reliably suppresses them.

tpaexec reconfigure now accepts --bdr-package-version.

tpaexec reconfigure now accepts --bdr-package-version on BDR-Always-ON, PGD-Always-ON, PGD-X, and PGD-S clusters, recording the value in config.yml and applying any version-gated configuration options the chosen version requires. Currently the only such option is read_listen_port in default_pgd_proxy_options for PGD 5.5+, matching the behaviour of tpaexec configure for PGD 5.5 clusters. The new logic also runs as a require() of the BDR-Always-ON → PGD-Always-ON and PGD-Always-ON → PGD-X architecture changes, so those upgrade paths pick up the version-gated options when no --bdr-package-version is supplied.

TPA now supports enable_proxy_monitoring for PGD 5 minor upgrades.

The enable_proxy_monitoring option now works during PGD 5 minor version upgrades in PGD-Always-ON clusters. Previously, proxy monitoring was only supported during BDR 4 to PGD 5 major upgrades. When enabled, the proxy monitor tracks connection availability through pgd-proxy endpoints throughout the upgrade process and reports any interruptions.

Changes

Description | Addresses
Replaced upgrade_legacy.yml with a dedicated upgrade_major_4to5.yml playbook.

The BDR 4→5 major upgrade logic has been extracted from the monolithic upgrade_legacy.yml into a dedicated playbook. This simplifies the upgrade path by removing all conditional branching, hardcoding harp as the failover manager and upgrade_from as version 4. The now-orphaned upgrade_legacy.yml files have been removed from PGD-Always-ON and Lightweight architectures.

Added a dedicated upgrade_minor_5.yml playbook for BDR 5.x minor upgrades.

This new playbook handles all BDR 5.x minor upgrades. It reuses the relevant logic previously embedded in the monolithic upgrade_legacy.yml, making the upgrade process easier to maintain.

TPA no longer passes empty --team and --owner to pemworker.

When registering a Postgres server with PEM, TPA previously passed --team "" and --owner "" to pemworker --register-server if the monitoring_team or monitoring_server_owner variables were not set. Earlier PEM versions treated these empty strings as equivalent to omitting the option, but PEM 10.5 tightens CLI validation and rejects empty strings, causing the registration task to fail. TPA now omits these flags entirely when the corresponding variables are unset, so server registration succeeds against PEM 10.5 without requiring any configuration change.
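
The change amounts to building the argument list conditionally. The sketch below shows only the two optional flags discussed here; the function name is hypothetical, and a real registration call takes further options that are left out.

```python
def register_server_args(monitoring_team=None, monitoring_server_owner=None):
    """Build the pemworker arguments, omitting flags whose value is unset or empty."""
    args = ["pemworker", "--register-server"]
    if monitoring_team:
        args += ["--team", monitoring_team]
    if monitoring_server_owner:
        args += ["--owner", monitoring_server_owner]
    return args
```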

Documented how to run custom playbooks on AAP.

The AAP user documentation now explains how to add custom playbooks to a cluster and run them through AAP, using the commands/ subdirectory convention and AAP's Duplicate template action.

Added missing flags to the tpaexec help output.

'tpaexec help' was missing some flags that are actually supported. This change adds those flags to the help output.

TPA now rejects pgaudit on EPAS clusters at the start of deploy.

Including pgaudit in extra_postgres_extensions on an EPAS cluster used to result in a deployment that failed late at Postgres startup, because pgaudit is not loadable on EPAS. TPA now detects this combination at the start of deploy and fails with a message recommending EPAS's built-in audit features.

Added {stop,start,list}-containers to the tpaexec --help output.

The already implemented options for managing Docker containers in tpaexec were featured in the documentation but not in the output of tpaexec --help. This change ensures that the tpaexec --help command contains a brief summary of what these commands do.

Bug Fixes

Description | Addresses
Fixed an issue whereby tpaexec reconfigure would not add postgres_distributed during BDR 4 to BDR 5 upgrades.

The tpaexec reconfigure command, when upgrading from BDR 4 to BDR 5, didn't add the postgres_distributed repository, which would result in a failure when trying to download the new BDR 5 packages. The fix ensures that the postgres_distributed repository is included in the configuration.

Fixed an issue whereby standby nodes could be promoted to BDR primary candidates.

An issue was found where standby nodes could be selected as BDR primary candidates during deployment, which could lead to unintended consequences in some cluster scenarios (for example, when joining BDR node groups via standby nodes). This fix ensures that standby nodes are excluded from the list of potential BDR primary candidates.

Fixed an issue whereby SLES 15 deployments failed when featuring patroni with EDBPGE and EPAS.

When selecting patroni as the failover manager with the EDBPGE/EPAS PostgreSQL flavour, the modules needed to install etcd-related packages were missing, and some additional adjustments were required to make the deployment work correctly. This fix ensures that the required packages are installed when deploying on SLES 15 with patroni as the failover manager, alongside some minor adjustments.

Fixed an issue where requesting PEM agent <9.6 would cause deployment to fail.

TPA uses the pemworker --enable-probe option to enable EFM probes where required. However, this option does not exist before PEM 9.6, so trying to call it causes an error. This fix adds a version check to the task so that it is skipped when the agent version is lower than 9.6.
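
A version gate of this kind needs a numeric comparison, because comparing version strings directly would rank "10.5" below "9.6". The sketch below is an illustration with hypothetical function names; it is not the check TPA performs.

```python
import re


def major_minor(version):
    """Extract (major, minor) as integers from a version string such as '9.6.0'."""
    match = re.match(r"(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError(f"cannot parse version: {version!r}")
    return int(match.group(1)), int(match.group(2))


def supports_enable_probe(agent_version):
    """True when the agent is new enough for the --enable-probe option."""
    return major_minor(agent_version) >= (9, 6)
```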

Fixed an issue whereby pgd-cli tasks would run when pgd-cli was not upgraded.

During a postgres-only upgrade (the default), the tasks that log pgd-cli diagnostic output and wait for write leader elections were always executed on the first BDR primary, even when pgd-cli had not been upgraded as part of that run. This caused the upgrade to fail with "unknown command show-groups" because the old pgd-cli binary does not support that subcommand. Both tasks are now skipped unless pgdcli or all components are included in the upgrade.

Updated default AWS AMIs to currently-available images.

The default AWS AMIs that tpaexec configure selects for officially supported platforms have been updated to currently-available images. Previously several AMIs (notably Ubuntu 22.04, Ubuntu 24.04, RHEL 8.10 and SLES 15 SP7) had been deregistered by their vendors, causing tpaexec provision to fail for new clusters on those platforms. RHEL 9 and Rocky 9 also move from minor version 9.5 to 9.7, since 9.5 is no longer the current minor and updated 9.5 AMIs are not being published.

TPA now derives bdr_version_num automatically from installed BDR/PGD packages.

TPA uses bdr_version_num to make precise decisions about CAMO configuration. Previously this value was obtained only by querying the running database, so on an initial deploy (when Postgres is not yet running) parts of the CAMO configuration could not be rendered correctly, and users had to set bdr_version_num manually in config.yml as a workaround. TPA now derives bdr_version_num from the version of the installed BDR or PGD package on each node, so no manual setting is required. A value set in config.yml continues to take precedence, and once the database is running the precise value reported by bdr.bdr_version_num() remains authoritative.

Stabilised the post-upgrade health check for PGD 6 minor upgrades.

After the last rolling restart of a PGD 6 minor upgrade, BDR replication slots can briefly remain inactive while they re-handshake with the just-updated node. The post-upgrade pgd cluster show --health check could fire during that window and report a spurious "Replication Slots Critical" failure. TPA now polls bdr.group_replslots_details for up to two minutes waiting for all slots to become active before running the strict health check. If the window is exhausted, the health check runs anyway so genuine problems are still reported.
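
The wait-then-check pattern described above can be sketched as a bounded polling loop. The function name, the five-second interval, and the all_slots_active callback (standing in for a query against bdr.group_replslots_details) are assumptions made for illustration.

```python
import time


def wait_for_active_slots(all_slots_active, timeout=120.0, interval=5.0,
                          clock=time.monotonic, sleep=time.sleep):
    """Poll until every slot is active or the timeout elapses.

    Returns True if the slots became active in time, False otherwise.
    The caller runs the strict health check in both cases, so a False
    result only means the grace period has been used up.
    """
    deadline = clock() + timeout
    while True:
        if all_slots_active():
            return True
        if clock() >= deadline:
            return False
        sleep(interval)
```

The clock and sleep parameters exist so the loop can be exercised without real waiting.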

Fixed an issue with custom rc-local service creation.

TPA creates a custom systemd service file to ensure the rc-local script is running on startup on distributions that don't support it out of the box. This fix changes the location of the service file to comply with conventions and avoid failures with a missing parent folder that could happen with the previously chosen path. This fix also ensures that this service file creation task is only applied when it is actually needed.

TPA now normalises Postgres Extended to pgextended and edbpge based on architecture.

A fix has been introduced to normalise Postgres Extended flavours based on architecture. When choosing any Postgres Extended variant via tpaexec configure (--pgextended, --edbpge, --edb-postgres-extended, or --postgres-flavour pgextended/edbpge), BDR-Always-ON or older architectures now normalise to pgextended, whereas newer architectures normalise to edbpge. Trying to deploy a pgextended flavour cluster on an incompatible architecture will result in an error, preventing misconfiguration.

Fixed an issue whereby read_listen_port was added to config for BDR versions below 5.5.

Previously, tpaexec configure added read_listen_port to default_pgd_proxy_options even for BDR versions below 5.5; it now adds the option only for version 5.5 and later. A deploy-time assertion was also added to catch a missing read_listen_port when upgrading to PGD 5.5 or later.

TPA now rejects invalid cluster names at configure time.

Previously, tpaexec configure accepted cluster names that contained characters such as periods (for example, when the cluster directory name embedded a version string like v1.1.0-rc.1). The configuration was written successfully, but a later tpaexec provision then failed with a fatal assertion because the cluster name did not match the required pattern ^[_a-zA-Z0-9-]+$. tpaexec configure now applies the same check up front and rejects invalid names before any cluster directory is created, so the problem is reported immediately rather than several steps later.
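
For reference, the pattern quoted above can be tried directly. The sketch below applies it with Python's re module; the function name is illustrative and this is not the code tpaexec configure runs.

```python
import re

# The pattern quoted in the release note.
CLUSTER_NAME_PATTERN = re.compile(r"^[_a-zA-Z0-9-]+$")


def is_valid_cluster_name(name):
    """True when the name contains only letters, digits, underscores and hyphens."""
    # fullmatch also rejects a trailing newline, which a bare '$' would allow.
    return CLUSTER_NAME_PATTERN.fullmatch(name) is not None
```

A name such as v1.1.0-rc.1 fails because of the periods, while v1-1-0-rc-1 passes.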

Upgrade playbooks now use systemctl instead of pgrep.

Major and minor PGD upgrade playbooks used pgrep to decide whether harp-proxy or harp-manager was still running on a node. On hosts where procps was not installed, two of those checks silently misread "binary missing" as "process not running": the BDR 4→5 upgrade then skipped harpctl unmanage cluster before stopping harp-manager, and the PGD 5 minor upgrade ran the wrong proxy health check.

The checks now use systemctl is-active --quiet, which works on every distro TPA supports and reflects the authoritative service state.

TPA now omits pgaudit from CIS compliance on EPAS clusters.

Previously, tpaexec configure --compliance cis unconditionally added pgaudit to extra_postgres_extensions. This caused startup failures on EPAS, which has built-in audit logging, so pgaudit is now omitted on EPAS clusters.

TPA now fails deploy when pgbouncer and pgd-proxy share a listen port.

When an instance had both pgbouncer and pgd-proxy in its role list and both services were configured to listen on the same port (the default, 6432, on each), only one could bind. PgBouncer typically won the race and pgd-proxy was left in a failed state, but TPA's deploy returned success. The misconfiguration only became visible during the first upgrade, as a FATAL: SSL required error from the TLS-protection test connecting to PgBouncer instead of pgd-proxy.

TPA now rejects this configuration at deploy time with a clear message naming the affected host and the two variables involved (pgbouncer_port and pgd_proxy_options.listen_port). Users who co-host pgbouncer and pgd-proxy must set one of them to a non-default value in config.yml so the two listeners don't collide.
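
The kind of check involved can be sketched as follows. The function name, its arguments, and the wording of the message are hypothetical; only the role names, the variable names, and the shared default port of 6432 come from the note above.

```python
DEFAULT_PORT = 6432  # default listen port of both services, per the note above


def listen_port_conflict(hostname, roles,
                         pgbouncer_port=DEFAULT_PORT,
                         pgd_proxy_listen_port=DEFAULT_PORT):
    """Return an error message if both services would bind the same port, else None."""
    co_hosted = {"pgbouncer", "pgd-proxy"} <= set(roles)
    if co_hosted and pgbouncer_port == pgd_proxy_listen_port:
        return (f"{hostname}: pgbouncer_port and pgd_proxy_options.listen_port "
                f"are both {pgbouncer_port}; set one to a different value")
    return None
```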

TPA now installs pgd-cli on pgd-proxy nodes during the 4to5 upgrade.

The BDR-Always-ON to PGD-Always-ON upgrade adds the pgd-proxy role to former harp-proxy nodes, but pgd-cli was never installed on those nodes during the upgrade. The post-upgrade diagnostic then failed with "No such file or directory".

The pgdcli-upgrade plays now also target pgd-proxy hosts, so pgd-cli is in place when the diagnostic runs.

Fixed an issue whereby automatic witness node addition did not work for PGD-X clusters with even data nodes.

When configuring a PGD-X cluster with an even number of --data-nodes-per-location, tpaexec configure now correctly adds a witness node to each location. This ensures Raft consensus can be established without requiring the user to explicitly pass --add-witness-node-per-location. The behaviour is now consistent with the PGD-Always-ON architecture and the documented behaviour for PGD-X. tpaexec configure now also rejects --data-nodes-per-location values below 2, which previously produced an invalid cluster with no data nodes.

Addresses: 59694
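
The witness rule for PGD-X described above can be written as a small decision function. This is a sketch with hypothetical names, not the logic in tpaexec configure; it assumes at most one witness per location.

```python
def witness_nodes_per_location(data_nodes_per_location, add_witness=False):
    """Number of witness nodes to add to each location."""
    if data_nodes_per_location < 2:
        raise ValueError("--data-nodes-per-location must be at least 2")
    # An even number of data nodes needs a witness to keep an odd number
    # of Raft voters; an explicit request adds one regardless.
    if add_witness or data_nodes_per_location % 2 == 0:
        return 1
    return 0
```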
TPA now includes the correct mod_wsgi module for PEM server version 10.4.0 and higher.

This change ensures that the correct mod_wsgi module is included for PEM server versions 10.4.0 and higher, while preserving the existing behaviour for earlier versions.

TPA now includes additional modules to deploy EPAS in SLES 15.

When deploying a SLES 15 cluster, PackageHub registration is now performed as part of the initial system registration for both EPAS and PEM installations. Additionally, two new modules (sle-module-desktop-applications and sle-module-development-tools) are now enabled for EPAS deployments in order to install dependencies such as libclang13.

TPA now auto-creates the switch2cm.yml link on older cluster directories.

tpaexec switch2cm failed with "the playbook: commands/switch2cm.yml could not be found" on clusters that were originally configured with a TPA release predating the switch2cm command, because the cluster directory had no link to architectures/PGD-Always-ON/commands/switch2cm.yml.

tpaexec switch2cm now calls tpaexec relink automatically when the link is missing, matching the behaviour tpaexec upgrade already had for its own command link.

Fixed an issue whereby HARP manager symlink creation failed on nodes 2+ during upgrade.

During a rolling upgrade to edbpge, the task that creates a symlink at /var/run/postgresql/.s.PGSQL.<port> for HARP manager to connect to the database would fail on all nodes after the first with "refusing to convert from file to symlink". The destination path could hold a real socket file left behind by the pre-upgrade postgres (either from a flavour migration or from an unclean stop). Adding force: true to the symlink task in all affected upgrade playbooks ensures the symlink is created correctly regardless of the prior state of that path.

Introduced retry logic for BDR replication slot checks when upgrading.

When upgrading the postgresql package version in a BDR cluster, replication slots may take a few seconds to reconnect, and the upgrade could exit with an error because it did not allow enough time for them to recover. The fix adds retry logic to the upgrade health check so the slots have time to recover, avoiding timing-related failures.

TPA now reports a clear error when --location-names doesn't match the architecture's requirement.

Previously, supplying --location-names with the wrong number of names (for example, a single location for a BDR-Always-ON cluster using the bronze layout, which requires two) caused tpaexec configure to abort with an unhandled Python traceback. tpaexec configure now compares the supplied location names against the number of locations the chosen architecture and layout actually need, and reports a clear error if they differ. The change applies to the M1, BDR-Always-ON and PGD-Always-ON architectures.

Fixed an issue whereby PGD-X demanded too many hostnames in --hostnames-from.

Configuring a PGD-X cluster with --hostnames-from used to fail with "found only N/16 names matching …" whenever the supplied file held fewer than 16 names, even when the cluster only needed a handful. The underlying PGD-X architecture now computes the real number of instances it will build (data + witness + barman nodes per location, plus an optional witness-only location and a pemserver if requested), so --hostnames-from accepts a correctly-sized file and no longer requires padding it out to 16 entries.
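
The count described above can be approximated as in the sketch below. The function and parameter names are hypothetical, and the sketch assumes that a witness-only location and a PEM server each contribute exactly one instance; the real calculation may differ.

```python
def required_hostnames(locations, data_nodes, witness_nodes=0, barman_nodes=0,
                       witness_only_location=False, pem_server=False):
    """Number of hostnames a cluster of this shape would need."""
    total = locations * (data_nodes + witness_nodes + barman_nodes)
    if witness_only_location:
        total += 1  # assumed: a single witness in the extra location
    if pem_server:
        total += 1
    return total
```

Under these assumptions, two locations with three data nodes and one Barman node each need eight names, well below the 16 that the previous check demanded.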

Fixed an issue whereby CAMO configuration was incorrectly added to config.yml.

When configuring a PGD-X cluster with --data-nodes-per-location 2 without explicitly enabling CAMO via --enable-camo, the generated config.yml would incorrectly include CAMO commit scopes and partner assignments. This resulted in unexpected CAMO configuration being applied to clusters that did not request it. CAMO configuration is now only generated when --enable-camo is explicitly passed to tpaexec configure.

Addresses: 59694
Fixed an issue whereby PGD 5/6 minor and 5-to-6 major upgrades occasionally left nodes fenced.

During a node-by-node upgrade, the "Wait for write leader elections to complete" task could time out because nodes that had been fenced earlier in the loop were never unfenced: the unfence step did not always see the node's updated state and was wrongly skipped, leaving the node fenced. TPA now waits for the node to rejoin Raft consensus and refreshes its state before unfencing, so nodes are reliably unfenced and the upgrade can proceed.

Improved error output for --primary-location in tpaexec configure.

When the value of --primary-location was not among the names given in --location-names, an error in the argument parsing logic caused a built-in Python error to be thrown, which could mislead users because it is not a TPA-related error. This fix properly handles the mismatch between the provided primary location and the list of known locations, avoiding confusing error messages.

Fixed an issue whereby tpaexec configure crashed when --overrides-from was supplied.

Running tpaexec configure with --overrides-from <file> aborted with "An error was encountered during execution of tpaexec configure: name 'reduce' is not defined" and no cluster directory was produced. tpaexec configure --overrides-from now completes successfully and the values from the supplied YAML file are merged into the generated cluster configuration as documented.

tpaexec switch2cm now works when ansible_user is not root.

Three plays in the PGD-Always-ON switch2cm command did not declare become at the play level, so the command only worked when ansible_user was root. With a non-root ansible_user it failed with "Permission denied" on /pgdata/data/conf.d.

Those plays now explicitly become root, matching the pattern used elsewhere in PGD-Always-ON. Per-task become overrides for SQL queries that run as the postgres user are unaffected.

TPA now configures max_active_replication_origins for PostgreSQL 18+ PGD clusters.

TPA now sets the max_active_replication_origins parameter for PGD clusters on PostgreSQL 18 and above, using "3 * number_of_nodes + 3" — a safety-margin formula above the "3 origins per peer node" minimum recommended by the EDB PGD documentation. This prevents PGD node join failures caused by the default value being too low for multi-node clusters.
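
The formula quoted above is simple enough to evaluate by hand; the sketch below just makes the arithmetic explicit. The function name is illustrative.

```python
def max_active_replication_origins(number_of_nodes):
    """Value from the formula 3 * number_of_nodes + 3 quoted in the note."""
    if number_of_nodes < 1:
        raise ValueError("a cluster needs at least one node")
    return 3 * number_of_nodes + 3
```

For example, a three-node cluster gets a value of 12 and a five-node cluster gets 18.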

TPA now sets path_prefix per backup server in the barman configuration.

Barman's path_prefix was previously set only as a global value in barman.conf, using the postgres binary directory of the barman host itself. This caused incorrect behaviour in mixed-version scenarios (during rolling upgrades, or when the barman host runs a different PostgreSQL major version than the nodes it backs up).

path_prefix is now also written per backup server in /etc/barman.d/<backup>.conf, resolving from the backed-up node's own postgres_bin_dir. This ensures barman uses the correct client binaries for each node it backs up. The global path_prefix in barman.conf is unchanged and continues to serve as the default for same-version scenarios.

When the backed-up node's PostgreSQL version differs from the barman host's, the matching client packages are now installed on the barman host automatically so the per-server path_prefix resolves to real binaries. The barman host must have repository access for the additional PostgreSQL version (typically the case with EDB enterprise repositories).

An explicit per-instance override is also available by setting barman_path_prefix in a node's vars in config.yml.
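
The resolution order for the per-server value can be sketched as follows, assuming the backed-up node's variables are available as a dict. The function name and the example paths in the test are illustrative only.

```python
def server_path_prefix(backed_up_node_vars):
    """path_prefix to write for one backup server.

    An explicit barman_path_prefix wins; otherwise the backed-up node's
    own postgres_bin_dir is used.
    """
    override = backed_up_node_vars.get("barman_path_prefix")
    if override:
        return override
    return backed_up_node_vars["postgres_bin_dir"]
```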

Deprecations

Description | Addresses
Removed the deprecated --cohost-proxies and --add-proxy-nodes-per-location configure flags for PGD-X.

The --cohost-proxies and --add-proxy-nodes-per-location flags in PGD-X have now been removed and will no longer be valid options in tpaexec configure for the PGD-X architecture. In PGD 6 all proxies are cohosted, making both options obsolete.