Reconciling changes made outside of TPA
Any changes made to a TPA-created cluster that are not performed by
changing the TPA configuration will not be saved in config.yml. This
means that your cluster will have changes that the TPA configuration
won't be able to recreate.
This page shows how configuration is managed with TPA and the preferred ways to make configuration changes. We then look at strategies for making manual changes to the cluster and reconciling the results.
The most common scenario in which you may need to make configuration changes outside of TPA is if the operation you are performing is not supported by TPA. The two most common such operations are destructive changes, such as removing a node, and upgrading the major version of Postgres.
In general TPA will not remove previously deployed elements of a
cluster, even if these are removed from
config.yml. This sometimes
surprises people because a strictly declarative system should always
mutate the deployed artifacts to match the declaration. However, making
destructive changes to a production database can have serious consequences
so it is something we have chosen not to support.
TPA does not yet provide an automated mechanism for performing major version upgrades of Postgres. Therefore, if you need to perform an in-place upgrade on an existing cluster, this must be done using other tools such as pg_upgrade or bdr_pg_upgrade.
A general issue with unreconciled changes is that if you deploy a new
cluster using your existing
config.yml, or provide your config.yml to
EDB Support in order to reproduce a problem, it will not match the
original cluster. In addition, there is potential for operational
problems should you wish to use TPA to manage that cluster in future.
The operational impact of unreconciled changes varies depending on the
nature of the changes, in particular whether the change is destructive,
and whether the change blocks TPA from running by causing an error or
invalidating the data in config.yml.
Additive changes are often accommodated with no immediate operational
issues. Consider manually adding a user. The new user will continue to
exist and cause no issues with TPA at all. You may prefer to manage the
user through TPA, in which case you can declare it in config.yml, but
the existence of a manually-added user will cause no operational issues.
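If you do choose to manage the user through TPA, you can declare it under the postgres_users cluster variable in config.yml. The sketch below assumes a hypothetical user named app_reader and keys (username, granted_roles) as typically accepted by TPA; check the variable reference for your TPA version.

```yaml
cluster_vars:
  postgres_users:
  - username: app_reader     # hypothetical user, for illustration only
    granted_roles:
    - pg_read_all_data       # example predefined role grant
```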
Some manual additions can have more nuanced effects. Take the example of
an extension which has been manually added. Because TPA does not make
destructive changes, the extension will not be removed when
deploy is next run. However, if you made any changes to the
Postgres configuration to accommodate the new extension, these may be
overwritten if you did not make them using one of TPA's supported
mechanisms (see below).
Furthermore, TPA will not make any attempt to modify the config.yml
file to reflect manual changes, and the new extension will be omitted
from tpaexec upgrade, which could lead to incompatible software
versions existing on the cluster.
Destructive changes that are easily detected and do not block TPA's
operation will simply be undone when
tpaexec deploy is next run.
Consider manually removing an extension. From the perspective of TPA,
this situation is indistinguishable from one in which the user has just
added the extension to the config.yml file and not yet run deploy. As
such, TPA will add the extension so that the cluster and the config.yml
are reconciled, albeit in the opposite way to that which the user intended.
Similarly, changes made manually to configuration parameters will be undone unless they are:
- Made in the conf.d/9999-override.conf file reserved for manual edits;
- Made using ALTER SYSTEM SQL; or
- Made natively in TPA by adding postgres_conf_settings to config.yml.
Other than the fact that option 3 is self-documenting and portable, there is no pressing operational reason to reconcile changes made by options 1 or 2.
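As an illustration of option 3, parameters are declared under postgres_conf_settings in config.yml; the parameter names and values below are purely illustrative.

```yaml
cluster_vars:
  postgres_conf_settings:
    max_connections: 300
    maintenance_work_mem: 1GB
```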
Changes which create a more fundamental mismatch between config.yml and
the cluster can block TPA from performing operations. For example, if
you physically remove a node in a bare metal cluster, attempts by TPA to connect to
that node will fail, meaning most TPA operations will exit with an error
and you will be unable to manage the cluster with TPA until you
reconcile this difference.
In general, the reconciliation process involves modifying config.yml
such that it describes the current state of the cluster and then running
tpaexec deploy.
As an example of this process, consider parting a node from a PGD cluster. Deploy a minimal PGD cluster on the bare platform using a configure command such as:
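A minimal sketch of such a command is given below; the cluster name and location name are placeholders, and options such as --pgd-proxy-routing and --data-nodes-per-location should be checked against your TPA version.

```shell
tpaexec configure pgd-demo \
  --architecture PGD-Always-ON \
  --platform bare \
  --postgresql 15 \
  --pgd-proxy-routing local \
  --location-names main \
  --data-nodes-per-location 3
```

The rest of this example assumes the resulting data nodes are named node-1, node-2, and node-3 (for example, by supplying a hostnames file to configure); substitute your own node names as appropriate.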
Part a node using this SQL, which can be executed from any node:
select * from bdr.part_node('node-2');
Now run tpaexec deploy. Note that, whilst no errors occur, the node is still
parted. This can be verified using the command
pgd show-nodes on any
of the nodes. This is because TPA will not overwrite the metadata which
tells PGD the node is parted.
It is not possible to reconcile the
config.yml with this cluster state
because TPA, and indeed PGD itself, has no mechanism to initiate a node
in the 'parted' state. In principle you could continue to use TPA to
manage the cluster in this state, but this is not advisable. In most cases
you will wish to fully remove the node and reconcile config.yml, as described below.
The previous example parted a node from the PGD cluster, but left the node itself intact and still managed by TPA in a viable but unreconcilable state.
To completely decommission the node, it is safe to simply turn off the
server corresponding to
node-2. If you attempt to run
deploy at this
stage, it will fail early when it cannot reach the server.
To reconcile this change in
config.yml, simply delete the entry under
instances corresponding to
node-2. It will look something like this:
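The exact keys vary with platform and configure options (a bare-metal instance also carries connection details such as IP addresses), but the entry will resemble this sketch:

```yaml
- Name: node-2
  location: main
  node: 2
  role:
  - bdr
```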
You can now manage this cluster as usual using TPA. However, the original
cluster still has metadata that refers to
node-2 so to complete
reconciliation it is recommended to run the following SQL on each node
to remove the metadata. This step is essential if you wish to add a
node of the same name in future.
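```sql
select * from bdr.drop_node('node-2');
```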
If you wish to join the original
node-2 back to the cluster after
removing it in this way, you can do so simply by restoring the deleted
entry in config.yml, but you must ensure that
select * from
bdr.drop_node('node-2'); has been run on this node and that the PGDATA
directory has been deleted.
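As a sketch, the cleanup on node-2 itself might look like the following; the service name and data directory are assumptions based on common TPA defaults, so check postgres_data_dir in your config.yml before deleting anything.

```shell
# On node-2, once it has been parted and dropped from the PGD cluster:
sudo systemctl stop postgres      # service name assumed; confirm on your system
sudo rm -rf /opt/postgres/data    # assumed default data directory; verify first
```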
TPA automatically generates a password for the superuser, which you may
view using tpaexec show-password <cluster> <superuser-name>. If you
change the password manually (for example using the \password command
in psql) you will find that after
tpaexec deploy is next run, the
password has reverted to the one set by TPA. To make the change through
TPA, and therefore make it persist across runs of
tpaexec deploy, you
must use the command
tpaexec store-password <cluster> <superuser-name>
to specify the password, then run
tpaexec deploy. This also applies to
any other user created through TPA.
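For example, assuming a cluster directory named ~/clusters/demo and the default postgres superuser, the sequence might look like this sketch:

```shell
tpaexec store-password ~/clusters/demo postgres   # prompts for the new password
tpaexec deploy ~/clusters/demo
```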
A simple single-node cluster can be deployed with a configure command along the following lines.
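This sketch assumes the M1 architecture on the docker platform; depending on your TPA version, additional options may be required and the default M1 layout may include more instances than just the primary.

```shell
tpaexec configure singlenode \
  --architecture M1 \
  --platform docker \
  --postgresql 15
```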
You may manually add the pgvector extension by connecting to the node,
running apt install postgresql-15-pgvector, and then executing the
following SQL command: CREATE EXTENSION vector;. This will not cause
any operational issues, beyond the fact that
config.yml no longer
describes the cluster as fully as it did previously. However, it is
advisable to reconcile
config.yml (or indeed simply use TPA to add the
extension in the first place) by adding the following cluster variables:
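The sketch below uses the extra_postgres_extensions and extra_postgres_packages cluster variables, which are the variables TPA typically uses for this purpose; verify the names and structure against your TPA version.

```yaml
cluster_vars:
  extra_postgres_extensions:
  - vector
  extra_postgres_packages:
    common:
    - postgresql-15-pgvector
```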
After adding this configuration, you may manually remove the extension
by executing the SQL command
DROP EXTENSION vector; and then running
apt remove postgresql-15-pgvector. However, if you run tpaexec deploy
again without reconciling config.yml, the extension will be
reinstalled. To reconcile config.yml, simply remove the lines added
previously.
As noted previously, TPA will not honour destructive changes.
So simply removing the lines from
config.yml will not remove the
extension. It is necessary to perform this operation manually and then
reconcile the change.