Table filters v7
Table filters specify the selection criteria for rows in publication tables or views to include during replications to subscriptions from the publication database in a single-master replication system. These table filters also apply between primary nodes in a multi-master replication system. Rows that don't satisfy the selection criteria are excluded from replications to subscriptions or primary nodes on which these table filters were enabled.
Implementing table filters
Implementing table filters is a two-part process. First, you must define a set of available table filters. You can define this set while creating the publication by defining specific, named rules that apply to selected publication tables or views expressed in the form of SQL WHERE
clauses.
While simple conditions with comparison and logical operators (=, >, <, LIKE, etc.) are allowed as filters for all use cases, in one use case you can use a more complex condition as a filter. You can use a subquery to establish an inclusion/exclusion list of IDs if a single column is returned from the subquery. This technique is supported only for trigger-based SMR use cases. Here are the patterns for this type of query:
For example, for tables dept
and dept_filter_lookup
created with:
You can use the following filters:
or
Use these filters to create a publication with the following RepCLI commands:
or
After you define a set of available table filters, you must enable them only on those subscription tables of either:
- A single-master replication system
- Primary node tables of a multi-master replication system where filtering is to occur during replication to those particular target tables.
No filtering occurs during replication to a target subscription table or primary node table if you didn't enable filters specifically on that table in the subscription or primary node.
We strongly recommend that you perform a snapshot replication to the subscriptions or primary nodes that contain tables on which the filtering criteria changed by adding filter rules, removing filter rules, or modifying existing filter rules.
A snapshot ensures that the content of the subscription tables or primary node tables is consistent with the updated filtering criteria.
Note
When using table filters in a multi-master replication system, the primary definition node, which provides the source of the table content for a snapshot, must contain a superset of all the data contained in the other primary nodes of the multi-master replication system. This configuration ensures that the target of a snapshot receives all of the data that satisfies any filtering criteria enabled on the other primary nodes.
Conversely, if the primary definition node contains only a subset of all the data contained in the other primary nodes, then a snapshot to another primary node might not result in the complete set of data that is required for that target primary node.
Effects of table filtering
A filter enabled on a table affects only the results from snapshot or synchronization replications targeted to that table by Replication Server. Filtering has no effect on changes made directly on the target table by external user applications such as an SQL command line utility.
Filtering has the following effects on a targeted, filtered table.
Note
In the following discussion, a result set refers to the set of rows in a table satisfying the selection criteria of an UPDATE
or DELETE
statement executed on that table.
In a snapshot replication, if the row satisfies the filtering criteria, a row from the source table of the snapshot is inserted into the target table. Otherwise the row is excluded from insertion into the target table.
When an INSERT
statement is executed on a source table followed by a synchronization replication, if the row satisfies the filtering criteria, the row is inserted into the target table of the synchronization. Otherwise the row is excluded from insertion into the target table.
When an UPDATE
or DELETE
statement executes on a source table followed by a synchronization replication, values of the identity columns are used in action on the target table of the synchronization. For non-PostgreSQL/EDB Postgres Advanced Server databases, the primary key columns are the identity columns. For PostgreSQL/EDB Postgres Advanced Server databases, the primary key or unique columns are the identity columns (see Design considerations for details).
When an UPDATE
statement executes on a source table followed by a synchronization replication, the UPDATE
result set of the source table determines the action on the target table of the synchronization as follows:
- If a row in the result set has no corresponding row in the target table with the same identity columns values and the updated row in the result set satisfies the filtering criteria, then the row is inserted into the target table. A row that previously didn't exist in the target table is added because the updated row in the source table now satisfies the filtering criteria.
- If a row in the result set has a corresponding row in the target table with the same identity columns values and the updated row in the result set satisfies the filtering criteria, then the row in the target table is updated accordingly. The update is applied to an existing, matching row in the target table that still satisfies the filtering criteria after the update.
- If a row in the result set has a corresponding row in the target table with the same identity columns values and the updated row in the result set no longer satisfies the filtering criteria, then the corresponding row in the target table is deleted. An existing, matching row in the target table no longer satisfies the filtering criteria after the update, so the row is removed from the target table.
When a DELETE
statement executes on a source table followed by a synchronization replication, the DELETE
result set of the source table determines the action on the target table of the synchronization as follows:
- If a row in the result set has a corresponding row in the target table with the same identity columns values, then the row with that primary key value is deleted from the target table. An existing, matching row in the target table is removed.
- If a row in the result set has no corresponding row in the target table with the same identity columns values, then no action is taken on the target table for that row. Because there's no existing, matching row in the target table, there's no row to remove from the target table.
Thus, regardless of whether the transaction on the source table is an INSERT
, UPDATE
, or DELETE
statement, the goal of a table filter is to ensure that all rows in the target table satisfy the filter rule.
Table settings and restrictions for table filters
For table filters, you can apply specific table settings and restrictions.
REPLICA IDENTITY setting for filtering in a log-based replication system
For replication systems using the log-based method of synchronization replication, a publication table on which you define a filter must have the REPLICA IDENTITY
option set to FULL
.
Note
This REPLICA IDENTITY FULL
setting isn't required for tables in single-master, snapshot-only publications, See Snapshot-only publications.
Use the ALTER TABLE
command for this setting:
ALTER TABLE schema.table_name REPLICA IDENTITY FULL
For additional information see the ALTER TABLE SQL command in the PostgreSQL Core documentation.
For example, for a publication table named edb.dept
, use the following ALTER TABLE
command:
ALTER TABLE edb.dept REPLICA IDENTITY FULL;
You can see the REPLICA IDENTITY
setting by using the \d+
command in the PSQL
utility:
The REPLICA IDENTITY FULL
setting is required on tables in the following databases of a log-based replication system:
- In a single-master replication system, you define table filters in the primary database. Thus, the publication tables in the primary database requiring filter definitions must be altered to a
REPLICA IDENTITY FULL
setting but only if the publication is not a snapshot-only publication. See Snapshot-only publications. - In a multi-master replication system, table filters are defined in the primary definition node. Thus, publication tables in the primary definition node requiring filter definitions must be altered to a
REPLICA IDENTITY FULL
setting. - In a multi-master replication system, don't set non-MDN nodes tables
REPLICA IDENTITY
options toFULL
unless transactions are expected to be targeted on those non-MDN nodes. In addition, the transactions will be filtered when they're replicated to the other primary nodes.
The REPLICA IDENTITY FULL
setting on a source table ensures that certain types of transactions on the source table result in the proper updates to the target tables on which filters are enabled.
Note
In addition to table filtering requirements, you might need the REPLICA IDENTITY FULL
on publication tables for other reasons in Replication Server. See Configuration parameter and table setting requirements for additional requirements.
Filtering restrictions on data types
Table filters aren't supported on binary data type columns. A binary data type is the Postgres data type BYTEA
. In addition, table filters aren't supported on EDB Postgres Advanced Server columns with data types BINARY
, VARBINARY
, BLOB
, LONG RAW
, and RAW
, as these are alias names for the BYTEA
data type.
Filtering restrictions on operators
Replication Server supports modulus operator (denoted by %) to define a filter clause. However, the following restrictions apply:
You can have only a single filter condition using the modulus operator.
You can't use it with any other conditions using
AND
orOR
operators.
Replication Server supports the modulus filter in the following formats:
deptno%3=0
@deptno%3=0
Roadmap for further instructions
The specific details on implementing table filtering depend upon whether you're using a single-master replication system or a multi-master replication system. Use this roadmap to find relevant information for each type of replication system.
For using table filters in a single-master replication system, see:
- Adding a publication for information on defining the initial set of table filters available for selective enablement on subscriptions
- Adding a subscription for information on enabling available table filters on a newly created subscription
- Updating the set of available table filters in a publication for information on adding, removing, or modifying rules comprising the set of available table filters
- Enabling/disabling table filters on a subscription for information on changing the table filters enabled on an existing subscription
For using table filters in a multi-master replication system see:
- Adding a publication for information on defining the initial set of table filters available for selective enablement on primary nodes
- Creating additional primary nodes for information on enabling available table filters on a newly created primary node
- Updating the set of available table filters in a publication for information on adding, removing, or modifying rules comprising the set of available table filters
- Enabling/disabling table filters on a subscription for information on changing table filters enabled on an existing primary node