Table filters v7

Table filters specify the selection criteria for rows in publication tables or views to include during replications to subscriptions from the publication database in a single-master replication system. These table filters also apply between primary nodes in a multi-master replication system. Rows that don't satisfy the selection criteria are excluded from replications to subscriptions or primary nodes on which these table filters were enabled.

Implementing table filters

Implementing table filters is a two-part process. First, you must define a set of available table filters. You can define this set while creating the publication by defining specific, named rules that apply to selected publication tables or views expressed in the form of SQL WHERE clauses.

While simple conditions with comparison and logical operators (=, >, <, LIKE, etc.) are allowed as filters for all use cases, in one use case you can use a more complex condition as a filter. You can use a subquery to establish an inclusion/exclusion list of IDs if a single column is returned from the subquery. This technique is supported only for trigger-based SMR use cases. Here are the patterns for this type of query:

PUBLISHED_TABLE_COLUMN_NAME IN (SELECT COLUMN_NAME FROM TABLE_NAME)

PUBLISHED_TABLE_COLUMN_NAME NOT IN (SELECT COLUMN_NAME FROM TABLE_NAME)

For example, for tables dept and dept_filter_lookup created with:

CREATE TABLE public.dept (

deptno NUMERIC(2) PRIMARY KEY,

dname VARCHAR(14) UNIQUE,

loc VARCHAR(13)

);

CREATE TABLE public.dept_filter_lookup (

id NUMERIC(2) PRIMARY KEY

);

You can use the following filters:

deptno IN (SELECT id FROM public.dept_filter_lookup)

or

deptno NOT IN (SELECT id FROM public.dept_filter_lookup)

Use these filters to create a publication with the following RepCLI commands:

$ java -jar edb-repcli.jar -createpub  pub -repsvrfile ~/pubsvrfile -pubdbid 1 -tables public.dept -reptype T \

-tablesfilterclause "public.dept:lookup_filter:deptno IN (SELECT id FROM public.dept_filter_lookup)"

or

$ java -jar edb-repcli.jar -createpub  pub -repsvrfile ~/pubsvrfile -pubdbid 1 -tables public.dept -reptype T \

-tablesfilterclause "public.dept:lookup_filter:deptno NOT IN (SELECT id FROM public.dept_filter_lookup)"

After you define a set of available table filters, you must enable them only on those subscription tables of either:

  • A single-master replication system
  • Primary node tables of a multi-master replication system where filtering is to occur during replication to those particular target tables.

No filtering occurs during replication to a target subscription table or primary node table if you didn't enable filters specifically on that table in the subscription or primary node.

We strongly recommend that you perform a snapshot replication to the subscriptions or primary nodes that contain tables on which the filtering criteria changed by adding filter rules, removing filter rules, or modifying existing filter rules.

A snapshot ensures that the content of the subscription tables or primary node tables is consistent with the updated filtering criteria.

Note

When using table filters in a multi-master replication system, the primary definition node, which provides the source of the table content for a snapshot, must contain a superset of all the data contained in the other primary nodes of the multi-master replication system. This configuration ensures that the target of a snapshot receives all of the data that satisfies any filtering criteria enabled on the other primary nodes.

Conversely, if the primary definition node contains only a subset of all the data contained in the other primary nodes, then a snapshot to another primary node might not result in the complete set of data that is required for that target primary node.

Effects of table filtering

A filter enabled on a table affects only the results from snapshot or synchronization replications targeted to that table by Replication Server. Filtering has no effect on changes made directly on the target table by external user applications such as an SQL command line utility.

Filtering has the following effects on a targeted, filtered table.

Note

In the following discussion, a result set refers to the set of rows in a table satisfying the selection criteria of an UPDATE or DELETE statement executed on that table.

In a snapshot replication, if the row satisfies the filtering criteria, a row from the source table of the snapshot is inserted into the target table. Otherwise the row is excluded from insertion into the target table.

When an INSERT statement is executed on a source table followed by a synchronization replication, if the row satisfies the filtering criteria, the row is inserted into the target table of the synchronization. Otherwise the row is excluded from insertion into the target table.

When an UPDATE or DELETE statement executes on a source table followed by a synchronization replication, values of the identity columns are used in action on the target table of the synchronization. For non-PostgreSQL/EDB Postgres Advanced Server databases, the primary key columns are the identity columns. For PostgreSQL/EDB Postgres Advanced Server databases, the primary key or unique columns are the identity columns (see Design considerations for details).

When an UPDATE statement executes on a source table followed by a synchronization replication, the UPDATE result set of the source table determines the action on the target table of the synchronization as follows:

  • If a row in the result set has no corresponding row in the target table with the same identity columns values and the updated row in the result set satisfies the filtering criteria, then the row is inserted into the target table. A row that previously didn't exist in the target table is added because the updated row in the source table now satisfies the filtering criteria.
  • If a row in the result set has a corresponding row in the target table with the same identity columns values and the updated row in the result set satisfies the filtering criteria, then the row in the target table is updated accordingly. The update is applied to an existing, matching row in the target table that still satisfies the filtering criteria after the update.
  • If a row in the result set has a corresponding row in the target table with the same identity columns values and the updated row in the result set no longer satisfies the filtering criteria, then the corresponding row in the target table is deleted. An existing, matching row in the target table no longer satisfies the filtering criteria after the update, so the row is removed from the target table.

When a DELETE statement executes on a source table followed by a synchronization replication, the DELETE result set of the source table determines the action on the target table of the synchronization as follows:

  • If a row in the result set has a corresponding row in the target table with the same identity columns values, then the row with that primary key value is deleted from the target table. An existing, matching row in the target table is removed.
  • If a row in the result set has no corresponding row in the target table with the same identity columns values, then no action is taken on the target table for that row. Because there's no existing, matching row in the target table, there's no row to remove from the target table.

Thus, regardless of whether the transaction on the source table is an INSERT, UPDATE, or DELETE statement, the goal of a table filter is to ensure that all rows in the target table satisfy the filter rule.

Table settings and restrictions for table filters

For table filters, you can apply specific table settings and restrictions.

REPLICA IDENTITY setting for filtering in a log-based replication system

For replication systems using the log-based method of synchronization replication, a publication table on which you define a filter must have the REPLICA IDENTITY option set to FULL.

Note

This REPLICA IDENTITY FULL setting isn't required for tables in single-master, snapshot-only publications, See Snapshot-only publications.

Use the ALTER TABLE command for this setting:

ALTER TABLE schema.table_name REPLICA IDENTITY FULL

For additional information see the ALTER TABLE SQL command in the PostgreSQL Core documentation.

For example, for a publication table named edb.dept, use the following ALTER TABLE command:

ALTER TABLE edb.dept REPLICA IDENTITY FULL;

You can see the REPLICA IDENTITY setting by using the \d+ command in the PSQL utility:

edb=# \d+ edb.dept

                                  Table "edb.dept"
 Column |         Type          | Modifiers | Storage  | Stats target | Description
--------+-----------------------+-----------+----------+--------------+-------------
 deptno | numeric(2,0)          | not null  | main     |              |
 dname  | character varying(14) |           | extended |              |
 loc    | character varying(13) |           | extended |              |
Indexes:
    "dept_pk" PRIMARY KEY, btree (deptno)
    "dept_dname_uq" UNIQUE CONSTRAINT, btree (dname)
Referenced by:
    TABLE "emp" CONSTRAINT "emp_ref_dept_fk" FOREIGN KEY (deptno) REFERENCES dept(deptno)
    TABLE "jobhist" CONSTRAINT "jobhist_ref_dept_fk" FOREIGN KEY (deptno) REFERENCES dept(deptno) ON DELETE SET NULL
Replica Identity: FULL

The REPLICA IDENTITY FULL setting is required on tables in the following databases of a log-based replication system:

  • In a single-master replication system, you define table filters in the primary database. Thus, the publication tables in the primary database requiring filter definitions must be altered to a REPLICA IDENTITY FULL setting but only if the publication is not a snapshot-only publication. See Snapshot-only publications.
  • In a multi-master replication system, table filters are defined in the primary definition node. Thus, publication tables in the primary definition node requiring filter definitions must be altered to a REPLICA IDENTITY FULL setting.
  • In a multi-master replication system, don't set non-MDN nodes tables REPLICA IDENTITY options to FULL unless transactions are expected to be targeted on those non-MDN nodes. In addition, the transactions will be filtered when they're replicated to the other primary nodes.

The REPLICA IDENTITY FULL setting on a source table ensures that certain types of transactions on the source table result in the proper updates to the target tables on which filters are enabled.

Note

In addition to table filtering requirements, you might need the REPLICA IDENTITY FULL on publication tables for other reasons in Replication Server. See Configuration parameter and table setting requirements for additional requirements.

Filtering restrictions on data types

Table filters aren't supported on binary data type columns. A binary data type is the Postgres data type BYTEA. In addition, table filters aren't supported on EDB Postgres Advanced Server columns with data types BINARY, VARBINARY, BLOB, LONG RAW, and RAW, as these are alias names for the BYTEA data type.

Filtering restrictions on operators

Replication Server supports modulus operator (denoted by %) to define a filter clause. However, the following restrictions apply:

  • You can have only a single filter condition using the modulus operator.

  • You can't use it with any other conditions using AND or OR operators.

Replication Server supports the modulus filter in the following formats:

deptno%3=0

@deptno%3=0

Roadmap for further instructions

The specific details on implementing table filtering depend upon whether you're using a single-master replication system or a multi-master replication system. Use this roadmap to find relevant information for each type of replication system.

For using table filters in a single-master replication system, see:

For using table filters in a multi-master replication system see: