2.2 Replication Concepts and Definitions
2 Overview : 2.2 Replication Concepts and Definitions
xDB Replication Server is a software product that enables the implementation of a replication system. A replication system is software and hardware whose purpose is to make a copy of data from one location to another and to ensure the copied data is the same as the original over time.xDB Replication Server applies the replication system concept to tables of Oracle, SQL Server, PostgreSQL, and Advanced Server database management systems.The following sections present specific terms and concepts used when discussing xDB Replication Server.
• Single-Master Replication (SMR). Changes (inserts, updates, and deletions) to table rows are allowed to occur in a designated master database. These changes are replicated to tables in one or more slave databases. The replicated tables in the slave databases are not permitted to accept any changes except from its designated master database. (This is also known as master-to-slave replication.)
• Multi-Master Replication (MMR). Two or more databases are designated in which tables with the same table definitions and initial row sets are created. Changes (inserts, updates, and deletions) to table rows are allowed to occur in any database. Changes to table rows in any given database are replicated to their counterpart tables in every other database.
• Replication from Oracle to PostgreSQL
• Replication in either direction between Oracle and Advanced Server
• Replication in either direction between SQL Server and PostgreSQL
• Replication in either direction between SQL Server and Advanced ServerFor multi-master replication, the participating database servers in a given multi-master replication system must be of the same type:Note: A given database cannot simultaneously participate in both a single-master replication system and a multi-master replication system.xDB Replication Server uses an architecture called publish and subscribe. The data to be made available for copying by a replication system is defined as a publication. To get a copy of that data, you must “subscribe” to that publication. The manner in which you subscribe is slightly different for single-master and multi-master replication systems.In xDB Replication Server a publication is defined as a named set of tables and views within a database. The database that contains the publication is called the publication database of that publication.In a single-master replication system, to get a copy of an xDB Replication Server publication, you must create a subscription. An xDB Replication Server subscription is a named association of a publication to a database to which the publication is to be copied. This database is called the subscription database.Similar to a single-master replication system, when creating a multi-master replication system, you first define a publication in the publication database. You then add one or more additional databases that you want to participate in this multi-master replication system. As you add each database, it is associated with this replication system. You do not create an explicit, named subscription in a multi-master replication system.In a single-master replication system, replication is said to occur when xDB Replication Server initiates and completes either of the following processes: 1) applies changes that have been made to rows in the publication since the last replication occurred, to rows in tables of the subscription database (called synchronization); or 2) copies rows of the publication to empty tables of the subscription database (called a snapshot). See Section 2.2.6 for further discussion on snapshots and synchronization.The subscription tables are the tables in the subscription database created from corresponding tables or views in the publication.Note: In a single-master replication system xDB Replication Server creates a table in the subscription database for each view contained in the publication.In a multi-master replication system, the concept and definition of replication is nearly identical to a single-master replication system with the following modifications: 1) synchronization can occur between any pair of databases (referred to as master nodes) participating in the replication system; and 2) a snapshot can occur from the publication database designated as the master definition node to any of the other master nodes.The preceding diagram illustrates that a table that has been created as a member of a subscription can be used in a publication replicating to another subscription. This scenario is called cascading replication.xDB Replication Server performs master-to-slave replication when a single-master replication system is implemented. The publication is the master and the subscription is the slave. In a master-to-slave relationship, changes are propagated in one direction only, from the master to the slave.Generally, changes must not be made to the definitions of the publication tables or the subscription tables. If such changes are made to the publication tables, they are not propagated to the subscription and vice versa unless the DDL change replication feature is used as described in Section 7.8. If changes are made to the table definitions without using the DDL change replication feature, there is a risk that future replication attempts may fail.Changes must not be made to the rows of the subscription tables. If such changes are made, they are not propagated back to the publication. If changes are made to the subscription table rows, it is fairly likely that the rows will no longer match their publication counterparts. There is also a risk that future replication attempts may fail.2.2.4 Multi-Master ReplicationAs an alternative to the single-master (master-to-slave) replication model, xDB Replication Server supports multi-master replication.A master node is a database participating in a multi-master replication system.The database (master node) in which the publication is initially defined is specially designated as the master definition node (MDN). There can be only one master definition node at any given time, however, it is possible to change which master node is the master definition node. When it is important to make a distinction between the master definition node and all other master nodes that are not the master definition node, the latter are referred to as non-MDN nodes.
• Once the multi-master replication system is defined, changes (inserts, updates, and deletions) to rows of the publication tables on any master node are synchronized to all other master nodes on either an on demand or scheduled basis.Generally, changes must not be made to the table definitions in any of the master nodes including the master definition node. If such changes are made, they are not propagated to other nodes in the multi-master replication system unless they are made using the DDL change replication feature described in Section 7.8. If changes are made to tables without using the DDL change replication feature, there is a risk that future replication attempts may fail.2.2.5 AsynchronousxDB Replication Server performs replications asynchronously. The systems hosting the databases do not always have to be running continuously in order for successful replication to occur. If one system goes offline, replication resumes when it comes back online if there is still pending data to replicate.In addition you can create a schedule for your replication system. xDB Replication Server initiates and performs replications regularly according to the assigned schedule. This allows you to run the replication system unattended. See Section 7.2 for directions on creating a schedule.xDB Replication Server performs two different types of replications. These two main types are called snapshot replication and synchronization replication.In either method, the source tables refer to the tables from which the replication data is originating (the publication in a single-master replication system, or the master node whose changes are being replicated to another master node in a multi-master replication system).The target tables are the tables that are receiving the replication data from the source tables (the subscription tables in a single-master replication system, or the master node receiving changes from another master node in a multi-master replication system).In snapshot replication, all existing rows in the target tables are deleted using the database system’s TRUNCATE command. The tables are then completely reloaded from the source tables of the publication.In synchronization replication, only the changes (inserts, updates, and deletions) to the rows in the source tables since the last replication are applied to the target tables.Note: Deletion of all rows in a source table executed by the SQL TRUNCATE command results in replication to the target tables only if the log-based method of synchronization replication is used. If the trigger-based method of synchronization replication is used, execution of the TRUNCATE command on a source table does not replicate the effect to the target tables. You must perform a snapshot from the source table to the target tables if the trigger-based method is used. (The difference between the trigger-based method and the log-based method is discussed as follows.)Synchronization replication is implemented using two different methods – the trigger-based method and the log-based method.In the trigger-based method changes to rows in the source tables result in the firing of row-based triggers. These triggers record the changes in shadow tables. The changes recorded in the shadow tables are then periodically extracted from the shadow tables, converted to an in-memory data structure, and applied to the target tables by means of SQL statements executed using JDBC. See Section 2.2.9 for information on the trigger-based method.In the log-based method changes to rows in the source tables are extracted from the Write-Ahead Log segments (WAL files) using asynchronous streaming replication implemented by the logical decoding feature available in Postgres database servers. The extracted changes are converted to an in-memory data structure and applied to the target tables by means of SQL statements executed using JDBC. See Section 2.2.10 for information on the log-based method.In a multi-master replication system, the manner in which changes accumulated on all master nodes are replicated to all other master nodes is conceptually done in groups identified by the source master node with the changes to be replicated. See Section 2.2.11 for information on this process and the improvement for the log-based method with parallel replication.In a single-master replication system, the very first replication to a newly created subscription must always be done by a snapshot. Subsequent replications can be done by snapshot or by synchronization provided that the publication is not defined as a snapshot-only publication as discussed in Section 2.2.7.In a multi-master replication system, the very first replication from the master definition node to a newly added master node must always be done by a snapshot. Subsequent replications between master nodes occur by synchronization. However, it is possible to perform subsequent snapshots from the master definition node to any other master node.When a publication is created in a single-master replication system, the publication can be defined as a snapshot-only publication. Replication from a snapshot-only publication can only be done using the snapshot replication method. Synchronization replication is not permitted on a snapshot-only publication.2.2.8 Snapshot ReplicationIn snapshot replication, the target tables are completely reloaded from the source tables. The database system’s truncate operation is used to delete all rows from the target tables.For Oracle and SQL Server only: Oracle and SQL Server target tables are loaded using JDBC batches of INSERT statements.For Postgres only: In general, Postgres target tables are loaded using the JDBC COPY command since using truncation and COPY is generally faster than if you were to execute an SQL DELETE statement against the entire table and then add the rows using JDBC batches of INSERT statements. If the COPY command fails, the publication server retries the snapshot using JDBC batches of INSERT statements.If the target table (regardless of database type) contains a large object data type such as BYTEA, BLOB, or CLOB then rows are loaded one at a time per batch using an INSERT statement. This is to avoid a heap space error resulting from potentially large rows. Loading time can be decreased by allowing multiple inserts per batch, which is done by adjusting the configuration option lobBatchSize described in Section 5.8.1.Note: Advanced Server supports a number of aliases for data types. Such aliases that translate to BYTEA are treated as large object data types. See the Database Compatibility for Oracle Developers Reference Guide for a listing of Advanced Server data types. (See the Database Compatibility for Oracle Developer’s Guide for Advanced Server version 9.5 or earlier versions.)Under certain circumstances, the corresponding Postgres target table created for certain types of Oracle partitioned tables is a set of inherited tables. In these cases, the SQL DELETE statement is used on the inherited child tables instead of truncation. See Section 10.4.1.4 for additional information on replicating Oracle partitioned tables.A server configuration option is available that forces the snapshot replication process to use the Oracle database link utility instead of JDBC COPY to populate the Postgres target tables from an Oracle publication. Oracle database link provides an additional performance improvement over JDBC COPY. See Section 5.8.1 for information on using the Oracle database link option.See Section 5.8.1 for information on various configuration options to optimize snapshot replication.If a publication in a single-master replication system is created that will be used in synchronization replications with the trigger-based method, the publication server installs an insert trigger, an update trigger, and a delete trigger on each publication table. In a multi-master replication system, each replicated table in each master node employing the trigger-based method has an insert trigger, an update trigger, and a delete trigger.The publication server also creates a shadow table for each source table on which triggers have been created. A shadow table is a table used by xDB Replication Server to record the changes (inserts, updates, and deletions) made to a given source table. A shadow table records three types of record images: For each row inserted into the source table, the shadow table records the image of the inserted row. For each existing row that is updated in the source table, the shadow table records the after image of the updated row. For each row deleted from the source table, the shadow table records the primary key value of the deleted row.Note: In a multi-master replication system, the before image of an updated row is also stored in the shadow table in order to perform update conflict detection. See Section 6.6 for information on conflict detection in a multi-master replication system.After each change on the source table, one of the insert, update, or delete triggers is executed. These are row triggers, so for each row affected by the change, the trigger executes. Each execution of the trigger records a row of the appropriate type (insert, update, or deletion) in the shadow table of the corresponding source table.Though changes made to the source tables since the last replication occurred are applied to the target tables using SQL INSERT, UPDATE, and DELETE statements, the actual SQL statements run against the target tables are not the same SQL statements that were run against the source tables.When synchronization replication occurs, the publication server executes JDBC batches of SQL statements (also referred to as transaction sets) against the target tables. The batches contain an INSERT statement for each shadow table row recording an insert operation, an UPDATE statement for each shadow table row recording an update operation, and a DELETE statement for each shadow table row recording a delete operation. Each batch is executed in one transaction.Shadow table rows that were applied to target tables can be viewed as shadow table history in the xDB Replication Console (see Section 7.4.3).Note: A single SQL statement executed against a source table may result in many rows recorded in a shadow table, and therefore, many SQL statements executed against the target table. For example, if a single UPDATE statement affects 10 rows in the source table, 10 rows will be inserted into the shadow table – one for each row in the source table that was updated. When the publication server applies the changes to the target table, 10 UPDATE statements will be executed.Note: For greater efficiency, when changes to the source tables consist of SQL statements that each affect a large number of rows, the publication server may employ the use of prepared SQL statements. See Section 5.8.2 for directions on how to control the usage of prepared SQL statements as well as information on various other configuration options to optimize synchronization replication.In PostgreSQL 9.4 a feature has been introduced called logical decoding (also called logical replication or changeset extraction). This feature provides the capability to extract data manipulation language (DML) changes from the Write-Ahead Log segments (WAL files) in a readable format.For information on logical decoding see the PostgreSQL Core Documentation located at:The key significance of this feature is the ability to capture data changes to the publication tables without impacting the online transaction processing rate against these tables that occurs when using the trigger-based method. The trigger-based method results in the firing of row-level triggers whenever data changes occur, then inserting these data changes into shadow tables for temporary storage before applying the changes to the target databases.Thus, extracting data changes using logical decoding can be beneficial for improving database server throughput and replication latency.However, note that the logical decoding interface streams changes for all tables in a given database, which may have a performance overhead associated with it. For example, if a database contains 100 tables, and the user is interested in replicating only a small subset of these tables, say only 20 tables in a single publication, the logical decoding protocol will stream changes for all 100 tables to the publication server. The publication server eventually filters out the changes for the irrelevant 80 tables. However, this results in network overhead caused by the additional changeset load that is not required by the replication system.Using logical decoding to extract changes from a publication database during xDB synchronization replication is referred to as the log-based method.The following sections describe the basic requirements and concepts for the log-based method of synchronization replication.188.8.131.52 Requirements and RestrictionsThe following are the general requirements and restrictions when using the log-based method for any database of a single-master or multi-master replication system:
• The selection of either the trigger-based method or the log-based method is a characteristic applicable to only the publication database. The choice is made when defining the master database of a single-master replication system (see Section 5.2.2) or the master definition node of a multi-master replication system (see Section 6.2.2).
• In a single-master replication system, whether the master database uses the trigger-based method or the log-based method has no additional impact on the rules for choosing the subscription database as described in Section 10.1. For example, even if the log-based method is chosen for the master database, the subscription database may be running on Postgres version 9.4 as well as any supported, earlier version of Postgres, as well as Oracle or SQL Server as described in Section 10.1.
• Selection of the log-based method for any database impacts the configuration of the Postgres database cluster containing that database.If you plan to use the log-based method with any publication database running under a Postgres database server, the following configuration parameter settings are required in the configuration file, postgresql.conf, of that Postgres database server:
• wal_level. Set to logical.
• max_wal_senders. Specifies the maximum number of concurrent connections (that is, the maximum number of simultaneously running WAL sender processes). Set at minimum, to the total number of master databases of single-master replication systems and master nodes of multi-master replication systems on this database server that will use the log-based method.
• max_replication_slots. Specifies the maximum number of replication slots. If the database server supports both single-master replication systems and multi-master replication systems, then max_replication_slots must be set at minimum to the sum of the requirements for both replication systems. For support of SMR systems, the minimum requirement is the total number of master databases of the single-master replication systems that will use the log-based method. For support of MMR systems, the minimum requirement is the total number of master nodes in the multi-master replication system multiplied by the number of master nodes residing on this database server. For information, see Section 184.108.40.206.
• Also see Section 5.1.2 for setting these parameters for a single-master replication system. See Section 6.1.2 for a multi-master replication system.In addition, the pg_hba.conf configuration file of the Postgres database server must contain an entry permitting REPLICATION access for each database using the log-based method running on the database server. The access must be permitted to the publication database user specified when creating the publication database definition using the xDB Replication Console (see Section 5.2.2 for a single-master replication system or Section 6.2.2 for a multi-master replication system) or the xDB Replication Server Command Line Interface (CLI) (see Section 8.3.6).See Section 220.127.116.11 for setting REPLICATION access for a single-master replication system. See Section 6.1.5 for a multi-master replication system.For configuration options in the publication server configuration file that are specifically applicable to the log-based method see Section 10.4.1.15.18.104.22.168 Logical Replication SlotsWhen using the log-based method on a publication database, the underlying logical decoding framework exposes the data changes (the changeset stream) by means of a logical replication slot.A logical replication slot represents a changeset stream and applies to a single database. The xDB Replication Server assigns a unique identifier, called the slot name, to each logical replication slot it creates in the form xdb_dboid_pubid where dboid is the publication database object identifier (OID) and pubid is the publication ID assigned by the xDB Replication Server. All slot names are unique within a Postgres database cluster.Thus, for each single-master replication system using the log-based method, a replication slot is required for the publication database of each such system.For a multi-master replication system using the log-based method, each master node requires a replication slot.The maximum number of replication slots permitted for a database server is controlled by the max_replication_slots configuration parameter in the postgresql.conf file. Therefore this configuration parameter must be set to a large enough value to account for all publication databases defined with the log-based method of single-master replication systems running on the database server as well as all master nodes of a multi-master replication system defined with the log-based method running on the database server. Additional replication slots are required to support the usage of replication origin (see Section 22.214.171.124).See Section 5.1.2 for additional information on configuration parameters for single-master replication systems. See Section 6.1.2 for multi-master replication systems.The changeset stream is accessible to the xDB publication server by the WAL sender process (walsender) using the streaming replication protocol.The xDB publication server connects using the walsender interface through which changes are streamed on a continual basis. The continuous streaming eliminates the need for explicitly polling for changes.
1. A streaming replication connection to the database server is opened using libpq to establish a walsender communication channel.
2. A separate thread is used to monitor data changes streamed through the walsender interface.
4. On the next scheduled interval, the in-memory cached data changes are applied to each of the target databases in JDBC batches of SQL statements (referred to as transaction sets) in the same manner as described in Section 2.2.9 for the trigger-based method. If one or more target database servers are not accessible, the data changes are saved in a local file on the host running the publication server. Section 126.96.36.199 for information on in-memory caching and data persistence.
5. Note: A single SQL statement executed against a source table may result in many rows modified and returned in the changeset stream, and therefore, many SQL statements executed against the target table. For example, if a single UPDATE statement affects 10 rows in the source table, 10 rows will be returned in the changeset stream – one for each row in the source table that was updated. When the publication server applies the changes to the target table, 10 UPDATE statements will be executed.188.8.131.52 Replication OriginStarting with Postgres version 9.5, a feature called replication origin has been introduced to the logical decoding framework. Replication origin allows an application to identify, label, and mark certain aspects of a logical decoding session.For information on replication origin see the PostgreSQL Core Documentation located at:For the log-based method of synchronization replication, this provides performance improvement provided that the master nodes are running under Postgres version 9.5 or later.As previously described, the log-based method uses the WAL files to obtain the changes applied to the publication tables. After the changes are retrieved through the walsender interface, the publication server applies the set of changes to the other master nodes using transaction sets consisting of JDBC batches of SQL statements. When these changes are applied to the tables in the other target master nodes, the same changes are also recorded in the WAL files of each database server hosting the target master nodes.These redundant or “replayed” changes are included in the changeset stream received by the publication server. These replayed changes must be ignored and not applied since they are duplicates of all changes that have already been applied to the target tables through the JDBC batches.The replayed changes result in performance overhead as all such changes are transmitted over the network from the database server to the publication server, and then the publication server must discard such redundant changes.With the replication origin feature, the publication server is able to set up the logical decoding sessions so that these replayed changes are not included in the changeset stream transmitted over the network to the publication server thus eliminating this performance overhead.
• The max_replication_slots configuration parameter must be set at a certain minimal level to ensure that the publication server can create the additional replication slots for replication origin.For each master node database, in addition to the replication slot used for the changeset stream, an additional number of replication slots is required – one additional slot corresponding to every other master node to support the replication origin usage. Thus, for each master node, the total number of replication slots required is equal to the total number of master nodes in the entire MMR system.Therefore, for a given database server (that is, a Postgres database cluster containing master node databases), the total number of replication slots required is equal to the total number of master nodes in the entire MMR system multiplied by the number of master node databases residing within the given database cluster.For example, assume the usage of a 6-node multi-master replication system using three database clusters as follows:The total number of master nodes is six. Multiply the number of master node databases in each database cluster by six to give the required minimum setting for max_replication_slots for that database cluster.The following table shows the required, minimum settings for max_replication_slots as well as max_wal_senders.
If the max_replication_slots parameter is not set to a high enough value, synchronization replication still succeeds, but without the replication origin performance advantage.WARNING: Failed to setup replication origin xdb_mmrnode_c_emp_pub_6. Reason: ERROR: could not find free replication state slot for replication origin with OID 4The following example shows some of the replication slot information for a 3-master node system running on a single database cluster.The number should be sufficiently greater than the number of replication slots and replication origins currently allocated.The replication origin name is assigned in the format xdb_srcdbname_pubname_remotedbid where srcdbname is the source database name, pubname is the publication name, and remotedbid is the publication database ID of a remote database.The replication slots are in the active state when the publication server is running. The replication slots are deactivated when the publication server is shut down.The replication slots and replication origin sessions are deleted from the database cluster when their corresponding master nodes are removed from the multi-master replication system using the xDB Replication Console or the xDB Replication Server CLI.Should some situation occur where the replication slots are not properly deleted when required, see Section 10.3.4.4 for instructions on manually deleting them.184.108.40.206 In-Memory Caching and PersistenceThe data changes are fetched and stored in memory buffers to optimize the data replication process. This avoids the overhead associated with repeatedly fetching the same set of changes from the database server when there are multiple target databases.This approach is sufficient as long as all of the target databases are accessible during a replication event and the data fits within the available cache.However, if one or more of the target databases is unavailable due to network connectivity problems, server down time, etc. the in-memory data changes must be persisted for later retrieval when the target databases becomes available for synchronization with the source database.The xDB Replication Server architecture utilizes Java object serialization to persist the in-memory state of the data. Object serialization is the conversion of object data and other relevant information to a sequence of bytes that can then be stored in a file.
• The cache size corresponds to the heap size configured for the publication server by the -Xmxnnnm setting of the JAVA_HEAP_SIZE parameter in the xDB Startup Configuration file. See Section 220.127.116.11 for information on the xDB Startup Configuration file.The persistence I/O overhead can be minimized by increasing the heap size value and defining a more frequent synchronization interval such as for every few seconds. See Section 7.2 for information on setting a replication schedule.The data changes are persisted in a local file on the host running the publication server. The file is stored in the directory XDB_HOME/xdata.Each time persistence occurs, a new file is created. After the files have been processed, they are periodically removed from disk.For a multi-master replication system, transactions can be replicated from one master node to another by one of the synchronization methods described in the previous sections – either the trigger-based method (see Section 2.2.9) or the log-based method (see Section 2.2.10).For a single replication event to be considered finished and complete, transactions that have occurred on all master nodes since the previous replication event must be successfully replicated to all other master nodes by the configured synchronization method.This consists of a series of multiple replication sets, each identified by a master node acting as the source master node, which contains the transactions that needs to be replicated to all other master nodes acting as the target master nodes. So for a multi-master replication system consisting of n number of master nodes, there will be n such replication sets – each with a different master node acting as the source.Since the initial support of multi-master replication systems in xDB Replication Server version 5.0, such a series of multiple replication sets were always initiated in a strictly serial manner. That is, the transaction replication from a source master node to all target master nodes must be completed before the start of the transaction replication from the next master node to all other target master nodes, and so on.For example, consider a 3-master node system consisting of master node A, master node B, and master node C.If applications have applied transactions to tables in all three master nodes and then a synchronization replication event is initiated either on demand by the xDB Replication Console, an xDB Replication Server CLI command, or by a scheduled replication, the transactions are replicated in the following manner:
3. The time to complete the entire replication event, referred to as the latency time, is basically the sum of the replication times where each master node acts as the source (that is, the sum of the times for steps 1, 2, and 3).For the log-based method, this latency time has been reduced by the implementation of parallel replication whereby each replication set from a given master node acting as the source, executes and runs simultaneously with all other replication sets where the other master nodes act as the source.Thus, a replication set from a master node is not waiting for others to complete before it can start so steps 1, 2, and 3 all run simultaneously instead of one after the other.Note that parallel replication applies only to the log-based method and not for the trigger-based method.There is no required configuration setting to enable the use of parallel replication for the log-based MMR system.Note: In addition to parallel replication, optimization of replicating from a given master node to all other master nodes (that is, within the context of a single replication set) has been implemented with the use of multiple threads. This is referred to as parallel synchronization. Parallel synchronization applies to both the trigger-based and log-based methods. See Section 18.104.22.168 for information on parallel synchronization.2.2.12 Table FiltersTable filters specify the selection criteria for rows in publication tables or views that are to be included during replications to subscriptions from the publication database in a single-master replication system or between master nodes in a multi-master replication system. Rows that do not satisfy the selection criteria are excluded from replications to subscriptions or master nodes on which these table filters have been enabled.22.214.171.124 Implementing Table FiltersImplementing table filters is a two-part process. First, a set of available table filters must be defined. This can be done during the process of creating the publication by defining specific, named rules applicable to selected publication tables or views expressed in the form of SQL WHERE clauses.Once a set of available table filters have been defined, they must be enabled only on those subscription tables of a single-master replication system or master node tables of a multi-master replication system where filtering is to occur during replication to those particular target tables. No filtering occurs during replication to a target subscription table or master node table if no filters have been specifically enabled on that table in the subscription or master node.It is strongly recommended that a snapshot replication be performed to the subscriptions or master nodes that contain tables on which the filtering criteria has changed either by the addition of filter rules, the removal of filter rules, or the modification of existing filter rules.A snapshot ensures that the content of the subscription tables or master node tables is consistent with the updated filtering criteria.Note (For MMR only): When using table filters in a multi-master replication system, the master definition node, which provides the source of the table content for a snapshot, should contain a superset of all the data contained in the other master nodes of the multi-master replication system. This ensures that the target of a snapshot receives all of the data that satisfies any filtering criteria enabled on the other master nodes.On the contrary, if the master definition node contains only a subset of all the data contained in the other master nodes, then a snapshot to another master node may not result in the complete set of data that is required for that target master node.126.96.36.199 Effects of Table FilteringA filter enabled on a table only affects the results from snapshot or synchronization replications targeted to that table by the xDB Replication Server. Filtering has no effect on changes made directly on the target table by external user applications such as an SQL command line utility.Note: In the following discussion, a result set refers to the set of rows in a table satisfying the selection criteria of an UPDATE or DELETE statement executed on that table.In a snapshot replication, a row from the source table of the snapshot is inserted into the target table if the row satisfies the filtering criteria. Otherwise the row is excluded from insertion into the target table.When an INSERT statement is executed on a source table followed by a synchronization replication, the row is inserted into the target table of the synchronization if the row satisfies the filtering criteria. Otherwise the row is excluded from insertion into the target table.When an UPDATE statement is executed on a source table followed by a synchronization replication, the UPDATE result set of the source table determines the action on the target table of the synchronization as follows.When a DELETE statement is executed on a source table followed by a synchronization replication, the DELETE result set of the source table determines the action on the target table of the synchronization as follows.
• Thus, regardless of whether the transaction on the source table is an INSERT, UPDATE, or DELETE statement, the goal of a table filter is to ensure that all rows in the target table satisfy the filter rule.For replication systems using the log-based method of synchronization replication, a publication table on which a filter is to be defined must have the REPLICA IDENTITY option set to FULL.Note: This REPLICA IDENTITY FULL setting is not required for tables in single-master, snapshot-only publications, See Section 2.2.7 for information on snapshot-only publications.This setting is done with the ALTER TABLE command as shown by the following:ALTER TABLE schema.table_name REPLICA IDENTITY FULLFor additional information see the ALTER TABLE SQL command in the PostgreSQL Core Documentation located at:For example, for a publication table named edb.dept, use the following ALTER TABLE command:The REPLICA IDENTITY setting can be displayed by the PSQL utility using the \d+ command:TABLE "jobhist" CONSTRAINT "jobhist_ref_dept_fk" FOREIGN KEY (deptno) REFERENCES dept(deptno) ON DELETE SET NULLThe REPLICA IDENTITY FULL setting is required on tables in the following databases of a log-based replication system:
• In a single-master replication system, table filters are defined in the master database. Thus, the publication tables in the master database requiring filter definitions must be altered to a REPLICA IDENTITY FULL setting, but only if the publication is not a snapshot-only publication. See Section 2.2.7 for information on snapshot-only publications.
• In a multi-master replication system, non-MDN nodes should not have their tables’ REPLICA IDENTITY option set to FULL unless transactions are expected to be targeted on those non-MDN nodes, and the transactions are to be filtered when they are replicated to the other master nodes.The REPLICA IDENTITY FULL setting on a source table ensures that certain types of transactions on the source table result in the proper updates to the target tables on which filters have been enabled.Note: In addition to table filtering requirements, the REPLICA IDENTITY FULL setting may be required on publication tables for other reasons in xDB Replication Server. See Section 6.6.1 for additional requirements.Table filters are not supported on binary data type columns. A binary data type is the Postgres data type BYTEA. In addition, table filters are not supported on Advanced Server columns with data types BINARY, VARBINARY, BLOB, LONG RAW, and RAW as these are alias names for the BYTEA data type.188.8.131.52 Roadmap for Further InstructionsThe specific details on implementing table filtering depend upon whether you are using a single-master replication system or a multi-master replication system. The following is a roadmap to the relevant sections for each type of replication system.
2 Overview : 2.2 Replication Concepts and Definitions