Physical components v7

Replication Server isn't a single, executable program but rather a set of programs along with data stores containing configuration information and metadata. These components work together to form a replication system.

The following diagram shows the components of Replication Server and how they're used to form a complete, basic, single-master replication system.

Replication Server - physical view (single-master replication system)

The following diagram shows the components of Replication Server and how they're used to form a complete, basic, multi-master replication system.

Replication Server - physical view (multi-master replication system)

The minimal configuration of Replication Server for a basic replication system consists of the following software components:

  • Publication server The program that configures the publication database and primary nodes for replication and performs replication.
  • Subscription server The program that configures the subscription database for replication and initiates replication. The subscription server is used only in single-master replication systems.
  • Replication configuration file Text file containing connection and authentication information used by the publication server and subscription server upon startup to connect to a publication database designated as the controller database. Also used to authenticate registration of the publication server and subscription server from the user interface when creating a replication system.
  • Startup configuration file Text file containing installation and configuration information used for the Java Runtime Environment when the publication server and subscription server start.

The entire replication system is completed with the addition of the following components:

  • User interfaces for configuring and maintaining the replication system
  • One or more publication databases for a single-master replication system
  • One or more subscription databases for a single-master replication system
  • One primary definition node for a multi-master replication system
  • One or more additional primary nodes for a multi-master replication system

The user interface, publication server, subscription server, publication database, subscription database, and primary nodes can all run on the same host or on separate, networked hosts.

You can use any number of user interfaces to access any number of publication servers and subscription servers on the network as long as you know the network locations, user names, and passwords of the publication and subscription servers.

Any number of publication and subscription databases can participate in a single-master replication system.

Any number of primary nodes can participate in a multi-master replication system.

Publication server

The publication server creates and manages the metadata for publications. When a publication is created, the publication server creates database objects in the control schema of the publication database to record metadata about the publication.

Whenever a primary node is added to a multi-master replication system, the publication server creates database objects in the control schema of the primary node for recording metadata. For non-MDN nodes, the publication server also calls Migration Toolkit to create the publication table definitions if you choose this option when creating the primary node.

Note

See Control schema and control schema objects for information on the control schema.

The publication server is also responsible for performing a replication. For snapshot replications, the publication server calls Migration Toolkit to perform the snapshot.

For single-master synchronization replications, the publication server uses the Java Database Connectivity (JDBC) interface to apply changes to the subscription table rows based on changes recorded in either of two ways:

  • If the publication database is running under Postgres version 9.4 or later and the logical decoding option was chosen when creating the publication, changes are obtained from the Postgres WAL files using a logical replication slot.
  • In all other circumstances, changes are recorded in metadata tables (called shadow tables) in the publication database by row-based triggers that activate upon any insert, update, or deletion to the publication table rows.

For multi-master synchronization replications, the publication server performs the same process as for single-master synchronizations but does so for each primary node pair combination in the multi-master replication system.

The publication server can run on the same host as the other Replication Server components, or it can run on a separate, networked host.

Subscription server

Note

The subscription server is required only for single-master replication systems. The subscription server doesn't need to be running or installed if only multi-master replication systems are in use.

The subscription server creates and manages the metadata for subscriptions. When a subscription is created, the subscription server creates database objects in the control schema of the publication database to record metadata about the subscription.

When a subscription is created, the subscription server calls Migration Toolkit to create the subscription table definitions in the subscription database. The rows in the subscription tables aren't populated until a replication occurs. Rows are populated by actions of the publication server.

The subscription server is also responsible for initiating a replication as a result of manual user action through the user interface or a schedule created for the subscription. The subscription server initiates a call to the publication server that manages the associated publication. The publication server then performs the replication.

The subscription server can run on the same host as the other Replication Server components, or it can run on a separate, networked host.

When the subscription server starts, it uses the information in the replication configuration file found on its host to connect to the designated controller database.

Replication configuration file

The Replication Server replication configuration file contains the connection and authentication information used by any publication server or subscription server running on the host containing the file.

Specifically, the replication configuration file is accessed in the following circumstances:

  • When a publication server or subscription server is started on the host.
  • When a publication server or subscription server is registered during the process of creating a replication system. Register a publication server or subscription server using the Replication Server console or command line interface.

The following table contains a brief description of the parameters in the replication configuration file.

ParameterDescription
admin_userReplication Server administrator user name (the admin user name) for registering a publication server or a subscription server on this host containing the replication configuration file
admin_passwordEncrypted password of the admin user
databaseDatabase name of the controller database
userDatabase user name of the controller database
passwordEncrypted password of the controller database user
portPort number on which the database server of the controller database listens for requests
hostIP address of the host running the database server of the controller database
typeDatabase type of the controller database such as oracle or enterprisedb

Replication Server creates the content of this file as follows:

  • The replication configuration file and some of its initial content are created when you install a publication server or subscription server on a host during the Replication Server installation process.
  • Parameters admin_user and admin_password are determined during the Replication Server installation process. See Installation and uninstallation for how the content of these parameters are determined.
  • Parameters database, user, password, port, host, and type are set with the connection and authentication information of the first publication database definition you create with the Replication Server console or CLI. This database is designated as the controller database (see Controller database). See Adding a publication database for creating a publication database definition for a single-master replication system. See Adding the primary definition node for creating the publication database definition for a multi-master replication system.

The following is an example of the content of an EPRS Replication Configuration file:

#xDB Replication Server Configuration Properties
#Tue May 26 13:45:37 GMT-05:00 2015
port=1521
admin_password=ygJ9AxoJEX854elcVIJPTw\=\=
user=pubuser
admin_user=admin
type=oracle
password=ygJ9AxoJEX854elcVIJPTw\=\=
database=xe
host=192.168.2.23
Note

The passwords for the admin user name and the controller database user name are encrypted. If you change either of these passwords, you must modify the corresponding password parameters in the replication configuration file to contain the encrypted form of the new password. See Encrypting the password in the replication configuration file for directions on how to generate the encrypted form of a password.

See Installation details for the file system location of the EPRS Replication Configuration file.

Replication Server startup configuration file

The Replication Server startup configuration file contains installation and configuration information primarily used by the Java Runtime Environment (JRE) when any publication server or subscription server is started up on the host containing the file.

The content of the file is created by the Replication Server installer when you install Replication Server.

The following is an example of the content of an Replication Server startup configuration file:

#!/bin/sh
JAVA_EXECUTABLE_PATH="/usr/bin/java"
JAVA_MINIMUM_VERSION=1.8
JAVA_BITNESS_REQUIRED=64
JAVA_HEAP_SIZE="-Xms2048m -Xmx4096m"
PUBPORT=9051
SUBPORT=9052

The following table contains a brief description of the parameters in the Replication Server startup configuration file.

ParameterDescription
JAVA_EXECUTABLE_PATHDirectory path to the Java runtime program used to start and run the publication and subscription servers.
JAVA_MINIMUM_VERSIONThe earliest JRE version that can be used by the publication and subscription servers.
JAVA_BITNESS_REQUIREDThe bitness of the Java virtual machine required by the installed publication and subscription servers.
JAVA_HEAP_SIZEIn -Xmsnnnm nnn specifies the minimum Java heap size in megabytes. In -Xmxnnnm nnn specifies the maximum Java heap size in megabytes
PUBPORTPort number on which the publication server listens for requests.
SUBPORTPort number on which the subscription server listens for requests.

The JAVA_EXECUTABLE_PATH parameter specifies the location of the Java runtime program as identified by the Replication Server installer during the installation process. You can change the setting of this parameter to a different JRE installation if you want.

The JAVA_MINIMUM_VERSION parameter specifies the earliest version of the Java Runtime Environment that can be used with Replication Server. Don't change this setting.

Don't change the JAVA_BITNESS_REQUIRED parameter. If you change the installed value or if it doesn't match the bitness of the Java virtual machine as identified by JAVA_EXECUTABLE_PATH, errors can occur. These errors include failure of the publication and subscription servers to start and registration failure of the Replication Server product.

See Setting heap memory size for the publication and subscription servers for information on setting the JAVA_HEAP_SIZE parameter.

See Firewalls and access to ports for information on the PUBPORT and SUBPORT parameters.

After making any modifications to the Replication Server startup configuration file, restart the publication and subscription servers.

See Installation details for the file system location of the Replication Server configuration file.

Replication Server console

The Replication Server console is the user interface you can use to create and control all aspects of a replication system.

Through this console, you can configure and operate a replication system running on the same host on which the Replication Server console is installed. Or, you can configure and operate replication systems where the Replication Server components are distributed on different hosts in a networked environment.

EPRS Replication Consoles accessing multiple hosts

In the figure, two Postgres installations are running on two networked hosts, each with its own Replication Server installation. Each host is running a publication server and a subscription server.

The Replication Server console on each host can access and manage the replication systems on the other host if given the network IP address, port number, user name, and password with which the publication server and subscription server were installed on the remote host. See Introduction to the Replication Server console for information on the console user interface.

Replication Server command line interface

Replication Server command line interface (CLI) is a command-line-driven alternative to the Replication Server console user interface. The CLI provides equivalent functionality for creating and controlling all aspects of a replication system.

You can automate replication system operations by embedding Replication Server CLI commands in scripts such as Bash for Linux.

Replication Server CLI is installed whenever you install the Replication Server console.

Replication Server command line interface provides instructions for using Replication Server CLI.

Publication database

The publication database contains the tables and views used in a publication. The publication database can run on the same host or on a different host from where the publication server is running as long as the hosts can access each other by a network.

Each publication database also contains a control schema, which is a collection of database objects containing metadata on all replication systems, both single-master and multi-master, controlled by the publication server connected to this publication database. See Control schema and control schema objects for information on the control schema.

In a multi-master replication system, all primary nodes are considered publication databases.

A database plays the roles of both a publication database and a subscription database if it contains publications and subscriptions.

Subscription database

Note

The subscription database applies only to single-master replication systems.

The subscription database contains the tables created from a subscription. The subscription database can run on the same host or on a different host from where the subscription server is running as long as the hosts can access each other by a network.

A subscription database can also serve as a publication source for replicating to a third server if you want. This configuration is referred to as cascading replication.

A database plays the roles of both a publication database and a subscription database if it contains publications and subscriptions such as in the cascaded replication scenario.

Primary node

In a multi-master replication system, the databases containing the set of tables (the publication) for which row changes are to be replicated are called primary nodes. The primary nodes can run on the same host or on different hosts from where the publication server is running as long as the hosts can access each other by a network.

Each primary node also contains a control schema, which is a collection of database objects containing metadata on all replication systems, both single-master and multi-master, controlled by the publication server connected to this primary node. See Control schema and control schema objects for information on the control schema. The primary nodes can run under the same or under multiple database server instances (Postgres database clusters).

Primary definition node

The first node added to create a multi-master replication system is initially designated the primary definition node. This node must contain the table definitions (and optionally, the initial set of rows) to include in the publication.

As databases are added as primary nodes to the replication system, the table definitions and initial row sets can optionally be propagated from the primary definition node to the newly added primary nodes.

After the multi-master replication system is defined, it is possible to reassign the role of the primary definition node to another primary node in the multi-master replication system. The significance of this reassignment is that you can take snapshots from the newly appointed primary definition node to other primary nodes. This might help if the data in the old primary definition node becomes corrupt or out of sync with the other primary nodes and needs to be completely refreshed by a snapshot from another primary node.

As with all primary nodes, the primary definition node contains a control schema, which is a collection of database objects containing metadata on all replication systems, both single-master and multi-master, controlled by the publication server connected to this primary node.

Control schema and control schema objects

The control schema is a conceptual term referring to the collection of metadata database objects that define the logical and physical structure of and enable the operation and maintenance of Replication Server single-master and multi-master replication systems.

These metadata database objects, referred to as control schema objects, consist of tables, sequences, functions, procedures, triggers, packages, and so on.

The control schema objects store metadata such as:

  • Type of replication system (single-master or multi-master)
  • Network location
  • Database type
  • Connection and authentication information for publication databases
  • Subscription databases and the primary nodes, names of publications and the tables and views they contain,
  • Names of subscriptions and the publications to which they are subscribed
  • Replication transaction status
  • Replication scheduling
  • Replication history cleanup scheduling
  • Replication history

Each publication database in a trigger-based, single-master replication system also contains control schema objects with the changes that were made to rows in the publication and the status of whether those changes were applied to the subscription tables.

Similarly, for a multi-master replication system, each trigger-based primary node contains control schema objects with the changes made to rows in the publication residing on that primary node. They also contain the statuses of whether those changes were applied to the other primary nodes in the multi-master replication system.

Note

For log-based single-master and multi-master replication systems, changes are extracted from the database server WAL files instead of being stored in control schema objects. See Synchronization replication with the log-based method for information on the log-based method.

The actual, physical database schemas implementing the control schema to which the control schema objects belong varies depending on the database type (Oracle, SQL Server, or Postgres) and how the database was first configured for use by Replication Server.

Note the followoing about the control schema:

  • The control schema and its control schema objects are created in every publication database of both single-master and multi-master replication systems. This includes all master (publication) databases of single-master replication systems and all primary nodes of multi-master replication systems.
  • When a new primary database is added for a single-master replication system or a new primary node for a multi-master replication system, a snapshot operation is used to replicate the control schema to the newly added publication database, assuming there is an existing controller database. See Controller database for information about the controller database.
  • Updates to the configuration of a single-master replication system or a multi-master replication system made by the Replication Server console or the Replication Server command line interface are synchronized between the control schemas on all publication databases to ensure that the metadata is consistent across all publication databases.
  • The secondary (subscription) database of single-master replication systems contains one, single table as its metadata database object. The term subscription metadata object specifically refers to this database object in the subscription database. The general terms control schema and control schema objects refer to the database objects in the publication databases.
  • The control schema objects in all databases controlled by the same given publication server generally contain the same information. This configuration allows any such database to provide the information needed by Replication Server to control all single-master and multi-master replication systems running under that publication server.
  • If a certain publication database of a replication system goes offline due to database server problems, network connectivity issues, and so on, the other replication systems running under the same publication server are still functional. The other publication databases can provide the control schema information needed to run all replication systems.

Controller database

In the Replication Server configuration file, the connection and authentication information for one publication database is included and, as such, designated as the controller database.

As with all publication databases, the controller database contains the control schema with the replication system information for all single-master and multi-master replication systems run by the publication server that accesses that Replication Server configuration file.

The controller database serves as the primary provider of the replication system information to the publication server and the subscription server. Thus, upon initial startup, the publication server and subscription server try to connect to the designated controller database. This controller database then provides the metadata information for all replication systems.

If the initial connection to the controller database fails, you can manually edit the Replication Server configuration file to provide the connection and authentication information for another publication database. Then, upon startup of the publication server and subscription server, the control schema of this alternate publication database is used to provide the replication system information.

The initial controller database is determined by the first publication database definition created by the Replication Server console or the Replication Server CLI either for a single-master or multi-master replication system. The publication server records the connection and authentication information in the Replication Server configuration file.

If you want to delete the publication database definition of the current controller database, you must first designate another publication database, defined under the same publication server, as the controller using the Replication Server console. See Switching the controller database for information on switching the controller to another publication database.

The following are some points regarding the controller database:

  • The database server running the controller database must be running and accessible before starting the publication server and subscription server.

  • For a single-master replication system, consider:

    • The publication server under which the publication database and publication are defined
    • The subscription server under which the subscription database and the subscription related to the publication are defined

    Both must connect to the same the controller database. This configuration gives the publication server and the subscription server access to the same control schema.

  • When changes are made to the metadata maintained by the control schema in the controller database, these changes are replicated by the publication server to the control schemas of all other publication databases. This setup ensures that the metadata of all single-master and multi-master systems are complete and consistent in the control schemas of all publication databases. This setup also allows you to switch the controller database later. See Switching the controller database.

Note

If the controller database is an Oracle or a SQL Server publication database, then you can't add a second Oracle or SQL Server publication database to create a second single-master replication system. For Replication Server to run more than one single-master replication system consisting of Oracle or SQL Server publication databases, you must designate a Postgres publication database as the controller database.

Once you have multiple Oracle or SQL Server publication databases set up in single-master replication systems with a Postgres controller database, don't switch the controller database to an Oracle or SQL Server publication database.