Installing FlowServer for WarehousePG
You can run the FlowServer service and the FlowCLI utility on any host that is able to reach your WarehousePG (WHPG) cluster. However, you must also install the packages on every host in your WHPG cluster.
Prerequisites
- WarehousePG (WHPG) version 6.x running on RH7 or RH8.
- WarehousePG version 7.x running on RH8 or RH9.
Network requirements
The following table lists the connection requirements among the different components:
| Source | Destination | Protocol |
|---|---|---|
| FlowServer | WarehousePG coordinator | libpq |
| FlowServer | WarehousePG segments | HTTP |
| FlowServer | Kafka broker hosts / RabbitMQ hosts | TCP |
| FlowCLI | FlowServer | gRPC |
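Before installing, it can help to confirm that each path in the table is actually open. The following is a minimal sketch, not part of the product: `check_tcp` is a hypothetical helper that relies on bash's `/dev/tcp` redirection and the coreutils `timeout` command, and the host names and ports in the usage comments are placeholders (5432 is only the usual PostgreSQL default, and 6060 matches the sample configuration later on this page). It tests raw TCP reachability only and says nothing about whether libpq, HTTP, or gRPC works on top of the connection.

```shell
# check_tcp HOST PORT [TIMEOUT_SECONDS]
# Returns 0 if a TCP connection to HOST:PORT can be opened, non-zero otherwise.
check_tcp() {
    local host="$1" port="$2" wait="${3:-3}"
    # The inner bash opens a socket via /dev/tcp; it is closed when that shell exits.
    timeout "$wait" bash -c 'exec 3<>"/dev/tcp/$0/$1"' "$host" "$port" 2>/dev/null
}

# Example usage (placeholder hosts and ports, adjust to your environment):
#   check_tcp cdw 5432 && echo "coordinator reachable"
#   check_tcp flowserver-host 6060 && echo "FlowServer reachable"
```

Run the check from the source host of each row, since firewalls are often asymmetric.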
Download and install the package on your WarehousePG cluster
Download the package from the EDB repository:
export EDB_SUBSCRIPTION_TOKEN=<your-token>
export EDB_REPO=gpsupp
curl -1sSLf "https://downloads.enterprisedb.com/$EDB_SUBSCRIPTION_TOKEN/$EDB_REPO/setup.rpm.sh" | sudo -E bash
sudo dnf download whpg<whpg_major_version>-flow-server
Where <whpg_major_version> is your WHPG version (6 or 7).
Create a file all_hosts on your WHPG coordinator that lists all hosts in the WHPG cluster. For example:
cdw
scdw
sdw1
sdw2
sdw3
From the coordinator, use the gpssh utility to install the package on every host in the cluster. Use the command that matches the package manager on your hosts (dnf or yum):
gpssh -f all_hosts -e 'sudo dnf install -y whpg<whpg_major_version>-flow-server'
gpssh -f all_hosts -e 'sudo yum install -y whpg<whpg_major_version>-flow-server'
(Optional) Create the FlowServer extension by connecting to a database on your WHPG cluster and running:
CREATE EXTENSION fs_formatter;
If you don't create the extension manually, it will be automatically created when a job starts.
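To confirm that the package reached every host, one option is to query the RPM database through the same gpssh host file. This is a minimal sketch, assuming the package name follows the `whpg<whpg_major_version>-flow-server` pattern used above. `flow_pkg` is a hypothetical helper, and `<database>` in the last comment is a placeholder.

```shell
# flow_pkg MAJOR_VERSION
# Prints the package name for WHPG major version 6 or 7, fails otherwise.
flow_pkg() {
    case "$1" in
        6|7) echo "whpg$1-flow-server" ;;
        *)   echo "unsupported WHPG major version: $1" >&2; return 1 ;;
    esac
}

# Example usage, run on the coordinator:
#   gpssh -f all_hosts -e "rpm -q $(flow_pkg 7)"
# To see whether the extension already exists in a database:
#   psql -d <database> -c "SELECT extname FROM pg_extension WHERE extname = 'fs_formatter';"
```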
Download and install the package on your dedicated FlowServer host / FlowCLI host (optional)
If you are running FlowServer on a host outside your WHPG cluster, or if you plan to run FlowCLI commands from a separate host, you must also download and install the package on those hosts.
Download the package from the EDB repository:
export EDB_SUBSCRIPTION_TOKEN=<your-token>
export EDB_REPO=gpsupp
curl -1sSLf "https://downloads.enterprisedb.com/$EDB_SUBSCRIPTION_TOKEN/$EDB_REPO/setup.rpm.sh" | sudo -E bash
sudo dnf download whpg<whpg_major_version>-flow-server
Install the package on the dedicated FlowServer host, using the command that matches its package manager (dnf or yum):
sudo dnf install -y whpg<whpg_major_version>-flow-server
sudo yum install -y whpg<whpg_major_version>-flow-server
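The two install commands differ only in the package manager. If you script the installation, a sketch like the following picks whichever one is present. `pkg_manager` is a hypothetical helper, and preferring dnf over yum when both exist is an assumption rather than documented guidance.

```shell
# pkg_manager
# Prints "dnf" or "yum", whichever is found first on PATH; fails if neither exists.
pkg_manager() {
    local candidate
    for candidate in dnf yum; do
        if command -v "$candidate" >/dev/null 2>&1; then
            echo "$candidate"
            return 0
        fi
    done
    echo "neither dnf nor yum found" >&2
    return 1
}

# Example usage (replace <whpg_major_version> with 6 or 7):
#   sudo "$(pkg_manager)" install -y whpg<whpg_major_version>-flow-server
```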
Configure FlowServer
Create a configuration file named flow_server.json on the host that will run the FlowServer service, with the following content. This can be a host within your WHPG cluster or a dedicated server.
{
  "Host": "",
  "Port": 6060,
  "Gpfdist": {
    "Host": "",
    "Port": 6070,
    "ReuseTables": true
  },
  "Prometheus": {
    "Host": "",
    "Port": 9080,
    "MetricsPath": "/flow_metrics"
  },
  "DebugPort": 6080,
  "Logging": {
    "SplitLogByJob": false,
    "FrontendLevel": "debug",
    "BackendLevel": "info"
  }
}
Where:
- Host: The hostname or IP address of the server. The default is an empty string, which means the server listens on all interfaces.
- Port: The port number on which the server listens for incoming connections. The default is 6060.
- Gpfdist: Configuration options for the gpfdist service.
  - Host: The hostname or IP address of the gpfdist service. The default is an empty string, which means it listens on all interfaces.
  - Port: The port number on which the gpfdist service listens. The default is 6070.
  - ReuseTables: Whether to reuse existing external tables in the database. The default is false, in which case FlowServer drops the external table associated with a load operation (if one exists) and creates a new one when you start or restart the job, and the external table name is based on the job name. When set to true, FlowServer reuses the external table and generates its name from a hash of various load configuration property values.
- Prometheus: Configuration options for the Prometheus metrics endpoint.
  - Host: The hostname or IP address of the Prometheus service. The default is an empty string, which means it listens on all interfaces.
  - Port: The port number on which the Prometheus service listens.
  - MetricsPath: The path to the metrics endpoint.
- DebugPort: The port number for the debug server.
- Logging: Configuration options for logging. The supported log levels are debug, info, warn, error, and fatal.
  - SplitLogByJob: Whether to split logs by job. The default is true, meaning logs are separated by job.
  - FrontendLevel: The logging level for the frontend/stdout. The default is info.
  - BackendLevel: The logging level for the backend/log file. The default is debug.
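The configuration opens four listeners (`Port`, `Gpfdist.Port`, `Prometheus.Port`, and `DebugPort`), so a quick pre-flight check can catch a mistyped or duplicated port before the service starts. This is a sketch under stated assumptions: `ports_unique` is a hypothetical helper, not part of FlowServer, and the commented syntax check assumes `python3` is installed on the host.

```shell
# ports_unique PORT...
# Succeeds only if every argument is a distinct integer in the range 1..65535.
ports_unique() {
    local seen=" " p
    for p in "$@"; do
        case "$p" in
            ''|*[!0-9]*) echo "not a port number: $p" >&2; return 1 ;;
        esac
        if [ "$p" -lt 1 ] || [ "$p" -gt 65535 ]; then
            echo "port out of range: $p" >&2
            return 1
        fi
        case "$seen" in
            *" $p "*) echo "port used twice: $p" >&2; return 1 ;;
        esac
        seen="$seen$p "
    done
}

# Example usage with the sample values from flow_server.json:
#   ports_unique 6060 6070 9080 6080 && echo "ports look fine"
# JSON syntax can be checked separately (assumes python3 is available):
#   python3 -m json.tool flow_server.json >/dev/null && echo "valid JSON"
```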
Start the FlowServer service:
Once you have configured the settings, start the FlowServer service on your preferred host, pointing to the configuration file flow_server.json you just created:
./flowserver -c /path/flow_server.json
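Once the service is running, the Prometheus settings give one way to check that it is responding. This is a minimal sketch, assuming the metrics endpoint is served over plain HTTP on the configured `Prometheus.Port` and `MetricsPath`. `metrics_url` is a hypothetical helper and `localhost` is a placeholder for the FlowServer host.

```shell
# metrics_url HOST PORT PATH
# Builds the metrics URL, adding the leading slash to PATH if it is missing.
metrics_url() {
    local host="$1" port="$2" path="$3"
    case "$path" in
        /*) ;;
        *)  path="/$path" ;;
    esac
    echo "http://$host:$port$path"
}

# Example usage with the sample configuration values:
#   curl -fsS "$(metrics_url localhost 9080 /flow_metrics)" | head
```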