Configuring the EDB Postgres AI agent Innovation Release

This documentation covers the current Innovation Release of EDB Postgres AI. See also:

Hybrid Manager dual release strategy
Documentation for the current Long-term support release

Configure the Hybrid Manager (HM) agent (deployed as beacon-agent) to allow it to connect to HM and fetch monitoring data from the source database.

Info

You can also use the agent to assess your Oracle database for migration.

Prerequisites

You installed the agent.
You created a machine user and access key with the estate ingester role assigned to it. Store the access key in an environment variable named $BEACON_AGENT_ACCESS_KEY so the EDB Postgres AI agent can access it later.

Obtain Hybrid Manager connection parameters

The EDB Postgres AI agent requires specific configuration values that are set during the HM installation. You cannot access these yourself. Contact the administrator or installer of your HM instance to request the following values:

Beacon server hostname: When configuring the HM installation, administrators or installers must specify an internal endpoint for the beacon_server service. Ask the administrator or installer of your HM instance for the hostname (URL) they set for the beacon_server service in the Helm chart configuration file used for installation. They can look up the value they set for parameters.upm-beacon.server_host (or BEACON_SERVICE_DOMAIN_NAME) in the values.yaml file. You need this value later to configure the EDB Postgres AI agent.
Root certificate path (if required): The EDB Postgres AI agent requires a trusted connection to the beacon_server service in your HM instance. If the server running the agent doesn't inherently trust the HM's TLS certificate (managed by your organization's security infrastructure), ask the administrator or installer of your HM instance for the certificate file. Store this certificate on the machine running the EDB Postgres AI agent. You need to provide the path to this certificate later to configure the EDB Postgres AI agent.

See Administrative tasks for more information about the beacon_server and root certificate.

Define source database connection parameters

The DSN is the connection string the EDB Postgres AI agent uses to connect to your source database. Set the dsn parameter of each database node (clusters.nodes.dsn in the Parameter reference) using an environment variable such as $DSN (recommended) or a direct URL.

A DSN string for connections to databases must use this format:

DSN="<postgresql/oracle>://<user>:<password>@<host>:<port>/<database_name>"

Where:

<postgresql/oracle> is the type of database you are registering: postgresql for all types of Postgres databases (EDB Postgres, AWS RDS for PostgreSQL, and so on) or oracle for Oracle databases.
<user> and <password> correspond to the user you granted permissions to when preparing the database.
<host> is the hostname of the instance where your database is running.
<port> is the port number of the database you want to fetch data from.
<database_name> is the name of the database you want to fetch data from.
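
As a minimal sketch of how the components above fit together, the following shell snippet assembles a DSN from separate variables and exports it. The db_* variable names and their values are hypothetical and used only for illustration; only the DSN format itself comes from this page.

```shell
# Hypothetical component values; replace them with your own.
db_type="postgresql"      # or "oracle"
db_user="postgres"
db_password="password"    # URL-encode special characters before using them here
db_host="localhost"
db_port="5432"
db_name="postgres"

# Assemble the DSN in the documented order and export it for the agent.
export DSN="${db_type}://${db_user}:${db_password}@${db_host}:${db_port}/${db_name}"
```

Keeping the parts separate makes it easier to change a single component, such as the port, without retyping the whole string.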

Examples:

Postgres (SSL on)

In most production environments, SSL is enabled to ensure network traffic is encrypted.

To connect as user postgres authenticated with password password to a database postgres on port 5432 with SSL enabled, you would specify:

DSN="postgresql://postgres:password@localhost:5432/postgres"

Postgres (SSL off)

If your Postgres database is configured with SSL disabled, or if you require an unencrypted connection, append ?sslmode=disable to the DSN string.

For example, to connect as user postgres authenticated with password password to a database postgres on port 5432 with SSL disabled, you would specify:

DSN="postgresql://postgres:password@localhost:5432/postgres?sslmode=disable"

Special characters in passwords

If your Postgres or Oracle user's password contains special characters, use the URL-encoded version of the password in the DSN string (or password environment variable). For example, if your password is pa$$word, you must encode each $ character as %24:

DSN="postgresql://postgres:pa%24%24word@localhost:5432/postgres?sslmode=disable"
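
One way to produce the encoded form is a small shell helper. This is a minimal sketch, not part of the agent: urlencode is a hypothetical function name, and the sketch assumes the password contains only ASCII characters. It leaves letters, digits, and the characters - . _ ~ unchanged and percent-encodes everything else.

```shell
# Hypothetical helper: percent-encode an ASCII string for use inside a DSN.
urlencode() {
  local LC_ALL=C              # keep the a-z, A-Z, and 0-9 ranges ASCII-only
  local rest="$1" out="" c
  while [ -n "$rest" ]; do
    c="${rest%"${rest#?}"}"   # first character of what is left
    rest="${rest#?}"          # drop that character from the remainder
    case "$c" in
      [a-zA-Z0-9.~_-]) out="${out}${c}" ;;                 # unreserved: keep as is
      *) out="${out}$(printf '%%%02X' "'${c}")" ;;         # anything else: %XX
    esac
  done
  printf '%s\n' "$out"
}

# Single quotes stop the shell from expanding $$ before it reaches the helper.
urlencode 'pa$$word'
```

Pass the password in single quotes, as in the last line, so that the shell doesn't interpret characters such as $ itself.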

If you plan to connect to more than one database with this agent, you can specify multiple such variables, for example DSN1, DSN2. You can use any name you wish for these variables because they will be referenced by name in the configuration file.

Prepare a configuration file

Change to the OS user you previously created to perform the agent configuration.
Create a configuration directory in your home directory:
```
mkdir ${HOME}/.beacon
```
If this location isn't convenient, you can also use either:
- /etc/beacon
- The directory from which you execute any beacon-agent command.
The agent looks for its configuration file starting with /etc/beacon, then ${HOME}/.beacon. As a final fallback, it searches the directory from which it's executed.
Inside this directory, create a new file beacon_agent.yaml:
```
touch ${HOME}/.beacon/beacon_agent.yaml
```
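
The lookup order described above can be mirrored in a short shell function to see which file would be found first on your machine. This is only a sketch of the documented order (/etc/beacon, then ${HOME}/.beacon, then the current directory); find_beacon_config is a hypothetical helper, and the agent's actual lookup logic may differ in details.

```shell
# Hypothetical helper: print the first beacon_agent.yaml found in the
# documented search order, or return 1 if none of the locations has one.
find_beacon_config() {
  local dir
  for dir in /etc/beacon "${HOME}/.beacon" "${PWD}"; do
    if [ -f "${dir}/beacon_agent.yaml" ]; then
      printf '%s\n' "${dir}/beacon_agent.yaml"
      return 0
    fi
  done
  return 1
}

find_beacon_config || echo "no beacon_agent.yaml found" >&2
```

A check like this can help when a leftover file in /etc/beacon shadows the file you are editing in your home directory.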

Copy and paste the following template into the new file. This template enables all monitoring options and generates a migration assessment.

In this template, $BEACON_AGENT_ACCESS_KEY and $DSN are the environment variables you configured previously. Replace all < > placeholders in the template following the Parameter reference.

---
agent:
  access_key: $BEACON_AGENT_ACCESS_KEY
  beacon_server: <beacon_server>:9443
  project_id: <project_id>
  providers:
    - "onprem"
  settings_providers:
    - "onprem-settings"
  schema_providers:
    - "onprem-schema"
  root_ca_path: ""
general:
  logging:
    level: "info"
  metrics:
    push:
      enabled: true
      debug: true
      push_endpoint: <beacon_server>:9443
      root_ca_path: ""
    usage:
      company_code: <company_code>
      output:
        file:
          enabled: true
          path: beacon_usage.json
        http:
          enabled: false
          url: https://pg-usage.enterprisedb.com
provider:
  onprem:
    runner:
      enabled: true
    clusters:
      - resource_id: <cluster_resource_id>
        name: "<cluster_name>"
        manager: "other"
        nodes: 
          - resource_id: <node_resource_id>
            dsn: $DSN
            disable_host_association: false
            tags: 
              - "<tags>"
            settings:
              enabled: true
            metrics:
              disabled: false
              stats:
                enabled: true
                wait_states:
                  enabled: true
                  buckets_count: 5
                  bucket_duration: 1m
                  bucket_offset: 1s
                recommendations:
                  enabled: true
                  interval: 1m
                query_texts:
                  enabled: true
                  scraping_interval: 2m
                  scraping_offset: 1s
                  eviction_interval: 30m
                  cache_items_max: 200
                query_stats:
                  enabled: true
                  buckets_count: 2
                  bucket_duration: 1m0s
    host:
      metrics:
        disabled: false
      resource_id: <host_id>
dispatcher:
  enabled: true
  mode: standalone
  location_id: <location_id>

Save the file.
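
Before running the agent, it can be useful to confirm that no < > placeholders from the template are left in the file. The following is a minimal sketch: find_placeholders is a hypothetical helper, and the pattern assumes placeholders consist of lowercase letters and underscores, as they do in the template above.

```shell
# Hypothetical helper: list lines that still contain a <placeholder>.
# Exit status 0 means at least one placeholder was found, 1 means none.
find_placeholders() {
  grep -nE '<[a-z_]+>' "$1"
}

config="${HOME}/.beacon/beacon_agent.yaml"
if [ -f "$config" ] && find_placeholders "$config"; then
  echo "Replace the placeholders listed above before running the agent." >&2
fi
```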

You have now configured the EDB Postgres AI agent. You can proceed to running it or configuring it to run as a service.

Recommendations for setting environment variables

Although it's possible to set all parameters directly in the config file, we recommend handling sensitive data, such as $BEACON_AGENT_ACCESS_KEY or $DSN, as environment variables managed through a secrets manager. This strategy prevents sensitive credentials from being stored in plaintext configuration files. If you use environment variables, you must ensure they are available to the agent while it is running.

Consult your organization's IT department on the safest approach for production environments.

For short-lived migrations, or testing and demo purposes, you can use the export command in an active terminal session.
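
For such a terminal session, a minimal sketch might look like the following. The values shown are placeholders, and check_agent_env is a hypothetical helper (not an agent command) that reports which of the named variables are missing from the exported environment that a process started from this shell would inherit.

```shell
# Placeholder values for illustration only; substitute your own.
export BEACON_AGENT_ACCESS_KEY='<your_access_key>'
export DSN='postgresql://postgres:password@localhost:5432/postgres'

# Hypothetical helper: fail if any named variable is unset or empty in the
# exported environment.
check_agent_env() {
  local name missing=0
  for name in "$@"; do
    if [ -z "$(printenv "$name")" ]; then
      echo "environment variable ${name} is not set or is empty" >&2
      missing=1
    fi
  done
  return "$missing"
}

check_agent_env BEACON_AGENT_ACCESS_KEY DSN
```

Variables set this way last only for the current terminal session, and commands typed at the prompt may be recorded in your shell history, which is one reason this approach suits only short-lived or test setups.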

Test the agent locally

To perform an initial test of the agent, you can configure it to send the data that it normally sends to Hybrid Manager to your terminal (standard output) instead. This helps you quickly verify that the agent can collect data from the source database and lets you preview the gathered information.

You can run the agent in standard output mode by modifying the beacon_agent.yaml file and setting agent.beacon_server to "stdout":

agent:
    beacon_server: "stdout"

Next, run the agent in this mode:

beacon-agent

See Agent CLI for other agent modes and options.

The output is similar to this:

{"level":"debug","data":"$BEACON_AGENT_ACCESS_KEY","time":"2024-05-08T18:40:34Z","message":"expanding environment variable in configuration"}
{"level":"info","path":"/healthz","time":1715193634,"msg":"serving liveness probe"}
{"level":"info","path":"/readyz","time":1715193634,"msg":"serving readiness probe"}
{"level":"info","version":"v1.51.0-snapshot8986075626.97.1.166215e","time":1715193634,"msg":"starting beacon agent"}
{"level":"info","spiffe_enabled":false,"time":1715193634,"msg":"configuring tls"}
{"level":"info","server":"stdout","time":1715193634,"msg":"connecting to beacon service"}
{"level":"info","address":":8081","time":1715193634,"msg":"starting probe server"}
{"level":"info","target":"stdout","time":1715193634,"msg":"connected to beacon server"}
{"level":"info","time":1715193634,"msg":"verifying connection to beacon server"}
{"level":"info","project":"echo","time":1715193634,"msg":"verified connection to beacon server"}
{"level":"info","interval":"10m0s","time":1715193634,"msg":"loading feature flags periodically"}
{"level":"info","time":1715193634,"msg":"fetching feature flags"}
{"level":"info","feature_flags":{"echo_flag":"test","second_flag":false},"time":1715193634,"msg":"loaded feature flags"}
{"level":"info","id":"onprem","time":1715193634,"msg":"starting provider"}
{"level":"info","disable_partitioning":false,"batch_size":100,"interval":"10s","time":1715193634,"msg":"starting batch exporter"}
{"level":"info","provider_id":"onprem","time":1715193634,"msg":"registering ingestion worker in pool"}
{"level":"info","provider":"onprem","time":1715193634,"msg":"starting provider worker"}
{"level":"info","ingestions":[{"version":"v0.1.0","type":"onprem/host","id":"ip-10-0-128-121","metadata":{"Data":{"OnPremHostMetadata":{"hostname":"ip-10-0-128-121","operating_system":"linux","platform":"ubuntu","platform_family":"debian","platform_version":"22.04","cpu_limit":1}}}}],"time":1715193934,"msg":"sending ingestions via log client (not actually sending)"}
{"level":"info","successful_ingestions":1,"failed_ingestions":0,"time":1715193934,"msg":"exported ingestions"}

Note

The message in the second-to-last line of the log confirms that you're viewing the gathered data that's being output to stdout.
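
If you prefer not to scan the log by eye, a small filter can look for the final "exported ingestions" entry. This is a sketch under assumptions: saw_clean_export is a hypothetical helper, and the pattern relies on the field order shown in the sample output above, which may differ between agent versions.

```shell
# Hypothetical helper: read agent log lines on stdin and succeed once an
# "exported ingestions" entry reporting zero failed ingestions is seen.
saw_clean_export() {
  grep -q -E '"failed_ingestions":0,.*"msg":"exported ingestions"'
}

# Example use (assumes the agent writes its log to stdout or stderr):
#   beacon-agent 2>&1 | saw_clean_export && echo "export confirmed"
```

In the sample output, the first export is logged about five minutes after startup (timestamps 1715193634 and 1715193934), so a check like this may need to wait before it reports anything.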

Next, set the agent.beacon_server value back to the host URL:

agent:
    beacon_server: <beacon_server>:9443

Validate the configuration file

Next, validate that the agent can connect to Hybrid Manager with the specified credentials. The following command makes a test connection to Hybrid Manager and prints a summary of your configuration:

beacon-agent --config

See Agent CLI for other agent modes and options.

If the connection succeeds, a summary of the connection details is printed.
If the output shows an error, you must resolve it before proceeding.
If the Connectivity check fails with a message like failed to verify certificate x509..., this likely means that the system on which you're running the agent doesn't trust the certificate used to sign the server certificate. This is likely a symptom of an issue with root certificate distribution. Raise this issue with the team that manages your infrastructure.
If you have a copy of the root certificate, you can provide the path to the root certificate by setting agent.root_ca_path and general.metrics.push.root_ca_path. After setting these parameters, rerun beacon-agent --config to check whether the issue is resolved.
If the Connectivity check fails with a message like connection reset by peer, this generally means the agent can't reach the specified <beacon_server> because the URL is incorrect or there's a firewall preventing connection.
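
To separate network problems from configuration problems, you can test plain TCP reachability of the beacon server port from the agent machine. The following is a rough sketch, not an agent feature: can_reach is a hypothetical helper that assumes bash (for its /dev/tcp redirection) and the coreutils timeout command are available. It checks only that the port accepts a connection, not TLS trust or credentials.

```shell
# Hypothetical helper: succeed if a TCP connection to host:port can be
# opened within 5 seconds.
can_reach() {
  local host="$1" port="$2"
  timeout 5 bash -c 'exec 3<>"/dev/tcp/$1/$2"' _ "$host" "$port" 2>/dev/null
}

# Example use, with <beacon_server> replaced by your hostname:
#   can_reach <beacon_server> 9443 && echo "reachable" || echo "not reachable"
```

If the port is not reachable, the hostname or a firewall is the more likely cause. If it is reachable but beacon-agent --config still fails, look at the certificate and access key settings instead.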

The agent is now configured, and you can run the agent or configure the agent to run as a service. Once your agent is running, consider configuring usage reporting.

Parameter reference

These settings are relevant to the agent's configuration.

YAML agent settings	Placeholder	Guidelines
`agent.access_key`	`$BEACON_AGENT_ACCESS_KEY`	Access key you obtained from the HM console when you created a machine user with the `estate ingester` role. After creating the key, you can store it as the environment variable `$BEACON_AGENT_ACCESS_KEY` for usage with the EDB Postgres AI agent.
`agent.beacon_server`	`<beacon_server>:9443`	Host URL that allows the agent to connect to the correct service endpoint. Obtain the host URL of your beacon server from an administrator. The port is fixed at 9443. See Administrative tasks for more information.
`agent.project_id`	`<project_id>`	To obtain your project’s ID, go to the HM console, select Projects, and select your project. Your browser displays a URL like `https://portal.example.com/projects/prj_5Wuqyl5JtpvxjiYC/clusters`. The project ID is the identifier starting with “prj_” in the URL of that page. In this example: `prj_5Wuqyl5JtpvxjiYC`.
`agent.providers`	N/A	Set to `- "onprem"`.
`agent.settings_providers`	N/A	Set to `- "onprem-settings"` to enable recommendations for Postgres configurations.
`agent.schema_providers`	N/A	Set to `- "onprem-schema"` to enable schema ingestion and migration assessment capabilities. Remove this section if you don't want to allow schema information to be collected.
`agent.root_ca_path`	N/A	Set to `""` if your organization has configured the machine where EDB Postgres AI agent runs to trust HM's TLS certificate. Otherwise, obtain the root certificate from an administrator and store it locally. Then, set this value to the path where you stored it. See Administrative tasks for more information.
`general.logging.level`	N/A	Set to `"info"` or `"trace"` to obtain logging information. Start with `"info"` and only use `"trace"` if required, as it is extremely verbose.
`general.metrics.push.[...]`	N/A	Enables pushing metrics to the HM console. The `push_endpoint` is the same as `agent.beacon_server`. The `root_ca_path` is the same as `agent.root_ca_path`.
`general.metrics.usage.[...]`	N/A	Enables storing usage data (and reporting usage data, if enabled). Your `company_code` can be found on the EDB Support Portal.
`provider.onprem.check_databases_DSN`	N/A	When enabled, the agent verifies all DSN connections at runtime and reports any errors to the terminal, identifying the source of each error by its resource ID. The check is performed when the EDB Postgres AI agent runs without any CLI subcommands/options, or when it runs with the `--config` subcommand. The default is false.
`dispatcher.location_id`	`<location_id>`	A unique identifier used by the Hybrid Manager for this agent/location. Consider setting this to the same value as `provider.onprem.host.resource_id` for convenience.

These settings are relevant to the database configuration.

YAML database settings	Placeholder	Guidelines
`clusters.resource_id`	`<cluster_resource_id>`	Assign a unique identifier to your database cluster. Encase in double quotes. You must begin identifiers with an alphanumeric character (`A-Z, a-z, 0-9`); in between you may also include hyphens (`-`), underscores (`_`) and periods (`.`). You must not use any other characters. Example: `An_I.D.-1`
`clusters.name`	`"<cluster_name>"`	Assign a unique name to your database cluster.
`clusters.manager`	`"other"`	Collects statistics for cluster manager. Possible values are `"patroni"`, `"repmgr"`, `"efm"`, `"other"`. For self-managed instances, set to `"other"`.
`clusters.nodes.resource_id`	`<node_resource_id>`	Assign a unique identifier to your database node. Single-node instances require a single entry here. Multi-node instances require an entry per node. Encase in double quotes. Reuse the same id when configuring the DMS agent for migrations. This consistency is essential for the DMS to correctly correlate the resource across both the EDB Postgres AI agent and DMS agent components. You must begin identifiers with an alphanumeric character (`A-Z, a-z, 0-9`); in between you may also include hyphens (`-`), underscores (`_`) and periods (`.`). You must not use any other characters. Example: `An_I.D.-1`
`clusters.nodes.dsn`	N/A	Connection string or DSN that provides access to your remote database. The DSN must follow the format: `<database_type>://<user>:<password>@<host>:<port>/<database_name>`. The `database_type` can be postgresql or oracle.
`clusters.nodes.disable_host_association`	N/A	Set to `true` to prevent this database being associated with the host on which the agent is running. For example, if the agent is running on a different server.
`clusters.nodes.tags`	N/A	Assign tags for resource labeling. Encase in double quotes and introduce each tag with a dash.
`clusters.nodes.settings.enabled`	N/A	Set to `true` to enable collecting database configuration data.
`clusters.nodes.metrics.[...]`	N/A	Controls which types of metrics the agent ingests, and how they are ingested.
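
The identifier rules for `clusters.resource_id` and `clusters.nodes.resource_id` can be checked before you edit the file. The following is a minimal sketch: is_valid_resource_id is a hypothetical helper, and it reads "in between" strictly, so it also requires the last character to be alphanumeric. That reading is an assumption; the agent itself may be more lenient.

```shell
# Hypothetical helper: succeed if the argument starts and ends with an
# alphanumeric character and otherwise contains only letters, digits,
# hyphens, underscores, and periods.
is_valid_resource_id() {
  printf '%s\n' "$1" |
    LC_ALL=C grep -Eq '^[A-Za-z0-9]([A-Za-z0-9._-]*[A-Za-z0-9])?$'
}

if is_valid_resource_id 'An_I.D.-1'; then
  echo "An_I.D.-1 is a valid identifier"
fi
```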

These settings are relevant to host data.

YAML host settings	Placeholder	Guidelines
`host.metrics.disabled`	N/A	Set to `true` to disable the collection of host metrics from the host on which the agent is located. For example, when the agent host is not the same as the database cluster host.
`host.resource_id`	`<host_id>`	Assign an identifier to the host on which the agent is running. By default, this is populated with the hostname detected during setup. This links your host machine to an identifier. Encase in double quotes. Must be unique within the project.

Settings related to schema ingestion:

YAML database schema settings	Placeholder/Value	Guidelines
`provider.onprem.schema_export_max_workers`	N/A	Maximum number of workers the EDB Postgres AI agent starts in parallel to extract schema from the source database server(s). You can control the agent's CPU, memory, and disk space usage with this parameter. See Performing multiple concurrent schema ingestions for more information. The default is 10.
`clusters.nodes.schema.enabled`	N/A	Set to `true` to enable schema ingestion for the database. Set to `false` to disable it.
`clusters.nodes.schema.poll_interval`	N/A	Defines how frequently the agent checks for schema changes. Provide a value in seconds or minutes (for example, 15 s or 1 m).
`clusters.nodes.schema.filter`	N/A	Use this block to specify which schemas to include or exclude from ingestion. It contains the `mode` and `names` parameters.
`clusters.nodes.schema.filter.mode`	`include` or `exclude`	Specifies the filtering behavior. Use `include` to only ingest listed schemas, or `exclude` to ingest all schemas except those listed.
`clusters.nodes.schema.filter.names`	`<schema_name>`	A list of strings representing the names of the schemas to be filtered. This list works in conjunction with the mode setting.

Fine-tuning monitoring data collection ► Learn how to customize the monitoring features and enable/disable a subset of them.
Configuring multiple databases ► Learn how to use the same agent config file to connect and fetch data from more than one database.
Monitoring EDB Failover Manager Cluster ► Learn how to use the same agent config file to monitor the EDB Failover Manager (EFM) Cluster.
Enabling usage reporting in Hybrid Manager ► Learn how to complement monitoring efforts by providing EDB with basic insights into product usage.
Administrative tasks ► Learn how an HM instance's administrator can obtain the values for the root certificate and beacon_server.
Use the agent to assess your Oracle database for migration ► Use the EDB Postgres AI agent to assess Oracle databases and extract schema for conversion with the Migration Portal.
Perform multiple concurrent schema ingestions ► Adjust the EDB Postgres AI agent configuration to tune EDB Postgres AI agent's resource utilization and support the ingestion of a large number of databases.
