Controlling schema ingestion
The beacon-agent.yaml monitoring template configures the Agent to ingest all schemas that the migration user has permission to access and that contain at least one table.
You can add further configurations to control schema ingestion.
Defining a subset of schemas for ingestion
If you don't want the Agent to ingest all schemas, you can define a subset with the databases.schema.filter option.
Using the schema filter
To control which schemas the Agent ingests from your database, add the filter parameter within the databases.schema configuration block in your beacon-agent.yaml. This allows you to specify a set of schemas to either include or exclude.
The filter parameter requires two settings:
- `mode`: Defines the filtering behavior. It must be set to either `include` or `exclude`. When set to `include`, the Agent only ingests schemas listed in the `names` list. When set to `exclude`, the Agent ingests all schemas except those listed in the `names` list.
- `names`: A list of schema names to apply the filter to. Because the names are case-sensitive, use the same casing as the schemas stored in the source database.
Note
The Agent has a default list of system schemas for both Oracle and Postgres that are automatically excluded from ingestion because they are irrelevant for the database migration. When an exclude filter is used, this default list is combined with any additional schemas you specify. For the include filter, the Agent will ingest any schemas you explicitly name, including system schemas that would have otherwise been excluded by default.
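The interaction between the default system-schema list and the two filter modes can be sketched as follows. This is a hypothetical illustration of the behavior described above, not the Agent's source code; the function name `filter_schemas` and the abbreviated default list are assumptions for the example.

```python
# Hypothetical sketch of the filtering semantics described above
# (not Agent code). Abbreviated default exclusion list for Postgres.
DEFAULT_EXCLUDED = {"pg_catalog", "information_schema", "pg_toast"}

def filter_schemas(available, mode, names):
    # Comparisons are case-sensitive, matching the Agent's behavior.
    names = set(names)
    if mode == "include":
        # include mode ingests exactly the named schemas,
        # even system schemas that the defaults would exclude.
        return [s for s in available if s in names]
    if mode == "exclude":
        # exclude mode combines the default list with the user's names.
        excluded = DEFAULT_EXCLUDED | names
        return [s for s in available if s not in excluded]
    raise ValueError("mode must be 'include' or 'exclude'")

available = ["public", "sales", "pg_catalog", "audit"]
print(filter_schemas(available, "exclude", ["audit"]))           # ['public', 'sales']
print(filter_schemas(available, "include", ["sales", "pg_catalog"]))  # ['sales', 'pg_catalog']
```

Note how `include` returns `pg_catalog` when explicitly named, while `exclude` drops it even though only `audit` was specified.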
Include example
```yaml
[...]
provider:
  onprem:
    clusters:
      - resource_id: "<cluster_resource_id>"
        [...]
        nodes:
          - resource_id: "<node_resource_id>"
            dsn: $DSN
            schema:
              enabled: true
              poll_interval: 15s
              filter:
                mode: "include"
                names:
                  - "<schema_name1>"
                  - "<schema_name2>"
            tags:
              - "<tag_names>"
```
Exclude example
```yaml
[...]
provider:
  onprem:
    clusters:
      - resource_id: "<cluster_resource_id>"
        [...]
        nodes:
          - resource_id: "<node_resource_id>"
            dsn: $DSN
            schema:
              enabled: true
              poll_interval: 15s
              filter:
                mode: "exclude"
                names:
                  - "<schema_name1>"
                  - "<schema_name2>"
            tags:
              - "<tag_names>"
```
Excluded system schemas
The Agent excludes the following system schemas by default.
Postgres and EDB Postgres Extended Server
- information_schema
- pg_catalog
- pg_toast
EDB Postgres Advanced Server
- sys
Oracle
- ANONYMOUS
- APEX_PUBLIC_USER
- APEX_030200
- APEX_040000
- APEX_040200
- APPQOSSYS
- AUDSYS
- CTXSYS
- DMSYS
- DBSNMP
- DBSFWUSER
- DEMO
- DIP
- DVF
- DVSYS
- EXFSYS
- FLOWS_FILES
- FLOWS_020100
- FRANCK
- GGSYS
- GSMADMIN_INTERNAL
- GSMCATUSER
- GSMROOTUSER
- GSMUSER
- LBACSYS
- MDDATA
- MDSYS
- MGMT_VIEW
- OJVMSYS
- OLAPSYS
- ORDPLUGINS
- ORDSYS
- ORDDATA
- OUTLN
- ORACLE_OCM
- OWBSYS
- OWBSYS_AUDIT
- PDBADMIN
- RMAN
- REMOTE_SCHEDULER_AGENT
- SI_INFORMTN_SCHEMA
- SPATIAL_CSW_ADMIN_USR
- SPATIAL_WFS_ADMIN_USR
- SQLTXADMIN
- SQLTXPLAIN
- SYS
- SYSBACKUP
- SYSDG
- SYSKM
- SYSRAC
- SYSTEM
- SYSMAN
- TSMSYS
- WKPROXY
- WKSYS
- WK_TEST
- WMSYS
- XDB
- XS$NULL
Performing multiple concurrent schema ingestions
When you configure the EDB Postgres AI agent to ingest schemas from several Oracle databases, you can use the schema_export_max_workers parameter of the beacon-agent.yaml configuration file to control how many schema export jobs the EDB Postgres AI agent runs in parallel.
The schema_export_max_workers parameter sets the maximum number of concurrent schema exports the EDB Postgres AI agent runs across all configured Oracle databases. Because active workers consume CPU, memory, and some disk space on the auxiliary machine that runs the EDB Postgres AI agent, you can use this parameter to tune resource consumption and to speed up or slow down the bulk schema ingestion.
The schema_export_max_workers parameter is in the provider.onprem section of beacon-agent.yaml:
```yaml
...
# other beacon-agent.yaml configurations
provider:
  onprem:
    runner:
      enabled: true
    schema_export_max_workers: 10
...
```
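The effect of this parameter is that of a bounded worker pool: exports beyond the limit queue until a worker frees up. A minimal sketch of that pattern, assuming hypothetical database names and a placeholder export function (this is not the Agent's implementation):

```python
from concurrent.futures import ThreadPoolExecutor

# Mirrors schema_export_max_workers: at most 10 exports run at once,
# regardless of how many databases are configured.
SCHEMA_EXPORT_MAX_WORKERS = 10

def export_schema(db: str) -> str:
    # Placeholder for a real schema export job.
    return f"exported {db}"

# 25 hypothetical databases; only 10 exports run concurrently,
# the rest wait in the pool's queue.
databases = [f"oracle_db_{i}" for i in range(25)]
with ThreadPoolExecutor(max_workers=SCHEMA_EXPORT_MAX_WORKERS) as pool:
    results = list(pool.map(export_schema, databases))
```

Raising the limit increases parallelism (and peak resource use on the host); lowering it serializes more of the work.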
Using the max workers parameter
Start with the default schema_export_max_workers: 10 when configuring initial schema ingestion. Then, monitor the auxiliary machine running the EDB Postgres AI agent (CPU, memory, disk I/O, network).
If the auxiliary machine is underutilized and you need faster ingestion for multiple databases, or if you notice that the databases are taking too long to appear in the Estate > Migrations tab, increase the value and observe the impact.
If resource usage is consistently high or causing contention (for example, you notice spikes in CPU, memory, and disk usage), decrease the value and confirm that the system stabilizes. This slows down total ingestion time but reduces the peak resource usage on the EDB Postgres AI agent host.
After each change, restart the Agent if it is not running as a service with automatic reload.