Planning your architecture v1.3.4

Overview

Role: CTO / Architect / Lead Engineer

Prerequisites

  • The business goals your architecture needs to enable (examples: budget constraints, desired uptime, desired latency)

Outcomes

  • An Architectural Decision Record (ADR) defining the topology, locality, and redundancy model of your Hybrid Manager (HM) architecture (at minimum, architecture diagrams with notes).

  • Initial inputs for the HM Helm chart configuration file: values.yaml.

Note

You, as the customer, ultimately own your deployment architecture. EDB's Sales Engineering, Professional Services, and Support teams, as well as this documentation, can be consulted, but the final architectural decisions rest with your team.

Next phase: Phase 2: Gathering your system requirements

Architectural discovery

The goal of architectural discovery is to navigate and then document the necessary decisions to successfully deploy Hybrid Manager (HM). These decisions form the blueprint for meeting Infrastructure Requirements (Phase 2) and Preparing the environment (Phase 3).

The accompanying questions cover a broad set of considerations extending beyond just the database layer. This guide should be viewed from two perspectives:

  • Current state: Where your existing database and application workloads are today.

  • Target state: Where you intend to deploy HM immediately, and where you plan to expand over the next 1–2 years.

Recommendation: Acquiring and reviewing diagrams of your current and target state is the most efficient way to complete this phase.

Locality: Where will HM live?

Understanding the physical or logical locations of your database and dependent applications is crucial for determining the necessary architecture.

  • Questions to answer:

    • Where is the current database solution located, in terms of cloud service provider (CSP) regions or physical data centers (on-premises)?

    • Where are the dependent application workloads for these databases located?

    • Are there upstream layers of dependency, and where are those located?

  • Analysis:

    • Locality determines the initial scope of the deployment (e.g., single cloud region vs. multi-region).
    • If you plan to span multiple regions, clouds, or hybrid cloud environments, Postgres Distributed is likely the appropriate database service recommendation.
    • The locality of upstream applications is key to minimizing network latency.

Disaster recovery (hot/cold)

Disaster recovery (DR) ensures business continuity across different locations.

  • Questions to answer:

    • How is disaster recovery (as a subset of business continuity) accomplished across these locations?
    • Is there an additional location assigned specifically as DR?
    • How is DR capability validated, and how often?
  • Analysis:

    • A dedicated secondary location signals a hard architectural requirement that your HM topology must preserve.
    • If no formal DR practice exists, the HM DBaaS far-away replica solution may provide new capabilities.

Activeness (Active/Passive vs. Active/Active)

Activeness describes how your distributed locations are utilized for critical workloads.

  • Questions to answer:

    • If you have multiple locations, how does the critical dependent workload utilize these systems?
    • Is one location active and the other passive for transaction processing (OLTP)?
    • Is one location active for OLTP, and the other active for analytical processing (OLAP/BI)?
  • Analysis:

    • If your target state requires simultaneous writes to multiple database instances (i.e., true active/active across locations), Postgres Distributed is the required solution due to its multi-writer capability.
    • Understanding whether a location is passively waiting (cold standby) or actively running (hot standby) helps define resource requirements and recovery time objectives (RTO).
    • Business continuity: The architectural choices around active/passive, active/active, and standby models must balance the organization's tolerance for downtime/data loss against the cost of maintaining redundant systems.

These topics naturally follow the discussion of Activeness and help complete the picture of your application ecosystem.

  • Ingress traffic routing from the consuming application's perspective.
  • Replication at various application layers.
  • Caching layers (and their location relative to the database).
  • Session demands (e.g., is session replication handled at the application layer?).

Lifecycle operations

Understanding your operations practices helps determine the complexity of the Kubernetes environment required to manage the database service.

  • Questions to answer:

    • Do you utilize lifecycle operations patterns such as Blue/Green or Canary?
    • How do you handle DML/DDL updates (data and schema) vs. engine upgrades (major versions)?
    • What pre-production environments (staging, development, testing) are required?
  • Analysis:

    • Practices like Blue/Green deployment align well with the zero-downtime features offered by EDB's database solutions.
    • The number of pre-production environments directly influences the total cluster count and resource sizing defined in System requirements.

Supported platforms

HM and Kubernetes have a 1:1 relationship: each HM deployment requires its own dedicated Kubernetes cluster. In the current version, the cluster must be dedicated to HM; sharing it with other workloads is not supported. The supported platforms are:

  • Amazon EKS (Elastic Kubernetes Service)
  • Google GKE (Google Kubernetes Engine)
  • Rancher RKE2 (Rancher Kubernetes Engine)
  • Red Hat OpenShift (RHOS)

Note

The customer is responsible for the full lifecycle management of the Kubernetes cluster (provisioning, upgrades, scaling).

HM distributed reference architecture

This reference architecture represents the target end state for achieving the highest levels of SLA and scale. It typically spans multiple data centers.

HM reference architecture

Diagram legend reference

The legend defines the colors and logical groupings used in the architecture diagram:

  • Locality: The highest-level physical or logical grouping, such as a physical data center or a geographical region (e.g., "City 1" and "Data Center 1").
  • Kubernetes Cluster: The complete Kubernetes environment—including all CP and worker nodes—hosting the entire platform.
  • EDB HM: The logical boundary for the core HM components. This is typically implemented as a dedicated Kubernetes namespace (e.g., control-plane).
  • Compute Machine: The virtual machines (e.g., vm01, vm02, vm03) that serve as the Kubernetes worker nodes, providing the CPU, memory, and storage for the cluster.
  • Infrastructure Abstraction: This critical layer represents Kubernetes-native resources that abstract underlying physical or virtual infrastructure. These resources must be provided by the Kubernetes cluster's environment.
    • Example 1: type: LoadBalancer: This is a Kubernetes Service type that requests an external load balancer. In public cloud environments (like AWS, GCP, Azure), this is automatically provisioned as a managed service. In on-premises or bare-metal deployments, you must provide a solution (like MetalLB) to fulfill these LoadBalancer requests.
    • Example 2: StorageClass: This resource abstracts the "Block Storage" and "Object Storage" requirements. It maps Kubernetes storage requests (Persistent Volume Claims) to actual, provisioned storage hardware or software (like local-pv, Ceph, vSphere, or cloud-based disks). A minimal manifest sketch of both examples follows this legend.
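
As a minimal sketch of these two abstractions, the manifests below show a Service requesting an external load balancer and a StorageClass backed by the AWS EBS CSI driver. The names (hm-ingress, fast-block) and the selector label are hypothetical placeholders, and the provisioner is only one possibility; substitute whatever your environment actually provides.

# Example 1: a Service of type LoadBalancer. The cluster environment
# (a cloud controller manager, or MetalLB on bare metal) must fulfill
# this request with a real external endpoint.
apiVersion: v1
kind: Service
metadata:
  name: hm-ingress          # hypothetical name
spec:
  type: LoadBalancer
  selector:
    app: hm-proxy           # hypothetical label
  ports:
    - port: 443
      targetPort: 8443
---
# Example 2: a StorageClass mapping PVCs to real block storage.
# The provisioner shown is the AWS EBS CSI driver; on-premises
# clusters would point at Ceph, vSphere CSI, or similar.
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: fast-block          # hypothetical name
provisioner: ebs.csi.aws.com
parameters:
  type: gp3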

Deployment architectures

Use these reference models to decide which topology matches your "Target State."

Note

The legend above also applies to reference architectures A-D below.

A. Minimum Control Plane

The minimum install colocates the HM control plane (CP) on the Kubernetes control plane nodes.

This is fully functional for:

  • Centralizing a view of your Postgres/Oracle Estate.
  • Database migration capabilities.
  • GenAI (limited capabilities due to the lack of managed Postgres instances).

HM minimum

Internal architecture: HM Control Plane

HM is composed of several core microservices running within the Kubernetes cluster. Understanding these components is helpful for planning resource allocation and security boundaries.

Architectural dependencies

The architecture diagrams above reference several external components. While you verify the specific hardware/software requirements for these in Phase 2: Gathering your system requirements, you must account for their connectivity in your architectural design.

  • Identity provider (IdP): Required for user authentication. The architecture relies on your IdP (OIDC, LDAP, or SAML) for all human access.
  • Key Management Service (KMS): (Optional) Required only if your security policy demands Transparent Data Encryption (TDE).
  • Object Storage: Required for system resilience. It hosts backups, logs, and facilitates data replication for Multi-Location topologies.
  • Block Storage: Required for database performance. Your storage architecture must satisfy PersistentVolumeClaims (PVCs) for the Postgres data layer.
  • Local network: The fabric connecting the control plane (CP) to the data plane (DP). Latency here drives your Locality decisions.
  • Container Registry: The source of truth for application images. For air-gapped designs, this represents your local synchronized registry.

B. HM Data Plane (Postgres lifecycle orchestration)

Sitting alongside the HM CP is the HM Data Plane (DP). This is where your actual database workloads reside.

  • Postgres clusters: The actual database instances (Primary and Standbys).

  • Extensions: PostGIS, PGVector, and other database extensions.

  • Backup agents: Local tools (like Barman) managing WAL archiving to your Object Storage (see the illustrative manifest after the diagram below).

HM Data Plane
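
HM provisions and manages these workloads for you, but as a rough illustration of how the Data Plane concepts map to Kubernetes objects (instance count, block storage via a StorageClass, Barman-based WAL archiving to object storage), a Cluster resource in the style of EDB Postgres for Kubernetes might look like the sketch below. The names and bucket are hypothetical; this is not the exact manifest HM generates.

# Illustrative only: an EDB Postgres for Kubernetes-style Cluster
# resource expressing the Data Plane components described above.
apiVersion: postgresql.k8s.enterprisedb.io/v1
kind: Cluster
metadata:
  name: app-db                 # hypothetical cluster name
spec:
  instances: 3                 # one primary plus two standbys
  storage:
    size: 100Gi
    storageClass: fast-block   # block storage (PVCs) sized in Phase 2
  backup:
    barmanObjectStore:         # Barman agent archiving WAL to object storage
      destinationPath: s3://example-bucket/backups   # hypothetical bucket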

C. Fully featured HM

This view shows a fully capable HM deployment, including resources such as GPU acceleration for AI workloads.

HM fully featured

D. Multi-Location (Hub and spoke)

The multi-location capability is a DBaaS offering that follows a hub-and-spoke model.

  1. As a DBaaS offering, secondary HMs have a reduced capability set compared to the Primary.
  2. The Primary HM controls the Secondary.
  3. Connectivity is established via load-balanced endpoints, not a network mesh service (like Submariner).

HM multi-location

Impact on configuration

The decisions made during this discovery process directly determine the root parameters of your installation configuration.

While you do not need to create the file yet, your Architecture Decision Record should specify the values for these keys. The SRE/Admin uses these specs to build the values.yaml file in Phase 3: Preparing the Environment.

Configuration details

| Architecture decision | Config parameter (values.yaml) | Example value |
| --- | --- | --- |
| Kubernetes platform | system | eks, gke, rhos |
| Target location | parameters.upm-beacon.beacon_location_id | aws-us-east-1 |
| Provisioning mode | beaconAgent.provisioning.provider | aws or gcp |

Impact on configuration file

Here is how your decisions map to the structure of the HM Helm chart's values.yaml file, which you create in Phase 3:

system: <Kubernetes_Flavor> # e.g., rhos, rke2, eks, gke
bootstrapImageName: docker.enterprisedb.com/pgai-platform/edbpgai-bootstrap/bootstrap-<Kubernetes_Flavor>
bootstrapImageTag: <Version>
parameters:
  upm-beacon:
    beacon_location_id: <Deployment_Location_Name> # Identified in Phase 1: a simple string shown as a hint in the UI to identify this location.
beaconAgent:
  provisioning:
    provider: <Provider_Name> # aws or gcp
    openshift: <Boolean_Value> # Defaults to `false`; set to `true` if deploying on RHOS
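
For illustration, here is the same template filled in for a hypothetical EKS deployment in aws-us-east-1, using the example values from the table above (the image tag remains a placeholder for your HM release):

system: eks
bootstrapImageName: docker.enterprisedb.com/pgai-platform/edbpgai-bootstrap/bootstrap-eks
bootstrapImageTag: <Version>    # set to your HM release
parameters:
  upm-beacon:
    beacon_location_id: aws-us-east-1   # hint shown in the UI for this location
beaconAgent:
  provisioning:
    provider: aws
    openshift: false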

Next phase

Your architecture is defined and ideally recorded in an ADR for reference.

Proceed to Phase 2: Gathering your system requirements to verify that your infrastructure can meet the designs in your ADR. →