EDB Docs - EDB Postgres AI v1.4.1 (LTS) - Agent Factory Architecture on Hybrid Manager

Architectural Overview

Agent Factory deploys as a collection of containerized services within Hybrid Manager's Kubernetes infrastructure, delivering sovereign AI capabilities through integrated model governance, inference serving, and Langflow-based AI flow development. The architecture ensures complete data sovereignty by processing all AI workloads within customer-controlled Kubernetes clusters, leveraging local GPU resources and object storage.

The system operates across three architectural layers: a control plane for governance and orchestration, a runtime layer for model serving and flow execution, and a storage layer for model artifacts and knowledge bases. These layers integrate through Kubernetes APIs and custom resources, providing unified management while maintaining isolation between projects and workloads.

Core Components

Model Library

The Model Library operates as a control plane service managing model lifecycle and governance across the platform. This service maintains a centralized registry of approved models while enforcing security and compliance policies before models reach production environments.

The library consists of several interconnected services:

Registry synchronization service that monitors external container registries
Policy engine evaluating models against organizational governance rules
Metadata service tracking model versions, performance benchmarks, and approvals
Storage interface managing model artifacts in object storage backends

Model metadata persists in PostgreSQL databases managed by Hybrid Manager, ensuring consistency with other platform data. The library exposes models to project namespaces through Kubernetes custom resources, enabling declarative model deployment while maintaining centralized governance. See also: Model Library explained.
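As a rough illustration of the policy engine's role, the sketch below checks one model record against a set of governance rules and reports violations. The class names, fields, and rules (`ModelRecord`, `Policy`, allowed registries and licenses, approval counts) are assumptions made for this example and do not describe the actual Model Library schema.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class ModelRecord:
    """Hypothetical metadata tracked for one model version."""
    name: str
    version: str
    source_registry: str
    license: str
    approved_by: tuple = ()


@dataclass
class Policy:
    """Hypothetical governance rules; field names are illustrative only."""
    allowed_registries: set = field(default_factory=set)
    allowed_licenses: set = field(default_factory=set)
    required_approvals: int = 1


def evaluate(model: ModelRecord, policy: Policy) -> list:
    """Return a list of violations; an empty list means the model passes."""
    violations = []
    if model.source_registry not in policy.allowed_registries:
        violations.append(f"registry not allowed: {model.source_registry}")
    if model.license not in policy.allowed_licenses:
        violations.append(f"license not allowed: {model.license}")
    if len(model.approved_by) < policy.required_approvals:
        violations.append(
            f"needs {policy.required_approvals} approval(s), "
            f"has {len(model.approved_by)}"
        )
    return violations
```

Returning every violation, rather than stopping at the first, lets a reviewer see everything that blocks a model in one pass.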

Inference Server Infrastructure

Inference servers deploy as KServe InferenceServices within project namespaces, providing scalable model serving through specialized container pods. These pods encapsulate model runtime engines optimized for different frameworks and hardware configurations.

Inference pod configurations include:

Model runtime containers
Resource specifications defining GPU allocation, memory limits, and CPU requirements (see Setup GPU and Update GPU resources)
Volume mounts connecting to model storage and configuration data
Environment variables containing endpoint configurations and runtime parameters
Health check definitions for liveness and readiness probes
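To make the configuration items above more concrete, the following sketch assembles a minimal KServe InferenceService manifest as a Python dictionary. The field layout follows the KServe `serving.kserve.io/v1beta1` schema as commonly documented. The function name, default resource values, and the environment variable handling are assumptions for illustration, and volume mounts and probe definitions are omitted.

```python
import json


def inference_service(name, namespace, storage_uri, model_format,
                      gpus=1, memory="16Gi", cpu="4", env=None):
    """Build an InferenceService manifest (illustrative defaults only)."""
    # Requests equal limits here so the pod gets a guaranteed allocation;
    # this is a choice made for the example, not a platform requirement.
    resources = {"cpu": cpu, "memory": memory, "nvidia.com/gpu": str(gpus)}
    model = {
        "modelFormat": {"name": model_format},
        "storageUri": storage_uri,
        "resources": {"requests": dict(resources), "limits": dict(resources)},
    }
    if env:
        # Runtime parameters passed to the model container as env vars.
        model["env"] = [{"name": k, "value": str(v)} for k, v in env.items()]
    return {
        "apiVersion": "serving.kserve.io/v1beta1",
        "kind": "InferenceService",
        "metadata": {"name": name, "namespace": namespace},
        "spec": {"predictor": {"model": model}},
    }


if __name__ == "__main__":
    # All names and the storage URI below are hypothetical placeholders.
    manifest = inference_service(
        "demo-model", "project-a", "s3://models/demo/v1", "pytorch",
        gpus=1, env={"LOG_LEVEL": "info"},
    )
    print(json.dumps(manifest, indent=2))
```

The printed JSON could be applied with standard Kubernetes tooling, although the deployment workflow that Hybrid Manager expects is described in the linked deployment guides, not here.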

Autoscaling configurations respond to metrics including request latency, GPU utilization, and queue depth, ensuring optimal resource utilization while meeting performance targets. For deployment options, see Model deployment and Configure ServingRuntime.
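One way to picture a multi-metric scaling decision is the proportional rule used by the Kubernetes Horizontal Pod Autoscaler: for each metric, scale the replica count by the ratio of the observed value to the target value, then take the largest proposal. The sketch below applies that rule. It is an illustration borrowed from the HPA, not a statement of which autoscaler or thresholds Agent Factory uses, and the metric names and bounds are hypothetical.

```python
import math


def desired_replicas(current_replicas, metrics, min_replicas=1, max_replicas=10):
    """Pick a replica count from several metrics.

    metrics maps a metric name to a (current_value, target_value) pair.
    Each metric proposes ceil(current_replicas * current / target); the
    largest proposal wins and is clamped to [min_replicas, max_replicas].
    """
    proposals = [
        math.ceil(current_replicas * current / target)
        for current, target in metrics.values()
    ]
    wanted = max(proposals) if proposals else current_replicas
    return max(min_replicas, min(max_replicas, wanted))


if __name__ == "__main__":
    observed = {
        "gpu_utilization_pct": (90, 60),   # above target: proposes scale up
        "request_latency_ms": (200, 400),  # below target: proposes scale down
    }
    print(desired_replicas(2, observed))
```

Taking the maximum across metrics means any single saturated signal is enough to trigger a scale-up, which favors meeting performance targets over minimizing replica count.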

Langflow Runtime

Langflow runs as a managed workload in Hybrid Manager, providing a visual flow builder and a deployment lifecycle for turning flows into callable services. The runtime is containerized, and deployed flows are hosted in isolated Kubernetes namespaces.

The Langflow architecture within Hybrid Manager (HM) includes:

A shared Langflow editor environment where flows are built and tested
Per-deployment runtime containers that host published flows as long-running services
EDB components (EDB Model Server, EDB Embeddings, EDB Knowledge Base, and others) that wire flows to HM-managed resources
State and credentials managed as Kubernetes secrets, scoped to each deployment's namespace

Flows access model endpoints through cluster-local service DNS, and all traffic between the Langflow runtime, model server pods, and Postgres clusters stays within the project namespace. See Langflow for the full component and deployment reference.
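Cluster-local service DNS in Kubernetes conventionally takes the form `<service>.<namespace>.svc.<cluster-domain>`. The sketch below builds such a URL and checks whether a given URL resolves to a service in the expected namespace. The service name, namespace, port, and path are hypothetical, and `cluster.local` is assumed as the cluster domain because it is the usual default.

```python
from urllib.parse import urlparse


def service_url(service, namespace, port=80, path="/",
                cluster_domain="cluster.local"):
    """Build an in-cluster HTTP URL for a Kubernetes Service."""
    path = "/" + path.lstrip("/")
    return f"http://{service}.{namespace}.svc.{cluster_domain}:{port}{path}"


def stays_in_namespace(url, namespace, cluster_domain="cluster.local"):
    """True if the URL's host is <service>.<namespace>.svc.<cluster_domain>."""
    host = urlparse(url).hostname or ""
    suffix = f".svc.{cluster_domain}"
    if not host.endswith(suffix):
        return False
    labels = host[: -len(suffix)].split(".")
    return len(labels) == 2 and labels[1] == namespace


if __name__ == "__main__":
    url = service_url("model-server", "project-a", 8080, "v1/chat")
    print(url, stays_in_namespace(url, "project-a"))
```

A check like `stays_in_namespace` only inspects the hostname. Actual traffic isolation would come from cluster networking controls such as network policies, not from URL validation.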

Storage

Agent Factory uses object storage for model artifacts, datasets, and knowledge bases, with MinIO or cloud provider services (S3, Azure Blob, GCS) as primary storage backends. This separates compute from storage, enabling independent scaling and cost optimization.

Storage access occurs through standardized S3 APIs with authentication via service account credentials or cloud provider identity mechanisms. Persistent volume claims provide local caching for frequently accessed models, reducing network overhead and improving inference latency.
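The caching behavior can be sketched as a read-through cache: look for the artifact on the local volume first, and download from object storage only on a miss. In the example below, `cached_model_path` and the caller-supplied `fetch` function are hypothetical names. The real download step would go through an S3-compatible client, which is left out so the sketch stays self-contained.

```python
import hashlib
from pathlib import Path


def cached_model_path(uri, cache_dir, fetch):
    """Return a local path for the artifact at uri, downloading on a miss.

    fetch(uri, dest_path) is a caller-supplied function that writes the
    object's bytes to dest_path (for example, via an S3-compatible client).
    """
    cache_dir = Path(cache_dir)
    cache_dir.mkdir(parents=True, exist_ok=True)
    # Hash the URI so any object key maps to a safe, flat file name.
    key = hashlib.sha256(uri.encode("utf-8")).hexdigest()
    dest = cache_dir / key
    if not dest.exists():
        # Download to a temporary name, then rename, so a failed download
        # never leaves a partial file that looks like a valid cache entry.
        tmp = dest.with_suffix(".partial")
        fetch(uri, tmp)
        tmp.replace(dest)
    return dest
```

Here `cache_dir` would correspond to a directory on the persistent volume mounted into the inference pod. Eviction and invalidation when a model version changes are not handled in this sketch.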

Infrastructure

Agent Factory runs on standard Kubernetes primitives: GPU device plugins, KServe, object storage, and a service mesh. For deep-dives on GPU setup, network policies, HA configuration, and monitoring, see the hub references.
