AIDB overview

AIDB is an EDB-maintained Postgres extension that brings AI data workflows directly into your Postgres database. It provides a declarative, SQL-native way to build pipelines that prepare, embed, and index data for AI applications — including Retrieval-Augmented Generation (RAG), semantic search, and large language model (LLM) integrations.

Because AIDB runs inside Postgres, your data never needs to leave the database. You define your AI workflows in SQL, and AIDB handles the complexity of embedding generation, vector indexing, model integration, and real-time synchronization.

What AIDB does

AIDB has two primary building blocks: pipelines and knowledge bases.

Pipelines define multi-step workflows that transform raw data into AI-ready vectors. Each step in a pipeline handles one transformation — parsing a PDF, chunking text, running OCR, summarizing content, or producing vector embeddings. Pipelines can run on demand, in batch, or automatically whenever source data changes.

Knowledge bases are the output of an embedding pipeline step. They store vector embeddings in a pgvector-indexed table and expose semantic search through the aidb.retrieve_key() and aidb.retrieve_text() functions. Multiple pipelines can feed into the same knowledge base, and a single knowledge base can serve multiple retrieval queries.
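A retrieval query might look like the following. This is a minimal sketch: the knowledge base name is illustrative, and the exact argument list of aidb.retrieve_text() (here assumed to be knowledge base name, query text, and result count) should be checked against the AIDB function reference.

```sql
-- Hedged sketch: semantic search over a knowledge base.
-- 'support_docs' is an illustrative knowledge base name; the
-- argument order shown is an assumption, not the documented signature.
SELECT *
FROM aidb.retrieve_text('support_docs', 'How do I reset my password?', 5);
```

The query text is embedded with the same model that populated the knowledge base, and the closest matches by vector distance are returned.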

All pipeline operations are also available as standalone SQL functions — aidb.chunk_text(), aidb.parse_pdf(), aidb.perform_ocr(), aidb.summarize_text(), and others — so you can call them directly in queries without defining a full pipeline.
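For example, chunking can be invoked ad hoc in a query. This is a sketch under assumptions: the second argument (a chunk-size setting) is illustrative, so consult the function reference for the actual parameters.

```sql
-- Hedged sketch: calling a pipeline primitive directly, outside any
-- pipeline. The chunk-size argument (500) is an illustrative assumption.
SELECT *
FROM aidb.chunk_text('AIDB brings AI data workflows into Postgres...', 500);
```

This is useful for experimentation, or when you need a single transformation step rather than a managed, repeatable workflow.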

AIDB as a standalone extension

AIDB is a Postgres extension — it lives entirely inside your database and has no runtime dependency on AI Factory or Hybrid Manager. You install it like any other Postgres extension, configure it via postgresql.conf, and interact with it through SQL.
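Installation follows the standard extension workflow. CREATE EXTENSION is core Postgres SQL; the CASCADE clause installs declared dependencies (such as pgvector, assuming AIDB declares it as one):

```sql
-- Standard Postgres extension installation.
-- CASCADE installs any extensions AIDB declares as dependencies.
CREATE EXTENSION IF NOT EXISTS aidb CASCADE;
```

After this, the aidb schema and its functions are available to SQL in that database.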

Note

If you're using AI Factory, AIDB is already included as part of that platform, but AI Factory is not required to use AIDB. This documentation covers standalone AIDB.

Key capabilities

Model flexibility — AIDB connects to locally running models (such as BERT and T5) as well as external models via the OpenAI-compatible API. Supported providers include NVIDIA NIM, Google Gemini, and any OpenAI-compatible endpoint. Model credentials are managed securely through the standard Postgres user mapping feature.
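Because credentials go through Postgres user mappings, an external model's API key can be stored per database user rather than in application code. The following is a hedged sketch of that standard mechanism only: the server name and the option key are illustrative assumptions, not AIDB's documented model-registration API.

```sql
-- Hedged sketch: standard Postgres user mapping syntax.
-- 'my_openai_server' and the 'api_key' option key are assumptions;
-- see the AIDB model documentation for the exact server and options.
CREATE USER MAPPING FOR CURRENT USER
    SERVER my_openai_server
    OPTIONS (api_key 'sk-placeholder');
```

This keeps secrets inside the database's existing access-control model, so they can be granted, rotated, and audited like other Postgres objects.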

Data source support — Pipelines can read from standard Postgres tables or from external storage via the Postgres File System (PGFS) extension, which supports object stores and local file systems. Pipeline output can be written to Postgres tables or back to a volume using PGFS.

Auto-processing — Pipelines can be configured to run automatically using Postgres trigger-based live mode or background worker-based scheduling. This keeps knowledge bases up to date as source data changes without manual intervention.

Schema flexibility — All AIDB objects (pipelines, knowledge bases, volumes) can be created in arbitrary Postgres schemas, making it straightforward to organize AI workloads alongside existing database structures.

Vector index control — Knowledge bases support configurable vector indexes, including HNSW and IVFFlat, via pgvector. VectorChord is also supported as a high-performance alternative, offering faster indexing and better throughput at scale.
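For reference, these are the standard pgvector index definitions that correspond to the options above. The table and column names are illustrative; a knowledge base typically manages its own index, so you would normally select the index type through the knowledge base configuration rather than running CREATE INDEX by hand.

```sql
-- Standard pgvector index syntax; table/column names are illustrative.
-- HNSW: graph-based index, better recall/latency trade-off at scale.
CREATE INDEX ON my_kb_vectors USING hnsw (embedding vector_cosine_ops);

-- IVFFlat: cluster-based index; 'lists' controls the number of clusters.
CREATE INDEX ON my_kb_vectors
    USING ivfflat (embedding vector_cosine_ops) WITH (lists = 100);
```

HNSW generally costs more to build but serves queries faster; IVFFlat builds quickly and trades some recall for speed depending on how many lists are probed.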