AI pipelines v7

A pipeline is the core building block of AIDB. It defines how raw data — from a Postgres table or an external volume — flows through a sequence of transformation steps and lands in an AI-ready destination.

Source  →  Step 1       →  Step 2  →  ...  →  Knowledge Base
           (parse/chunk)   (embed)             (indexed + queryable)

Each step handles one transformation: parsing a PDF, chunking text, running OCR, summarizing content, or generating vector embeddings. The output of one step becomes the input for the next. At the end of the pipeline, your data is embedded, indexed, and ready to query with semantic or hybrid search.

Pipelines can run on demand, in batch, or automatically whenever source data changes — keeping your knowledge base in sync without manual intervention.
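To make the flow concrete, here is a minimal SQL sketch of defining a pipeline. The function name `aidb.create_pipeline()` comes from this documentation, but the parameter names, step syntax, and mode values shown here are illustrative assumptions, not the verified API — see the Creating pipelines and Reference pages for the actual signature.

```sql
-- Hypothetical sketch only: parameter names and the JSON step format
-- are assumptions for illustration, not the verified signature.
SELECT aidb.create_pipeline(
    name   => 'docs_pipeline',
    source => 'public.documents',       -- a Postgres table or external volume
    steps  => '[
        {"type": "ParsePdf"},           -- parse raw PDFs into text
        {"type": "ChunkText"},          -- split text into chunks
        {"type": "KnowledgeBase", "name": "docs_kb"}  -- embed + index
    ]'::jsonb,
    auto_processing => 'Live'           -- re-run when source data changes
);
```

Each entry in `steps` consumes the previous step's output, ending in a knowledge base that is indexed and queryable with semantic or hybrid search.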

| Page | What it covers |
| --- | --- |
| Overview | Core concepts: sources, steps, destinations, and how pipelines relate to knowledge bases. |
| Creating pipelines | Defining a pipeline with `aidb.create_pipeline()` — source, steps, auto-processing, and volume sources. |
| Pipeline steps | Available step types: `ChunkText`, `ParseHtml`, `ParsePdf`, `PerformOcr`, `SummarizeText`, `KnowledgeBase`. |
| Orchestration | Auto-processing modes, background workers, observability, and error handling. |
| Reference | Full API reference for pipeline types, views, CRUD functions, and config helpers. |
| Example | End-to-end worked example. |