EDB Docs - EDB Postgres AI Database v7

A knowledge base is a vector-indexed store of embeddings. It is created automatically when a pipeline includes a KnowledgeBase step — the pipeline handles embedding generation and indexing, and the knowledge base is the resulting queryable store.

Page	What it covers
Hybrid search	Combining semantic search with relational filters and BM25 keyword search
Vector extensions	VectorChord and VectorChord-BM25 for high-performance dense and sparse vector search
Examples	End-to-end worked examples for table and volume sources

Retrieval functions

Once a pipeline has run, query the knowledge base using aidb.retrieve_text() or aidb.retrieve_key(). Both use vector similarity to find results based on meaning rather than exact keywords, and support both TEXT and BYTEA (image) as the query input.

Flow of retrieval functions

When a retrieval function is called, the system performs the following steps internally:

1. Embedding: The input query (text or image) is converted into a vector using the embedding model configured for that knowledge base.
2. Similarity search: A vector similarity search runs against the knowledge base's internal vector table to find the top K nearest neighbors.
3. Source lookup (retrieve_text only): The system identifies the source table and retrieves the raw content corresponding to the matched keys.
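
The three steps can be pictured with a small, self-contained query in plain PostgreSQL. This is a conceptual sketch only, not the aidb implementation: the table names, keys, and two-dimensional vectors are made-up toy data, the query vector is assumed to be already embedded, and Euclidean distance is used purely for illustration.

```sql
-- Conceptual sketch in plain PostgreSQL, not the aidb implementation.
-- All names, keys, and vectors below are hypothetical toy data.
WITH kb_vectors(key, embedding) AS (      -- stands in for the internal vector table
    VALUES
        ('doc-1', ARRAY[1.0, 0.0]::float8[]),
        ('doc-2', ARRAY[0.0, 1.0]::float8[]),
        ('doc-3', ARRAY[0.6, 0.4]::float8[])
),
source_rows(key, content) AS (            -- stands in for the source table
    VALUES
        ('doc-1', 'Resetting a password'),
        ('doc-2', 'Configuring backups'),
        ('doc-3', 'Recovering a locked account')
),
query_vec(embedding) AS (                 -- step 1: the query, assumed already embedded
    VALUES (ARRAY[0.9, 0.1]::float8[])
),
nearest AS (                              -- step 2: top-K search by Euclidean distance
    SELECT v.key, d.distance
    FROM kb_vectors AS v
    CROSS JOIN query_vec AS q
    CROSS JOIN LATERAL (
        SELECT sqrt(sum((t.a - t.b) ^ 2)) AS distance
        FROM unnest(v.embedding, q.embedding) AS t(a, b)
    ) AS d
    ORDER BY d.distance
    LIMIT 2                               -- K = 2
)
SELECT n.key, s.content, n.distance       -- step 3: look up the source content by key
FROM nearest AS n
JOIN source_rows AS s ON s.key = n.key
ORDER BY n.distance;
```

With K = 2, the query returns the two rows nearest the query vector (doc-1, then doc-3). Dropping the final join and returning only the key and distance from the nearest step corresponds to skipping the source lookup.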

`aidb.retrieve_text()`

Use this function when you need to retrieve the actual source text associated with the closest vector matches.

Process: The function embeds your query, performs a similarity search, and then executes a second phase to look up the source text from the original table using the pipeline_id.
Returns: A set of columns including:
- key: The identifier from the source table.
- value: The actual source text.
- distance: The similarity score. A lower value usually indicates a closer match.
- part_ids: An array of IDs indicating which specific chunks or parts were matched.
- pipeline_name: The name of the pipeline that supplied the data.
- intermediate_steps: A JSONB column containing data from pipeline steps that run before the knowledge base step, such as ChunkText.
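
A minimal sketch of a call that selects the columns listed above. The knowledge base name 'my_kb' and the query string are placeholders, and the third argument is assumed to be the number of matches to return, following the call form used in the join example on this page.

```sql
-- Sketch only: 'my_kb' and the query text are placeholder values.
-- The third argument (5) is assumed to cap the number of matches returned.
SELECT
    r.key,
    r.value,
    r.distance,
    r.part_ids,
    r.pipeline_name,
    r.intermediate_steps
FROM aidb.retrieve_text('my_kb', 'how do I rotate credentials?', 5) AS r
ORDER BY r.distance;  -- closest matches first
```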

`aidb.retrieve_key()`

Use this function for high-performance searches where you only need the unique identifiers of the matches, rather than the full source content.

Returns: A set of columns including:
- key: The identifier from the source table.
- distance: The similarity score. A lower value usually indicates a closer match.
- part_ids: An array of IDs indicating which specific chunks or parts were matched.
- pipeline_name: The name of the pipeline that supplied the data.
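
A minimal sketch of a key-only lookup. It assumes aidb.retrieve_key() accepts the same arguments (knowledge base name, query, result count) as aidb.retrieve_text(); the page does not spell out its signature, so check it against your installation. 'my_kb' and the query string are placeholders.

```sql
-- Sketch only: assumes aidb.retrieve_key() takes the same
-- (knowledge base name, query, result count) arguments as aidb.retrieve_text().
SELECT
    r.key,
    r.distance,
    r.part_ids,
    r.pipeline_name
FROM aidb.retrieve_key('my_kb', 'search query', 5) AS r
ORDER BY r.distance;  -- closest matches first
```

The returned keys can then be joined to your own tables when only a few source columns are needed.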

Advanced querying: Joining intermediate steps

For pipelines that include intermediate transformations such as ChunkText or ParseHtml, you can access specific transformed segments by joining retrieval results with intermediate pipeline tables using the part_ids column.

Example syntax:

The following query joins the retrieval results with an intermediate step table to access specific chunked values:

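-- 'my_kb' and pipeline_my_pipeline_step_1 are example names; substitute your
-- own knowledge base and the intermediate step table of your pipeline.
SELECT
    r.key,
    r.value,
    r.distance,
    r.part_ids,
    int_step.value AS chunked_content
FROM aidb.retrieve_text('my_kb', 'search query', 5) AS r
JOIN pipeline_my_pipeline_step_1 AS int_step
  ON int_step.source_id = r.key
  -- (r.part_ids)[:1] is an array slice holding only the first matched part ID
  AND int_step.part_ids = (r.part_ids)[:1];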
