Text summarization

aidb.summarize_text() condenses a single text string into a concise summary using a registered language model. aidb.summarize_text_aggregate() is the SQL aggregate form — use it with GROUP BY to summarize text across multiple rows, returning one summary per group.

For parameter tables and return types, see Reference → Functions.

aidb.summarize_text()

Step 1: Register a model

SELECT aidb.create_model('my_t5_model', 't5_local');

Step 2: Summarize a text string

SELECT * FROM aidb.summarize_text(
    input   => 'There are times when the night sky glows with bands of color...',
    options => '{"model": "my_t5_model"}'
);
Output
                                              summarize_text
-------------------------------------------------------------------------------------------------------------------------
 the night sky glows with bands of color . they may begin as cloud shapes and then spread into a great arc across the entire sky .
(1 row)

Compatible providers

The model must support the decode_text() and decode_text_batch() operations. The following providers are compatible:

Provider               Description
completions            Any OpenAI-compatible chat/completions endpoint
openai_completions     OpenAI API (uses the default OpenAI URL)
nim_completions        NVIDIA NIM completions
openrouter_chat        OpenRouter chat
gemini                 Google Gemini API
t5_local               Built-in T5 model (no external endpoint)
llama_instruct_local   Built-in Llama model

Embedding-only providers (bert_local, *_embeddings, *_clip, *_reranking) do not support summarization and will return an error if used.

Strategies: append vs reduce

When chunk_config is set, the input is split into chunks before summarization. The strategy parameter controls how those chunk summaries are combined.

Append (default) — each chunk is summarized independently. Summaries are concatenated with newlines into the final result.

  • Output length scales with the number of chunks
  • LLM calls: one per chunk
  • Good for: detailed summaries, meeting notes, comprehensive reviews
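The append behavior can be sketched with a single call. This is a sketch, not a verbatim reference example: the argument order for aidb.chunk_text_config (desired_length, max_length, overlap_length, unit) is assumed from the chunking example later on this page.

```sql
-- Sketch: split the input into ~80-word chunks and summarize each chunk
-- independently. 'append' is the default strategy, so no strategy
-- argument is needed.
SELECT * FROM aidb.summarize_text(
    input   => 'A long document spanning several hundred words...',
    options => aidb.summarize_text_config(
        'my_t5_model',
        aidb.chunk_text_config(80, 80, 10, 'words')
    )::json
);
-- The result is the per-chunk summaries joined with newlines,
-- so its length grows with the number of chunks.
```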

Reduce — chunks are summarized with a target output of desired_length / reduction_factor words or characters. The summaries are then re-chunked and re-summarized iteratively until the combined output fits within desired_length.

  • Output is compressed to a fixed target size regardless of input length
  • LLM calls: more than append; the exact number depends on input size and reduction_factor
  • Good for: executive summaries, high-level overviews
Note

The reduce strategy can fail if chunk sizes are too small. Use word-based chunking with at least 50 words per chunk for reliable results.
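Applied to aidb.summarize_text(), a reduce call might look like the following sketch. The positional arguments (model, chunk_config, prompt, strategy, reduction_factor) are assumed from the aggregate reduce example later on this page.

```sql
-- Sketch: 60-word chunks (above the 50-word minimum recommended in the
-- note), reduce strategy, reduction_factor 5. Each pass targets summaries
-- of roughly desired_length / 5 words, re-chunking until the combined
-- output fits within desired_length.
SELECT * FROM aidb.summarize_text(
    input   => 'Very long report text...',
    options => aidb.summarize_text_config(
        'my_t5_model',
        aidb.chunk_text_config(60, 60, 5, 'words'),
        NULL,
        'reduce',
        5
    )::json
);
```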

Custom prompts and inference settings

Guide the summarization with a custom prompt:

SELECT aidb.summarize_text(
    input   => 'Long article text...',
    options => aidb.summarize_text_config(
        'my_t5_model',
        prompt => 'Summarize the key points for a non-technical audience'
    )::json
);

Control model behavior at runtime using inference_config:

SELECT * FROM aidb.summarize_text(
    input   => 'Long article text...',
    options => aidb.summarize_text_config(
        'my_llm',
        NULL,
        'Create a brief summary',
        NULL,
        NULL,
        aidb.inference_config(
            temperature => 0.3,
            max_tokens  => 50,
            seed        => 42
        )
    )::json
);

aidb.summarize_text_aggregate()

aidb.summarize_text_aggregate() is a PostgreSQL aggregate function. As the database iterates rows in each group, it accumulates text separated by newlines. Empty and NULL rows are skipped. After all rows are collected, the combined text is sent to the LLM and one summary is returned per GROUP BY group.

If chunk_config is set, the accumulated text is chunked before being sent (see Strategies above).

Basic example

SELECT category,
       aidb.summarize_text_aggregate(
           content,
           '{"model": "my_t5_model"}'::json ORDER BY id
       ) AS summary
FROM my_table
GROUP BY category;

Real-world example: customer feedback by product

Register an LLM model, then summarize feedback grouped by product:

SELECT aidb.create_model(
    'my-llm',
    'completions',
    '{"model": "llama3.2:3b", "url": "http://localhost:11434/v1/chat/completions"}'::JSONB
);

SELECT product_name,
       count(feedback_id) AS feedback_count,
       aidb.summarize_text_aggregate(
           feedback_text,
           aidb.summarize_text_config('my-llm')::json
       ) AS summary
FROM customer_feedback
JOIN products ON customer_feedback.product_id = products.product_id
GROUP BY product_name
ORDER BY feedback_count DESC;

Custom prompt: targeted analysis

Change the prompt to extract different insights from the same data — no new pipeline or code needed:

-- What are customers saying?
SELECT product_name,
       aidb.summarize_text_aggregate(
           feedback_text,
           aidb.summarize_text_config('my-llm')::json
       ) AS summary
FROM customer_feedback
JOIN products ON customer_feedback.product_id = products.product_id
GROUP BY product_name;

-- Why are we losing deals?
SELECT p.product_name,
       aidb.summarize_text_aggregate(
           sn.note_text,
           aidb.summarize_text_config(
               'my-llm',
               prompt => 'Analyze these sales notes: why are we losing deals? Provide specific, actionable insights.'
           )::json
       ) AS loss_analysis
FROM sales_notes sn
JOIN sales_orders so ON sn.order_id = so.order_id
JOIN products p ON so.product_id = p.product_id
WHERE so.status IN ('lost', 'closed_lost')
GROUP BY p.product_name;

With chunking

Use chunk_config to handle groups with large amounts of accumulated text:

SELECT category,
       aidb.summarize_text_aggregate(
           content,
           aidb.summarize_text_config(
               'my_t5_model',
               aidb.chunk_text_config(80, 80, 10, 'words')
           )::json ORDER BY id
       ) AS summary
FROM my_table
GROUP BY category;

With reduce strategy

Compress large text groups into a fixed-size executive summary:

SELECT category,
       aidb.summarize_text_aggregate(
           content,
           aidb.summarize_text_config(
               'my_t5_model',
               aidb.chunk_text_config(60, 60, 5, 'words'),
               NULL,
               'reduce',
               5
           )::json ORDER BY id
       ) AS summary
FROM my_table
GROUP BY category;

With inference configuration

SELECT category,
       aidb.summarize_text_aggregate(
           content,
           aidb.summarize_text_config(
               'my_llm',
               NULL,
               'Summarize the key points',
               NULL,
               NULL,
               aidb.inference_config(
                   temperature => 0.2,
                   max_tokens  => 100,
                   top_p       => 0.9,
                   seed        => 42
               )
           )::json ORDER BY id
       ) AS summary
FROM my_table
GROUP BY category;

Model compatibility

Using a provider that does not support decode_text returns an error immediately — at function call time for summarize_text() and at preparer creation time for pipeline use:

-- Register an embedding-only model
SELECT aidb.create_model('bert_model', 'bert_local');

-- Fails at call time
SELECT * FROM aidb.summarize_text(
    input   => 'Hello world',
    options => aidb.summarize_text_config(model => 'bert_model')::json
);
Output
ERROR: The requested adapter is not supported by the model provider: bert_local

Common errors

Model not found: X
  Cause: Model name not registered.
  Fix: Run aidb.create_model() first; check spelling.

Provider 'bert_local' does not support language operations
  Cause: Embedding-only provider used for summarization.
  Fix: Use completions, openai_completions, t5_local, or another language provider.

Invalid parameters: 'desired_length' must be greater than 0
  Cause: Invalid chunk config.
  Fix: Use a positive integer for desired_length.

Invalid parameters: 'max_length' must be >= 'desired_length'
  Cause: max_length is smaller than desired_length.
  Fix: Set max_length >= desired_length.

Invalid parameters: 'overlap_length' must be less than 'desired_length'
  Cause: Overlap too large.
  Fix: Reduce overlap_length (10–20% of desired_length is typical).

Text summarization failed to reduce size
  Cause: The reduce strategy produced output longer than its input.
  Fix: Increase chunk size (use >= 50 words or >= 800 chars).