EDB Docs - EDB Postgres AI v1.4.1 (LTS)

The EDB Model Server component connects a flow to an inference model running on a Hybrid Manager (HM) model server cluster. Use it for chat-style language models, reasoning models, and other inference endpoints that you've deployed to HM.

Choosing the right component

Use the EDB Model Server component when:

You want a language-model response in your flow from a model running on an HM-hosted model server cluster.
You want a LanguageModel object to plug into an agent or downstream chain.
You're using an OpenAI-compatible endpoint (the default) or a native NVIDIA NIM endpoint.

Use a different component if:

You want embeddings rather than text responses. Use EDB Embeddings.
You want CPU-based inference running on the database cluster rather than on a model server. Use EDB Embedded Models.

Prerequisites

An HM model server cluster with the target model deployed. See Models for how to create one. A GPU node is typically required.
An HM machine-user access key saved in Langflow as a Global Variable (default name HM_API_KEY).

Inputs

Prompt

Field	Type	Required	Default	Notes
Input	Message	No		The user message to send to the model. Connect from upstream Chat Input or other text-producing components.
System Message	Multiline text	No		System prompt passed to the model.
Stream	Boolean	No	`false`	Advanced. Stream the model's response tokens as they arrive. Only supported by models that expose a streaming endpoint.

Connection

Field	Type	Required	Default	Notes
Hybrid Manager URL	Text	No		Advanced. Override the default HM URL.
HM Machine User Key	Secret	Yes	`HM_API_KEY`	Defaults to the global variable named `HM_API_KEY`.
Hybrid Manager Model Server Cluster Instance	Dropdown	Yes		The model server cluster to call. Populated from your HM model clusters.
External Ingress	Boolean	No		Advanced. Route through the external ingress instead of the in-cluster service.

Model

Field	Type	Required	Default	Notes
API Client	Dropdown	No	`OpenAI`	Advanced. `OpenAI` is required for correct tool-calling behavior in Agent flows. `NVIDIA` only when connecting to a native NIM endpoint that needs NVIDIA-specific features like detailed thinking.
Model Name	Dropdown	Yes		Populated from the selected model server cluster. Use the refresh button if the list is empty.
Temperature	Slider	No	`0.1`	Controls randomness. Range `0–1`, step `0.01`.
Max Tokens	Integer	No		Advanced. Maximum tokens to generate. Range `0–128000`. Set to `0` for unlimited.
Seed	Integer	No	`1`	Advanced. Controls reproducibility.
Default model query	Text	No		Advanced. Sent when no input is connected to the component. Useful for testing a model that does not accept empty input.

API Client = OpenAI (default)

Field	Type	Required	Default	Notes
Model Kwargs	Dict	No		Advanced. Additional keyword arguments passed to the OpenAI chat client.
JSON Mode	Boolean	No	`false`	Advanced. Force the model to return JSON.
Max Retries	Integer	No	`5`	Advanced. Maximum retries on a failed request.
Timeout	Integer	No	`700`	Advanced. Per-request timeout in seconds.

API Client = NVIDIA

Field	Type	Required	Default	Notes
Detailed Thinking	Boolean	No	`false`	Advanced. Return the model's detailed thought process. Only supported by NVIDIA reasoning models.

The OpenAI-specific fields hide when API Client is NVIDIA, and vice versa. Field values are preserved when you switch between clients.

Outputs

Output	Type	Carries
Model Response	Message	The model's text response to the input prompt.
Language Model	LanguageModel	A LangChain `LanguageModel` instance pointing at the selected HM model server cluster. Pass this into an agent or chain that needs a model handle.

EDB Model Server v1.4.1 (LTS)