The EDB Model Server component connects a flow to an inference model running on a Hybrid Manager (HM) model server cluster. Use it for chat-style language models, reasoning models, and other inference endpoints that you've deployed to HM.
Choosing the right component
Use the EDB Model Server component when:
- You want a language-model response in your flow from a model running on an HM-hosted model server cluster.
- You want a
LanguageModelobject to plug into an agent or downstream chain. - You're using an OpenAI-compatible endpoint (the default) or a native NVIDIA NIM endpoint.
Use a different component if:
- You want embeddings rather than text responses. Use EDB Embeddings.
- You want CPU-based inference running on the database cluster rather than on a model server. Use EDB Embedded Models.
Prerequisites
An HM model server cluster with the target model deployed. See Models for how to create one. A GPU node is typically required.
An HM machine-user access key saved in Langflow as a Global Variable (default name
HM_API_KEY).
Inputs
Prompt
| Field | Type | Required | Default | Notes |
|---|---|---|---|---|
| Input | Message | No | The user message to send to the model. Connect from upstream Chat Input or other text-producing components. | |
| System Message | Multiline text | No | System prompt passed to the model. | |
| Stream | Boolean | No | false | Advanced. Stream the model's response tokens as they arrive. Only supported by models that expose a streaming endpoint. |
Connection
| Field | Type | Required | Default | Notes |
|---|---|---|---|---|
| Hybrid Manager URL | Text | No | Advanced. Override the default HM URL. | |
| HM Machine User Key | Secret | Yes | HM_API_KEY | Defaults to the global variable named HM_API_KEY. |
| Hybrid Manager Model Server Cluster Instance | Dropdown | Yes | The model server cluster to call. Populated from your HM model clusters. | |
| External Ingress | Boolean | No | Advanced. Route through the external ingress instead of the in-cluster service. |
Model
| Field | Type | Required | Default | Notes |
|---|---|---|---|---|
| API Client | Dropdown | No | OpenAI | Advanced. OpenAI is required for correct tool-calling behavior in Agent flows. NVIDIA only when connecting to a native NIM endpoint that needs NVIDIA-specific features like detailed thinking. |
| Model Name | Dropdown | Yes | Populated from the selected model server cluster. Use the refresh button if the list is empty. | |
| Temperature | Slider | No | 0.1 | Controls randomness. Range 0–1, step 0.01. |
| Max Tokens | Integer | No | Advanced. Maximum tokens to generate. Range 0–128000. Set to 0 for unlimited. | |
| Seed | Integer | No | 1 | Advanced. Controls reproducibility. |
| Default model query | Text | No | Advanced. Sent when no input is connected to the component. Useful for testing a model that does not accept empty input. |
API Client = OpenAI (default)
| Field | Type | Required | Default | Notes |
|---|---|---|---|---|
| Model Kwargs | Dict | No | Advanced. Additional keyword arguments passed to the OpenAI chat client. | |
| JSON Mode | Boolean | No | false | Advanced. Force the model to return JSON. |
| Max Retries | Integer | No | 5 | Advanced. Maximum retries on a failed request. |
| Timeout | Integer | No | 700 | Advanced. Per-request timeout in seconds. |
API Client = NVIDIA
| Field | Type | Required | Default | Notes |
|---|---|---|---|---|
| Detailed Thinking | Boolean | No | false | Advanced. Return the model's detailed thought process. Only supported by NVIDIA reasoning models. |
The OpenAI-specific fields hide when API Client is NVIDIA, and vice versa. Field values are preserved when you switch between clients.
Outputs
| Output | Type | Carries |
|---|---|---|
| Model Response | Message | The model's text response to the input prompt. |
| Language Model | LanguageModel | A LangChain LanguageModel instance pointing at the selected HM model server cluster. Pass this into an agent or chain that needs a model handle. |