How to deploy AI Models from the Model Library Innovation Release
Prerequisite: Access to the Hybrid Manager UI with AI Factory enabled. See AI Factory in Hybrid Manager.
This guide explains how to deploy AI models from the AI Factory Model Library into Model Serving (powered by KServe) in your Hybrid Manager (HM) environment.
Once deployed, these models power key AI Factory features:
- Knowledge Bases (via AIDB pipelines)
- Gen AI Builder Assistants and pipelines
- Other AI Factory and application integrations
Who should use this guide?
- AI platform admins deploying validated model images
- Data engineers configuring AI models for Knowledge Bases
- AI application developers configuring models for Assistants
What this enables
Once deployed:
- Your AI models are available in Model Serving.
- You can link them to Knowledge Bases or Gen AI Builder pipelines.
- You can monitor and manage deployed models via the HM Model Serving UI or Kubernetes.
Estimated time to complete
10–20 minutes per model, depending on model size and cluster resources.
Prerequisites
Before you begin:
- An active HM environment with GPU worker nodes configured.
- For a full setup guide, see: Setup GPU Resources for Model Serving
- Prepare credentials for the model providers (NVIDIA NIM and Hugging Face).

If you use NVIDIA NIM models from the public internet, you need two credentials.

The `nvidia-nim-secrets` secret is used to download NIM profiles:
```yaml
apiVersion: v1
kind: Secret
metadata:
  name: nvidia-nim-secrets
  namespace: default
  annotations:
    replicator.v1.mittwald.de/replicate-to: m-.*
type: Opaque
data:
  NGC_API_KEY: <base64 encoded NGC API Key>
```
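Secret `data` values must be base64 encoded. A minimal sketch for producing the `NGC_API_KEY` value (the key below is a hypothetical placeholder, not a real key):

```shell
# Encode without a trailing newline; `echo` would add one and corrupt the value.
NGC_API_KEY="nvapi-example-key"   # hypothetical placeholder key
ENCODED=$(printf '%s' "$NGC_API_KEY" | base64)
echo "$ENCODED"
```

Paste the printed value into the `NGC_API_KEY` field under `data:`.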
The `ngc-cred` secret is used to pull NIM images from the NVIDIA registry (nvcr.io):
```shell
$ kubectl -n default create secret docker-registry ngc-cred \
    --docker-server=nvcr.io \
    --docker-username='$oauthtoken' \
    --docker-password=${NGC_API_KEY}
$ kubectl -n default annotate secret ngc-cred \
    replicator.v1.mittwald.de/replicate-to='m-.*'
```

If you already store profiles in object storage and images in your private registry, you don't need these two secrets. See: How-To Use NVIDIA NIM Model Cache in Air‑Gapped Clusters in Hybrid Manager.
If you use a private Hugging Face model, create the following secret:

```yaml
apiVersion: v1
kind: Secret
metadata:
  name: hf-secret
  namespace: default
  annotations:
    replicator.v1.mittwald.de/replicate-to: m-.*
type: Opaque
data:
  HF_TOKEN: <base64 encoded HF API Key>
```
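The same manifest can be generated from a plain token with a small shell sketch (the token value is a placeholder); pipe the output to `kubectl apply -f -` to create the secret:

```shell
# Hypothetical token value; substitute your real Hugging Face token.
HF_TOKEN="hf_example_token"
MANIFEST=$(cat <<EOF
apiVersion: v1
kind: Secret
metadata:
  name: hf-secret
  namespace: default
  annotations:
    replicator.v1.mittwald.de/replicate-to: m-.*
type: Opaque
data:
  HF_TOKEN: $(printf '%s' "$HF_TOKEN" | base64)
EOF
)
echo "$MANIFEST"
# Apply it with: echo "$MANIFEST" | kubectl apply -f -
```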
Steps to deploy an AI model
1. Creating a model in Asset Library
- Go to Asset Library > Models.
- Select Add New Model.
- Configure parameters:
- Model Name. Must consist of lower-case alphanumeric characters, '-', or '.', and must start and end with an alphanumeric character.
- Description
- Tags
- Functions. Must include at least one function that starts with "aidb-".
- AI Model Provider
- If you select the NIM provider:
- Image URL, for example, nvcr.io/nim/openai/gpt-oss-20b:latest.
- If you select HuggingFace, fill in one of the following fields:
- Hugging Face Model Name, for example, openai/gpt-oss-20b. The model must be on Hugging Face.
- Object Storage Path, for example, /models/openai/gpt-oss-20b. The model must already be copied to the path.
- Default CPU/Memory/GPU
- API Protocol Version. Select the model's API protocol; if unsure, leave it empty.
- Max Token Length. The expected context window size of the model; leave it empty to use the model's default value.
- README
- Select Add AI Model.
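The Model Name rule above can be checked locally. This sketch mirrors the constraint with a regular expression (the helper and sample names are illustrative, not part of the product):

```shell
# Lower-case alphanumerics, '-' or '.', starting and ending alphanumeric.
valid='^[a-z0-9]([a-z0-9.-]*[a-z0-9])?$'
check() {
  if printf '%s' "$1" | grep -Eq "$valid"; then
    echo "$1: valid"
  else
    echo "$1: invalid"
  fi
}
check gpt-oss-20b      # valid
check Arctic-Embed     # invalid: upper-case letters
check -bad-start       # invalid: must start with an alphanumeric character
```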
2. Browsing and selecting model in Asset Library
- Go to Asset Library > Models.
- Browse models.
- Select the model you want to deploy.
3. Configuring and deploying the model
- Select "Create Local Inference Service".
- Configure deployment parameters:
- Local Inference Service Name
- Tags
- Model Serving Name. The name you want to use in API calls; leave it empty to use the default value.
- Model Profiles Path on Object Storage. The NIM profile path in object storage; ignore this field when the model does not use the NIM provider. See How-To Use NVIDIA NIM Model Cache in Air‑Gapped Clusters in Hybrid Manager.
- Inference Service Instances
- Resource requests/limits (GPU, CPU, and memory)
- Max Token Length. Overrides the value defined in the model; leave it empty to use the value assigned in the model.
- Select "Create Local Inference Service".
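Under the hood, each deployment is a KServe InferenceService. The UI fields above map roughly onto a spec like this sketch (all names and values are hypothetical, and the exact manifest Hybrid Manager generates may differ):

```yaml
# Illustrative fragment only, not a manifest you need to apply yourself.
apiVersion: serving.kserve.io/v1beta1
kind: InferenceService
metadata:
  name: gpt-oss-20b          # hypothetical Local Inference Service Name
spec:
  predictor:
    minReplicas: 1           # Inference Service Instances
    model:
      resources:             # resource requests/limits from the UI
        requests:
          cpu: "4"
          memory: 32Gi
          nvidia.com/gpu: "1"
        limits:
          nvidia.com/gpu: "1"
```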
4. Verifying the deployed model
You can verify your deployed models using:
Model Serving UI in HM
- Go to Estate > Inference Services.
- Confirm the inference service appears with status Active/Healthy.
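The same check can be scripted. On a live cluster you would read the status with `kubectl -n <namespace> get inferenceservice <name> -o json`; the sketch below runs against a canned status document so the extraction logic is visible (the service name in the comment is hypothetical):

```shell
# On a live cluster, fetch the status with, for example:
#   kubectl -n default get inferenceservice gpt-oss-20b -o json
# Here a canned status document stands in so the snippet is runnable anywhere.
status_json='{"status":{"conditions":[{"type":"Ready","status":"True"}]}}'
ready=$(printf '%s' "$status_json" \
  | grep -o '"type":"Ready","status":"[^"]*"' \
  | cut -d'"' -f8)
echo "Ready=$ready"   # prints Ready=True when the service is healthy
```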
5. Connecting the model to AI Factory workloads
Once the model is Ready, you can select it in:
- Knowledge Base pipelines (for embedding or reranking)
- Gen AI Builder pipelines
- Assistant configurations
The UI will show models available for each use case based on their type (Embedding, Completion, Reranking, etc.).
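Many served models, NIM LLMs in particular, expose an OpenAI-compatible API; assuming that holds for your model, a request body can be assembled like this (the Model Serving Name and endpoint are placeholders):

```shell
MODEL="gpt-oss-20b"   # hypothetical Model Serving Name from step 3
BODY=$(cat <<EOF
{"model": "$MODEL", "messages": [{"role": "user", "content": "Hello"}]}
EOF
)
echo "$BODY"
# Send it with, for example:
#   curl -s -X POST http://<inference-endpoint>/v1/chat/completions \
#     -H 'Content-Type: application/json' -d "$BODY"
```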
Supported model types
| Model type | Example model |
|---|---|
| Text Completion | llama-3.3-nemotron-super-49b |
| Text Embedding | arctic-embed-l |
| Image Embedding | nvclip |
| OCR | paddleocr |
| Text Reranker | llama-3.2-nv-rerankqa-1b-v2 |
Tips & best practices
- GPU placement: Ensure your cluster's GPU capacity matches the model's requirements. Large models such as llama-3.3-nemotron-super-49b require multiple GPUs on a single node.
- Quota management: Limit number of large models deployed simultaneously to avoid overloading GPU nodes.
- Version testing: Test new model versions in isolated deployments before promoting to production pipelines or Assistants.
Troubleshooting
Model stuck in Pending
- Check GPU node taints/labels.
- Verify InferenceService tolerations and nodeSelectors match.
Model not appearing in Model Library
- Confirm image is correctly tagged and synced via Image and Model Library.
- Verify repository rules if using private registry.
Kubernetes errors on deploy
- Check `kubectl describe inferenceservice <model>` for detailed error logs.
Summary
- You can deploy AI models from the AI Factory Model Library.
- Deployed models run via KServe Model Serving.
- Deployed models power Knowledge Bases and Gen AI Builder Assistants.
- The deployment flow ensures consistent governance and visibility.