External inference services (Innovation Release)

Context

An external inference service connects Hybrid Manager (HM) to a remote model provider hosted outside the cluster, such as OpenAI, Google Gemini, Anthropic, or NVIDIA NIM. You configure the provider's URL, model name, and API key once; HM stores the credentials in Kubernetes secrets and handles authentication transparently for every downstream request.

This page covers:

  • Viewing the Inference Services list
  • Registering a new external inference service
  • Getting the details of a registered service
  • Updating a registered service
  • Deregistering a service

Inference Services list

Open the Inference Services list page from the Estate → Inference Services menu in your Hybrid Manager project.

The list displays each service's name, model, and current status.

Status

| Status  | Meaning |
| ------- | ------- |
| Ready   | The service is healthy and accepting requests. |
| Failed  | The service is unhealthy. Open the service detail to inspect the error. |
| Unknown | Health check not yet completed, typically seen immediately after creation. |

Status is refreshed every 30 seconds by a background health check.

Register an external inference service

Navigate to Estate → Quick Actions → Register External Inference Service in your Hybrid Manager project.

Tip

You can also reach the form from the Inference Services list page via the Quick Actions menu.

Prerequisites

Before registering, confirm you have:

  • The provider's base URL (scheme and host, without a trailing /v1).
  • The model name exactly as the provider expects it (case-sensitive).
  • A valid API key for the provider.
  • Network reachability from the HM cluster to the upstream hostname. Your HM administrator may need to allow egress to the provider's domain.
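To pre-check the last prerequisite, you can run a quick TCP reachability test from a host on the same network as the HM cluster. This is an illustrative sketch only (the `reachable` helper is not part of HM); it confirms the hostname and port accept connections, not that TLS or credentials are valid.

```python
import socket
from urllib.parse import urlparse

def reachable(base_url: str, timeout: float = 3.0) -> bool:
    # Attempt a plain TCP connection to the host/port in the base URL.
    u = urlparse(base_url)
    port = u.port or (443 if u.scheme == "https" else 80)
    try:
        with socket.create_connection((u.hostname, port), timeout=timeout):
            return True
    except OSError:
        return False
```

If this returns False from inside the cluster network, ask your HM administrator about egress rules before attempting registration.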

Form fields

External Service Name (required)

A unique identifier for this service within HM. Must follow DNS-style naming rules:

  • Lowercase letters and digits only.
  • Hyphens (-) are allowed within segments but not at the start or end.
  • Dots (.) are allowed as segment separators.
  • No uppercase letters, underscores, or spaces.
  • Maximum 63 characters.

Examples: openai-gpt-4o-mini, azure.gpt-4o.prod.
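The naming rules above can be checked before submitting the form. The validator below is a sketch that mirrors those rules; it is not HM's actual validation code, so treat the form's own error messages as authoritative.

```python
import re

# Each dot-separated segment: lowercase letters, digits, and inner hyphens only.
NAME_RE = re.compile(r"^(?!-)[a-z0-9-]+(?<!-)(\.(?!-)[a-z0-9-]+(?<!-))*$")

def is_valid_service_name(name: str) -> bool:
    # DNS-style name, at most 63 characters total.
    return len(name) <= 63 and bool(NAME_RE.fullmatch(name))

print(is_valid_service_name("openai-gpt-4o-mini"))  # True
print(is_valid_service_name("Bad_Name"))            # False
```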

Tags (optional)

Reuse existing HM tags to group and filter services. Tags have no effect on request routing or authentication.

Model Name (required)

The exact identifier the upstream provider expects, as documented by the provider. This value is case-sensitive.

| Provider | Example model name |
| --- | --- |
| OpenAI | gpt-4o-mini |
| Google Gemini | gemini-2.5-pro |
| Anthropic | claude-sonnet-4-5 |
| NVIDIA NIM | meta/llama-3.1-8b-instruct |
| OpenRouter | openai/gpt-4o-mini |

API Key (required for most providers)

The API key only — do not include the Authorization: Bearer … prefix. HM adds the correct auth header automatically based on the API Protocol Version you select.

Model Base URL (required)

The scheme and host (plus any required path prefix) for the provider's API. Do not include /v1 — consumer applications append /v1 (or /v1beta for Gemini) themselves. Including /v1 here causes duplicated paths such as /v1/v1/chat/completions, which returns a 404.

| Provider | Model Base URL |
| --- | --- |
| OpenAI | https://api.openai.com |
| OpenRouter | https://openrouter.ai/api |
| Google Gemini | https://generativelanguage.googleapis.com |
| Anthropic | https://api.anthropic.com |
| NVIDIA NIM | https://integrate.api.nvidia.com |
| Self-hosted / vLLM | Your internal service URL, e.g. http://vllm-svc.inference:8000 |
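To see why a trailing /v1 in the base URL breaks requests, here is a sketch of how a consumer composes the final request URL. The `compose` helper is hypothetical, but the concatenation behavior matches the description above: the versioned path is appended verbatim.

```python
def compose(base_url: str, path: str = "/v1/chat/completions") -> str:
    # Consumers append the versioned path to the registered base URL as-is;
    # only a trailing slash is trimmed, never a trailing /v1.
    return base_url.rstrip("/") + path

# Correct: register scheme and host only.
print(compose("https://api.openai.com"))
# -> https://api.openai.com/v1/chat/completions

# Wrong: the /v1 suffix is duplicated, and the upstream returns 404.
print(compose("https://api.openai.com/v1"))
# -> https://api.openai.com/v1/v1/chat/completions
```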

Functions (optional, multi-select)

Capability tags that consumer applications filter on when discovering available models. Use the predefined values below for HM's built-in consumers; for your own applications, any string is valid.

| Built-in consumer | Required function tag |
| --- | --- |
| HM chatbot | openai-chat-completions |
| AIDB pipeline step | The matching aidb-* tag (see your AIDB pipeline documentation) |

Leave this field empty if you are exposing the service exclusively to custom applications that perform their own model selection.
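Conceptually, a consumer filters the service catalog by function tag when picking a model. The sketch below illustrates that discovery pattern with a hypothetical catalog shape; it is not HM's API, only a model of the filtering behavior described above.

```python
def discover(services: list[dict], needed: str) -> list[str]:
    # Return the names of services advertising the requested capability tag.
    return [s["name"] for s in services if needed in s.get("functions", [])]

catalog = [
    {"name": "openai-gpt-4o-mini", "functions": ["openai-chat-completions"]},
    {"name": "custom-embedder", "functions": []},
]
print(discover(catalog, "openai-chat-completions"))  # ['openai-gpt-4o-mini']
```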

API Protocol Version (required)

Controls both the request body format and the outbound authentication header. Choose the option that matches the provider's native API.

| Option | Request body shape | Auth header sent | Use for |
| --- | --- | --- | --- |
| OPENAI_V1 | OpenAI Chat Completions | Authorization: Bearer &lt;key&gt; | OpenAI, NVIDIA NIM, vLLM, OpenRouter, any OpenAI-compatible endpoint |
| GEMINI_V1_BETA | Google Gemini | x-goog-api-key: &lt;key&gt; | Google Gemini native API only |
| ANTHROPIC_V1 | Anthropic Messages | x-api-key: &lt;key&gt; + anthropic-version: 2023-06-01 | Anthropic Claude |
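The header behavior in the table can be expressed as a small mapping. This is an illustrative sketch of what HM sends outbound per protocol version, not HM's implementation; it is useful when debugging with curl against the upstream directly.

```python
def auth_headers(protocol: str, api_key: str) -> dict:
    # Map each API Protocol Version to the outbound auth header(s) it implies.
    if protocol == "OPENAI_V1":
        return {"Authorization": f"Bearer {api_key}"}
    if protocol == "GEMINI_V1_BETA":
        return {"x-goog-api-key": api_key}
    if protocol == "ANTHROPIC_V1":
        return {"x-api-key": api_key, "anthropic-version": "2023-06-01"}
    raise ValueError(f"unknown protocol: {protocol}")

print(auth_headers("OPENAI_V1", "sk-test"))
# {'Authorization': 'Bearer sk-test'}
```

Note that the key value alone goes in the form's API Key field; the prefixes shown here are added by HM.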

Allow Insecure Connection (optional, default off)

Disables TLS certificate verification on outbound calls to the upstream. Enable this only if the upstream uses a self-signed certificate or a certificate signed by a CA not trusted by the HM cluster.

Warning

This setting is create-only. You cannot toggle it after registration. If you need to change it, delete the service and re-register. Only enable this for development environments or trusted self-signed certificates — disabling TLS verification reduces security.

After clicking Register

HM validates the endpoint before creating any infrastructure. For OpenAI (OPENAI_V1), Anthropic (ANTHROPIC_V1), and Google Gemini (GEMINI_V1_BETA), HM performs a live connectivity probe that checks both reachability and credential validity. If the endpoint is unreachable or the API key is rejected, registration fails immediately with an error — no resources are created.

Note

For some OPENAI_V1 providers — such as NVIDIA NIM, HuggingFace, and OpenRouter — the models endpoint does not require authentication. A connectivity probe is still performed, but a wrong API key may still return HTTP 200. Key validity is not guaranteed at registration time for these providers.

The displayed status (Ready, Failed, or Unknown) is refreshed every 30 seconds by a background health check.

Use the service

Once the service is ready, it is available to:

  • HM chatbot — the chatbot picks up services tagged with openai-chat-completions automatically.
  • Pipeline Designer — registered external models appear in the model picker alongside HM-hosted models. For details, see External inference services in Pipeline Designer.
  • Gen AI Builder — models are available as inference targets in Gen AI Builder pipelines once registered.

Retrieve inference service details

Click a service name in the Inference Services list to open its detail view, which shows the service's configuration, current status, and available actions.

Details

| Field | Description |
| --- | --- |
| External Service Name | The unique identifier assigned at registration. |
| Model Name | The model identifier forwarded to the upstream provider. |
| Model Base URL | The upstream endpoint the proxy routes requests to. |
| API Protocol Version | The request format and authentication header in use (OPENAI_V1, GEMINI_V1_BETA, or ANTHROPIC_V1). |
| Functions | The capability tags currently assigned to the service. |
| Allow Insecure Connection | Whether TLS certificate verification is disabled for outbound calls. |
| Status | Current health of the service: Ready, Failed, or Unknown. |

Update inference service parameters

To edit a registered service, either open the service detail page and select Quick Actions → Edit Service, or click the pencil icon on the Inference Services list.

Editable fields

The following fields can be updated without deleting and re-registering:

  • Functions — add or remove capability tags at any time.
  • API Protocol Version — change the request format and auth header, for example if you migrate to a different provider API.
  • API Key — replace the key at any time. HM deletes the existing Kubernetes secret and creates a new one automatically.

Note

HM runs a connectivity probe before applying the update. If the endpoint is unreachable or the new API key is rejected, the update fails and no changes are applied.

Locked fields

The following fields are locked after registration and cannot be updated. Delete and re-register the service to change them:

  • External Service Name
  • Model Name
  • Model Base URL
  • Allow Insecure Connection

Deregister an external inference service

Warning

Deregister is permanent. All associated Kubernetes resources (namespace, secret, ServingRuntime, InferenceService) are removed immediately. This action cannot be undone.

How to deregister

To delete a service, either open the service detail page and select Quick Actions → Deregister External Inference Service, or click the trash icon on the Inference Services list.

HM blocks deletion if the service is currently referenced by one or more pipelines. Remove or update those pipelines first, then retry.

When deletion succeeds, HM:

  1. Removes all Kubernetes resources backing the service (including the API key secret).
  2. Removes the service record from the database.
  3. Clears all tags associated with the service.