Enabling a self-hosted model for the Migration Portal AI Copilot v1.3.2

You can use a self-hosted AI Factory model to serve the AI Copilot. This example uses NVIDIA NIM to serve the requests and Llama 3 to process them and generate answers.

Warning

There are significant safety implications to consider when using self-hosted models with Migration Portal AI Copilot.

The models provided by third-party vendors like OpenAI and Azure OpenAI include content filtering and other safeguards that are designed to reduce the risk of the model responding to, generating, or contributing to unsafe content. When you use self-hosted models, these protections aren't present.

In addition, because you're hosting the models, you bear responsibility for the risks and potential liability associated with any unsafe behavior.

Prerequisites

Prepare the resources your environment requires to deploy the Migration Portal AI Copilot with a self-hosted solution:

  • You have administrative access to the HM environment.

  • Your organization has created a chat completion model and a text embeddings model with the HM AI Factory and has provided the endpoints for each model, which you can set as environment variables:

    export COMPLETIONS_SVC=llama-3-3-nemotron-super-49b-v1
    export EMBEDDINGS_SVC=llama-3-2-nv-embedqa-1b-v2
    
    export COMPLETIONS_ENDPOINT=$(kubectl get inferenceservice $COMPLETIONS_SVC -o jsonpath='{.status.url}')
    export EMBEDDINGS_ENDPOINT=$(kubectl get inferenceservice $EMBEDDINGS_SVC -o jsonpath='{.status.url}')
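
    Optionally, you can confirm that both endpoints respond before you continue. This is only a sanity check, and it assumes the inference services expose the standard OpenAI-compatible /v1/models route:

    curl -s "${COMPLETIONS_ENDPOINT}/v1/models"
    curl -s "${EMBEDDINGS_ENDPOINT}/v1/models"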

Enabling the AI Copilot

  1. Check if the edb-migration-copilot namespace exists:

    kubectl get namespaces edb-migration-copilot

    The namespace is created during HM installation. If you're enabling the AI Copilot before installing HM, you must create the namespace in advance.

  2. If the edb-migration-copilot namespace doesn't exist yet, create it:

    kubectl create ns edb-migration-copilot

  3. Set the following environment variables, which you'll use in the next step to point the secret at the model endpoints:

    export OPENAI_API_BASE=${COMPLETIONS_ENDPOINT}/v1
    export OPENAI_EMBEDDINGS_API_BASE=${EMBEDDINGS_ENDPOINT}/v1
    export OPENAI_API_KEY=<openai api key> # set to a placeholder value like `noop` if the models are deployed in a way that doesn't require an API key

    Note

    The AI Copilot uses OpenAI-compatible APIs to communicate with all models, including self-hosted ones. That's why some configuration parameters contain openai in their names, even when you're using a different model to serve queries.
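
    For example, you can test the completions endpoint directly with a standard OpenAI-style chat completions request. This is only a sketch: the model value must match the model name that your inference service reports at its /v1/models route.

    curl -s "${OPENAI_API_BASE}/chat/completions" \
        -H "Authorization: Bearer ${OPENAI_API_KEY}" \
        -H "Content-Type: application/json" \
        -d '{"model": "nvidia/llama-3.3-nemotron-super-49b-v1", "messages": [{"role": "user", "content": "Hello"}]}'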

  4. Create the ai-vendor-secrets secret and configure it to point at the models' endpoints:

    kubectl create secret generic ai-vendor-secrets \
        --namespace=edb-migration-copilot \
        --type=opaque \
        --from-literal=AI_VENDOR=NIM \
        --from-literal=RAGCHEW_OPENAI_API_BASE="${OPENAI_API_BASE}" \
        --from-literal=RAGCHEW_OPENAI_EMBEDDINGS_API_BASE="${OPENAI_EMBEDDINGS_API_BASE}" \
        --from-literal=OPENAI_API_KEY="${OPENAI_API_KEY}"
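
    To confirm that the secret contains the expected keys, you can describe it:

    kubectl describe secret ai-vendor-secrets -n edb-migration-copilot
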
  5. Create a file called migration-portal-values.yaml with the following Helm value, which overrides the default AI vendor secrets with the secret you created in the previous step:

    parameters:
      edb-migration-copilot:
        ai_vendor_secrets: ai-vendor-secrets

  6. Update the HM installation file to include the AI Copilot configuration. Either update the YAML values you used for installation or run the helm upgrade command with the AI Copilot configuration parameters.
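
    For example, a helm upgrade that layers the new values file on top of the values you already use might look like the following. The release name, chart reference, namespace, and original values file are placeholders for whatever you used when installing HM:

    helm upgrade <release-name> <hm-chart> \
        --namespace <hm-namespace> \
        -f <original-values.yaml> \
        -f migration-portal-values.yaml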

  7. Restart the edb-migration-copilot services to trigger a reconciliation of the new values with the system.

    kubectl rollout restart deployment edb-migration-copilot -n edb-migration-copilot
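
    You can wait for the restart to finish before using the AI Copilot again. This assumes the services run as a deployment with the same name:

    kubectl rollout status deployment edb-migration-copilot -n edb-migration-copilot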

Additional configuration for air-gapped installations (experimental)

When running in an air-gapped environment, the Migration Portal AI Copilot fails when it tries to fetch pretrained tokenizer data from the Hugging Face Hub.

Set the airgapped_mode parameter in your Helm values so that the AI Copilot uses a local snapshot of the tokenizer data instead.
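
The exact key layout for this parameter isn't documented here. As a sketch that follows the structure of migration-portal-values.yaml from the previous procedure and uses the parameter names referenced in the note below, the override might look like this:

    parameters:
      edb-migration-copilot:
        airgapped_mode: '"true"'
        tokenizer_model: nvidia/llama-3.3-nemotron-super-49b-v1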

Restart the edb-migration-copilot services.
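
As in the previous procedure, and again assuming the services run as a deployment named edb-migration-copilot:

    kubectl rollout restart deployment edb-migration-copilot -n edb-migration-copilot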

Important

The Migration Portal AI Copilot ships with tokenizer data only for the nvidia/llama-3.3-nemotron-super-49b-v1 pretrained tokenizer. Setting airgapped_mode: '"true"' with tokenizer_model set to any other model causes the Migration Portal AI Copilot to fail.