Embedding Model Configuration

June 6, 2026

This guide details how to configure the embedding models used for vector search. You can set the embedding model using the app.embeddingModel configuration key, the DOCS_MCP_EMBEDDING_MODEL environment variable, or the --embedding-model CLI flag.

Model Selection

If you leave the model unset but provide OPENAI_API_KEY, the server defaults to text-embedding-3-small.
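The fallback rule can be sketched as a small shell function. This only illustrates the documented behavior and is not the server's code. The name effective_embedding_model is hypothetical, and the sketch looks only at environment variables, not at the config file or the CLI flag. This guide does not say what the server does when neither variable is set, so the function simply reports failure in that case.

```shell
# Hypothetical helper: prints the model selected by the documented rule.
effective_embedding_model() {
  if [ -n "${DOCS_MCP_EMBEDDING_MODEL:-}" ]; then
    # A model was configured explicitly, so use it as given.
    printf '%s\n' "$DOCS_MCP_EMBEDDING_MODEL"
  elif [ -n "${OPENAI_API_KEY:-}" ]; then
    # No model set, but an OpenAI key is present: documented default.
    printf '%s\n' "text-embedding-3-small"
  else
    # Neither is set; this sketch makes no claim about server behavior here.
    return 1
  fi
}
```

With only OPENAI_API_KEY set, calling effective_embedding_model prints text-embedding-3-small.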

Supported Options:

  • text-embedding-3-small (default, OpenAI)
  • openai:snowflake-arctic-embed2 (OpenAI-compatible, e.g., Ollama)
  • vertex:text-embedding-004 (Google Vertex AI)
  • gemini:embedding-001 (Google Gemini)
  • aws:amazon.titan-embed-text-v1 (AWS Bedrock)
  • microsoft:text-embedding-ada-002 (Azure OpenAI)
  • Or any OpenAI-compatible model name
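The options above follow a provider:model pattern, with the prefix omitted for plain OpenAI-compatible names. The sketch below shows one way to split such a string in a script. Both helper names are hypothetical. Treating an unprefixed name as the openai provider is an assumption based on the list above, not a description of the server's parser.

```shell
# Hypothetical helpers, not part of docs-mcp-server.

# Prints the provider part of a model spec.
model_provider() {
  case "$1" in
    *:*) printf '%s\n' "${1%%:*}" ;;  # text before the first colon
    *)   printf '%s\n' "openai" ;;    # assumption: no prefix means OpenAI-compatible
  esac
}

# Prints the model-name part of a model spec.
model_name() {
  # Text after the first colon; the input is returned unchanged if it has none.
  printf '%s\n' "${1#*:}"
}

model_provider "vertex:text-embedding-004"   # prints "vertex"
model_name "vertex:text-embedding-004"       # prints "text-embedding-004"
```

Splitting on the first colon only keeps any later colons as part of the model name.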

Provider Configuration

Provider credentials use the provider-specific environment variables listed below.

| Variable | Description |
| --- | --- |
| DOCS_MCP_EMBEDDING_MODEL | Embedding model to use. |
| OPENAI_API_KEY | OpenAI API key for embeddings. |
| OPENAI_API_BASE | Custom OpenAI-compatible API endpoint (e.g., Ollama). |
| GOOGLE_API_KEY | Google API key for Gemini embeddings. |
| GOOGLE_APPLICATION_CREDENTIALS | Path to Google service account JSON for Vertex AI. |
| AWS_ACCESS_KEY_ID | AWS key for Bedrock embeddings. |
| AWS_SECRET_ACCESS_KEY | AWS secret for Bedrock embeddings. |
| AWS_REGION | AWS region for Bedrock. |
| AZURE_OPENAI_API_KEY | Azure OpenAI API key. |
| AZURE_OPENAI_API_INSTANCE_NAME | Azure OpenAI instance name. |
| AZURE_OPENAI_API_DEPLOYMENT_NAME | Azure OpenAI deployment name. |
| AZURE_OPENAI_API_VERSION | Azure OpenAI API version. |
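Before launching the server, a script can check that the variables for the chosen provider are present. The function below is a hypothetical pre-flight check built from the variable list above. It assumes every listed variable for a provider is required, which may be stricter than what the server enforces.

```shell
# Hypothetical pre-flight check, not part of docs-mcp-server.
# Prints each expected variable that is unset or empty in the environment.
missing_provider_vars() {
  case "$1" in
    openai)    required="OPENAI_API_KEY" ;;
    gemini)    required="GOOGLE_API_KEY" ;;
    vertex)    required="GOOGLE_APPLICATION_CREDENTIALS" ;;
    aws)       required="AWS_ACCESS_KEY_ID AWS_SECRET_ACCESS_KEY AWS_REGION" ;;
    microsoft) required="AZURE_OPENAI_API_KEY AZURE_OPENAI_API_INSTANCE_NAME AZURE_OPENAI_API_DEPLOYMENT_NAME AZURE_OPENAI_API_VERSION" ;;
    *)         echo "unknown provider: $1" >&2; return 2 ;;
  esac
  for name in $required; do
    # printenv only sees exported variables, which is what a child process inherits.
    [ -n "$(printenv "$name")" ] || printf '%s\n' "$name"
  done
}
```

For example, missing_provider_vars aws prints nothing when all three AWS variables are exported, and prints the name of each missing one otherwise.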

Examples

Here are complete configuration examples for different embedding providers.

OpenAI (Default)

OPENAI_API_KEY="sk-proj-your-openai-api-key" \
DOCS_MCP_EMBEDDING_MODEL="text-embedding-3-small" \
npx @arabold/docs-mcp-server@latest

Ollama (Local)

Run local models compatible with the OpenAI API format.

OPENAI_API_KEY="ollama" \
OPENAI_API_BASE="http://localhost:11434/v1" \
DOCS_MCP_EMBEDDING_MODEL="openai:nomic-embed-text" \
npx @arabold/docs-mcp-server@latest

LM Studio (Local)

Connect to LM Studio's local inference server.

OPENAI_API_KEY="lmstudio" \
OPENAI_API_BASE="http://localhost:1234/v1" \
DOCS_MCP_EMBEDDING_MODEL="text-embedding-qwen3-embedding-4b" \
npx @arabold/docs-mcp-server@latest

Google Gemini

Use Google's Gemini API directly.

GOOGLE_API_KEY="your-google-api-key" \
DOCS_MCP_EMBEDDING_MODEL="gemini:embedding-001" \
npx @arabold/docs-mcp-server@latest

Google Vertex AI

For enterprise GCP deployments.

GOOGLE_APPLICATION_CREDENTIALS="/path/to/your/gcp-service-account.json" \
DOCS_MCP_EMBEDDING_MODEL="vertex:text-embedding-004" \
npx @arabold/docs-mcp-server@latest

AWS Bedrock

Use Amazon Titan or other Bedrock-hosted models.

AWS_ACCESS_KEY_ID="your-aws-access-key-id" \
AWS_SECRET_ACCESS_KEY="your-aws-secret-access-key" \
AWS_REGION="us-east-1" \
DOCS_MCP_EMBEDDING_MODEL="aws:amazon.titan-embed-text-v1" \
npx @arabold/docs-mcp-server@latest

Azure OpenAI

Connect to your private Azure OpenAI deployment.

AZURE_OPENAI_API_KEY="your-azure-openai-api-key" \
AZURE_OPENAI_API_INSTANCE_NAME="your-instance-name" \
AZURE_OPENAI_API_DEPLOYMENT_NAME="your-deployment-name" \
AZURE_OPENAI_API_VERSION="2024-02-01" \
DOCS_MCP_EMBEDDING_MODEL="microsoft:text-embedding-ada-002" \
npx @arabold/docs-mcp-server@latest

Changing the Embedding Model

When you change the embedding model or vector dimension after initial setup, existing embedding vectors become semantically incompatible with the new configuration. The server detects this automatically by tracking the active model identity in a metadata table.
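The detection can be pictured as a comparison between the model identity recorded in the database and the one currently configured. The sketch below is illustrative only. The function name and the idea of passing the stored values as arguments are assumptions, and this guide does not show the server's metadata schema.

```shell
# Hypothetical comparison; succeeds (exit status 0) when the configuration changed.
embedding_config_changed() {
  prev_model=$1 prev_dim=$2 cur_model=$3 cur_dim=$4
  # A different model OR a different dimension counts as a change.
  [ "$prev_model" != "$cur_model" ] || [ "$prev_dim" != "$cur_dim" ]
}

if embedding_config_changed "openai:text-embedding-3-small" 1536 \
                            "openai:text-embedding-ada-002" 1536; then
  echo "Embedding model change detected"
fi
```

Both models in this call use 1536 dimensions, yet the differing model name alone counts as a change.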

What Happens on Model Change

Interactive mode (TTY connected): The server displays a warning and prompts for confirmation before proceeding. Rejecting the prompt aborts startup with no changes made.

⚠️  Embedding model change detected:
   Previous: openai:text-embedding-3-small (1536 dimensions)
   Current:  openai:text-embedding-ada-002 (1536 dimensions)

   All existing embedding vectors will be invalidated.
   Libraries must be re-scraped to restore vector search.
   Full-text search will continue working for all existing documents.

   Proceed with model change? (y/N)

Non-interactive mode (MCP/stdio, CI/CD): The server aborts startup with a descriptive error message. To proceed, start the server once in interactive mode and confirm the model change.
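A launch script can check in advance which of the two paths applies. The sketch below uses the standard test -t check on standard input and standard output. Whether the server inspects the same file descriptors is an assumption, and has_tty is a hypothetical helper.

```shell
# Hypothetical helper: succeeds when both stdin and stdout are terminals.
has_tty() {
  [ -t 0 ] && [ -t 1 ]
}

if has_tty; then
  echo "Terminal attached: a model-change prompt can be answered here."
else
  echo "No terminal: confirm any model change interactively before running unattended."
fi
```

The helper returns a status instead of printing a value because capturing output with $(...) replaces stdout with a pipe, which would make the check always fail.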

After Confirming a Model Change

  • All stored embedding vectors are set to NULL
  • The vector search index (documents_vec) is recreated empty with the new dimension
  • Full-text search continues working for all existing documents
  • Libraries must be re-scraped to regenerate embeddings with the new model

Vector Dimension Override

The vector dimension defaults to the model's native dimension (e.g., 1536 for text-embedding-3-small). For unknown OpenAI-compatible models, the server probes the model on the first successful initialization to detect its native dimension, stores the detected size in database metadata, and reuses it on later startups for the same model without probing again.

To override the dimension, set embeddings.vectorDimension in the config file or the DOCS_MCP_EMBEDDINGS_VECTOR_DIMENSION environment variable. The value must be a positive integer (minimum 1).
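The positive-integer rule can be checked in a script before the override is exported. The validator below is a hypothetical sketch of that rule, not the server's own validation. It rejects anything that is not a plain run of digits with a value of at least 1.

```shell
# Hypothetical validator for DOCS_MCP_EMBEDDINGS_VECTOR_DIMENSION.
is_valid_dimension() {
  case "$1" in
    ''|*[!0-9]*) return 1 ;;  # empty, negative, fractional, or non-numeric
  esac
  [ "$1" -ge 1 ]              # digits only at this point, so this rejects zero
}

if is_valid_dimension "768"; then
  export DOCS_MCP_EMBEDDINGS_VECTOR_DIMENSION="768"
fi
```

The value 768 is only an example. Changing the dimension for an existing database triggers the model-change handling described in the previous section.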