2.2.23-Backend-MLX.md
June 3, 2026 ยท View on GitHub
MLX
Handle:
mlx
URL: http://localhost:34930
MLX is Apple's machine-learning framework for Apple Silicon. Harbor integrates it through mlx-lm: the mlx-lm server runs natively on the macOS host for Metal access, while Harbor starts a Caddy proxy container so other services can reach it over the Compose network.
This service is intended for Apple Silicon systems. On other platforms the proxy can still point at an externally managed compatible endpoint, but Harbor cannot provide MLX acceleration inside a Linux container.
Starting
harbor up mlx
When HARBOR_MLX_MANAGE_HOST=true, Harbor automatically starts the mlx-lm server on the host before the proxy comes up. The server is launched via uv run python -m mlx_lm.server from the services/mlx/ workspace. If the configured model is not cached locally, mlx-lm downloads it from HuggingFace on first use.
Start a frontend against MLX:
harbor up webui mlx
Use it with host tools:
harbor launch --backend mlx --model mlx-qwen3.5-4b codex
Stopping
harbor down mlx
When HARBOR_MLX_MANAGE_HOST=true, Harbor stops the host mlx-lm process before stopping the proxy container. harbor mlx stop stops only the host server; use harbor down mlx to stop both the host runner and the proxy.
Configuration
Environment Variables
Following options can be set via harbor config:
# Harbor proxy port
HARBOR_MLX_HOST_PORT 34930
# Proxy image
HARBOR_MLX_IMAGE caddy
HARBOR_MLX_VERSION 2-alpine
# Host workspace and upstream endpoint
HARBOR_MLX_WORKSPACE ./services/mlx
HARBOR_MLX_UPSTREAM_URL http://host.docker.internal:8095
HARBOR_MLX_RUNNER_PORT 8095
# Default model
HARBOR_MLX_MODEL mlx-qwen3.5-4b
HARBOR_MLX_HF_PATH mlx-community/Qwen3.5-4B-4bit
# Host lifecycle
HARBOR_MLX_MANAGE_HOST true
Volumes
The Harbor mlx service mounts only services/mlx/Caddyfile into the proxy container. The host runner uses:
services/mlx/pyproject.toml- project file declaring themlx-lmdependencyservices/mlx/.venv/- virtual environment managed byuvservices/mlx/logs/- host runner log files
Model weights are stored in the HuggingFace cache (~/.cache/huggingface by default).
Model Management
harbor models ls --source mlx
harbor models pull --source mlx mlx-community/Qwen3.5-4B-4bit
The equivalent source-subcommand form is also supported:
harbor models mlx pull mlx-community/Qwen3.5-4B-4bit
harbor mlx pull downloads model repos from HuggingFace into the local cache via hf download. Model removal is not supported through Harbor; manage the HuggingFace cache manually.
To change the default model:
harbor config set mlx.hf.path mlx-community/Qwen3.5-4B-4bit
harbor config set mlx.model mlx-qwen3.5-4b
The mlx-lm server must be restarted after changing the model:
harbor mlx stop
harbor mlx start
API
Harbor exposes MLX at:
http://localhost:34930/v1
Containers use:
http://mlx:8080/v1
The server provides OpenAI-compatible endpoints: /v1/chat/completions, /v1/completions, /v1/models.
Integrations
mlx is exposed to Harbor containers at http://mlx:8080/v1 (integration API key sk-mlx). Harbor wires the same consumer set as oMLX through compose.x.*.mlx.yml overlays, including webui, chatui, aider, boost, litellm, bifrost, optillm, opint, astrbot, cognee, mindsdb, mi, ml-intern, npcsh, open-design, opennotebook, openterminal, anythingllm, cmdh, hermes, plandex, sillytavern, and traefik when enabled.
harbor up webui mlx
Troubleshooting
harbor mlx status
harbor mlx logs
harbor logs mlx
mlx-lm fails to start
Check the host runner log:
harbor mlx logs
Ensure uv is installed and that the services/mlx/ workspace has a valid pyproject.toml. On first run, uv creates a virtual environment and installs mlx-lm automatically.
Proxy is unhealthy
Check the host runner:
harbor mlx status
curl http://localhost:8095/v1/models
If you manage mlx-lm yourself, set:
harbor config set mlx.manage.host false
harbor config set mlx.upstream.url http://host.docker.internal:8095