otari-cli
June 10, 2026
Command-line interface for otari, the OpenAI-compatible LLM gateway you own and run yourself.
otari-cli is a thin command-line wrapper over the otari Python client SDK. It talks to a self-hosted otari gateway or the hosted platform at otari.ai.
Installation
Requirements
- Python 3.11 or newer
Install
pip install otari-cli
This installs the otari console command.
Authentication
otari-cli reads the same environment variables as the otari SDK, so it works in two modes: against the hosted platform or against a self-hosted gateway. Flags always override the environment.
| Variable | Mode | Purpose |
|---|---|---|
| OTARI_AI_TOKEN | Platform | Bearer token; base URL defaults to https://api.otari.ai. |
| GATEWAY_API_BASE | Self-hosted | Gateway base URL (required for self-hosted). |
| GATEWAY_API_KEY | Self-hosted | Virtual API key (sent via the Otari-Key header). |
| GATEWAY_ADMIN_KEY | Either | Admin key for control-plane commands (keys, usage). |
Equivalent flags: --token, --api-base, --api-key, --admin-key.
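The precedence rule ("flags always override the environment") can be pictured with a short Python sketch. This is not otari-cli's actual implementation: `FLAG_TO_ENV` and `resolve_settings` are hypothetical names, and the flag-to-variable pairing is inferred from the order in which the flags and variables are listed above.

```python
from __future__ import annotations

import os
from collections.abc import Mapping

# Assumed pairing of each flag with its environment variable (inferred, not confirmed).
FLAG_TO_ENV = {
    "--token": "OTARI_AI_TOKEN",
    "--api-base": "GATEWAY_API_BASE",
    "--api-key": "GATEWAY_API_KEY",
    "--admin-key": "GATEWAY_ADMIN_KEY",
}


def resolve_settings(
    flags: Mapping[str, str | None],
    env: Mapping[str, str] = os.environ,
) -> dict[str, str | None]:
    """Return one value per flag: the flag if given, else the env var, else None."""
    resolved: dict[str, str | None] = {}
    for flag, env_name in FLAG_TO_ENV.items():
        flag_value = flags.get(flag)
        # A flag passed on the command line wins over the environment.
        resolved[flag] = flag_value if flag_value is not None else env.get(env_name)
    return resolved
```

With `GATEWAY_API_BASE` set in the environment and `--api-base http://localhost:8000` on the command line, this sketch resolves the base URL to the flag value, while `--api-key` still falls back to `GATEWAY_API_KEY`.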
Usage
# Show help and the available commands
otari --help
# Check that the configured gateway is reachable
otari --api-base http://localhost:8000 health
# List the models the gateway can route to
otari models
# Create a chat completion
otari completion -m openai:gpt-4o-mini "Write a haiku about gateways."
# Stream the response token by token
otari completion -m openai:gpt-4o-mini --stream "Tell me a short story."
# Emit machine-readable JSON instead of formatted output
otari --json models
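Because --json is a global flag placed before the command, the CLI can be driven from a script. The following is a minimal Python sketch, assuming an installed and configured otari command; `otari_json` is a hypothetical helper, and the shape of the returned JSON is deliberately not assumed.

```python
import json
import subprocess
from typing import Any, Callable, Sequence


def otari_json(args: Sequence[str], run: Callable[..., Any] = subprocess.run) -> Any:
    """Run `otari --json <args...>` and return the decoded JSON document."""
    argv = ["otari", "--json", *args]
    # check=True raises CalledProcessError if the CLI exits with a non-zero status.
    result = run(argv, capture_output=True, text=True, check=True)
    return json.loads(result.stdout)


# Example (needs a reachable gateway, so it is left commented out):
# models = otari_json(["models"])
```

The `run` parameter exists only so the helper can be exercised without a live gateway, for example by passing a stand-in that returns canned output.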
Generation commands
otari completion -m openai:gpt-4o-mini "Hello" # chat completions (+ --stream)
otari message -m anthropic:claude-3-5-sonnet "Hello" # Anthropic-style messages (+ --stream)
otari response -m openai:gpt-4o-mini "Hello" # Responses API (+ --stream)
otari embedding -m openai:text-embedding-3-small "a sentence"
otari moderation -m openai:omni-moderation-latest "some text"
otari rerank -m cohere:rerank-v3.5 -q "query" "doc one" "doc two"
otari models
otari batches create -m openai:gpt-4o-mini --input requests.jsonl
otari batches list --provider openai
otari batches results <batch-id> --provider openai
The --json and --stream flags compose: with both set, streaming commands emit
one JSON event object per chunk (newline-delimited) rather than a single document.
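Newline-delimited output can be consumed one line at a time as it arrives. Here is a minimal Python sketch; `iter_events` and `stream_events` are hypothetical helpers, and the `delta` field in the sample is made up for illustration, since the event schema is not described here.

```python
import json
import subprocess
from collections.abc import Iterable, Iterator, Sequence
from typing import Any


def iter_events(lines: Iterable[str]) -> Iterator[Any]:
    """Yield one decoded JSON value per non-blank line."""
    for line in lines:
        line = line.strip()
        if line:
            yield json.loads(line)


def stream_events(argv: Sequence[str]) -> Iterator[Any]:
    """Run a command and yield its NDJSON events as they are printed."""
    with subprocess.Popen(argv, stdout=subprocess.PIPE, text=True) as proc:
        assert proc.stdout is not None
        yield from iter_events(proc.stdout)


# Illustrative input only; real events may carry different fields.
sample = ['{"delta": "Hel"}\n', "\n", '{"delta": "lo"}\n']
events = list(iter_events(sample))
```

Against a live gateway, `stream_events(["otari", "--json", "completion", "-m", "openai:gpt-4o-mini", "--stream", "Hello"])` would yield each chunk as soon as the CLI prints it.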
Control-plane commands (self-hosted / admin)
These require an admin credential (GATEWAY_ADMIN_KEY or --admin-key) and a self-hosted gateway:
# Keys
otari keys list
otari keys create --name prod --user u_123 --metadata '{"team": "ml"}'
otari keys update <key-id> --inactive
otari keys delete <key-id>
# Users, budgets, pricing
otari users create u_123 --alias "ML team" --budget b_1
otari budgets create --max-budget 100 --duration-sec 86400
otari pricing set openai:gpt-4o-mini --input-price 0.15 --output-price 0.60
# Usage
otari usage list --user u_123 --start 2026-01-01 --end 2026-01-31
otari users usage u_123
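When these commands are generated from a script, the JSON passed to --metadata and the seconds passed to --duration-sec are easy to get wrong by hand. The Python sketch below builds the two invocations shown above from structured values; `keys_create_argv` and `budgets_create_argv` are hypothetical helpers, and only flags that appear in the examples above are used.

```python
import json
import shlex
from datetime import timedelta


def keys_create_argv(name: str, user: str, metadata: dict) -> list[str]:
    """Build `otari keys create`, serializing metadata to a JSON string."""
    return [
        "otari", "keys", "create",
        "--name", name,
        "--user", user,
        "--metadata", json.dumps(metadata),
    ]


def budgets_create_argv(max_budget: int, period: timedelta) -> list[str]:
    """Build `otari budgets create`, converting the period to whole seconds."""
    return [
        "otari", "budgets", "create",
        "--max-budget", str(max_budget),
        "--duration-sec", str(int(period.total_seconds())),
    ]


# shlex.join quotes each argument so the line can be pasted into a POSIX shell.
keys_cmd = shlex.join(keys_create_argv("prod", "u_123", {"team": "ml"}))
budget_cmd = shlex.join(budgets_create_argv(100, timedelta(days=1)))
print(keys_cmd)    # otari keys create --name prod --user u_123 --metadata '{"team": "ml"}'
print(budget_cmd)  # otari budgets create --max-budget 100 --duration-sec 86400
```

Passing the argument list directly to `subprocess.run` avoids shell quoting altogether; `shlex.join` is only needed when the command has to be displayed or logged.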
Development
otari-cli uses uv to manage its environment and run development tasks.
uv sync --extra dev # install with dev dependencies
uv run otari --help # run the CLI from source
uv run ruff check . # lint
uv run mypy src/ # type check (strict)
uv run pytest # tests
See CONTRIBUTING.md and AGENTS.md for the full workflow and conventions.
Commands
| Group | Commands |
|---|---|
| Generation | completion, message, response (each with --stream), embedding, moderation, rerank, models |
| Batches | batches create, batches retrieve, batches list, batches cancel, batches results |
| Control plane | keys, users, budgets, pricing (CRUD), usage list, users usage |
| Diagnostics | health |
Run otari <command> --help for the full options of any command.
License
otari-cli is licensed under the Apache License 2.0. See the LICENSE file for details.