otari-cli

June 10, 2026

Command-line interface for otari, the OpenAI-compatible LLM gateway you own and run yourself.

otari gateway | Python SDK | Documentation

otari-cli is a thin command-line wrapper over the otari Python client SDK. It talks to a self-hosted otari gateway or the hosted platform at otari.ai.

Installation

Requirements

Python 3.11 or newer

Install

pip install otari-cli

This installs the otari console command.

Authentication

otari-cli reads the same environment variables as the otari SDK and works in two modes: against the hosted platform or against a self-hosted gateway. Flags always override the environment.

Variable	Mode	Purpose
`OTARI_AI_TOKEN`	Platform	Bearer token; base URL defaults to `https://api.otari.ai`.
`GATEWAY_API_BASE`	Self-hosted	Gateway base URL (required for self-hosted).
`GATEWAY_API_KEY`	Self-hosted	Virtual API key (sent via the `Otari-Key` header).
`GATEWAY_ADMIN_KEY`	Either	Admin key for control-plane commands (such as `keys` and `usage`).

Equivalent flags: --token, --api-base, --api-key, --admin-key.
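
The precedence rule (flag first, then environment) can be sketched in a few lines of Python. This is an illustration only, not otari-cli's actual implementation: the `resolve` helper and the `http://gateway.internal:8000` URL are hypothetical, and only the variable and flag names come from the table above.

```python
from __future__ import annotations

import os
from collections.abc import Mapping


def resolve(
    flag_value: str | None,
    env_name: str,
    env: Mapping[str, str] = os.environ,
) -> str | None:
    """Return the flag value when one was passed, else the environment value, else None."""
    if flag_value is not None:
        return flag_value
    return env.get(env_name)


# Hypothetical environment, used only for this illustration.
env = {"GATEWAY_API_BASE": "http://gateway.internal:8000"}

# --api-base given on the command line: the flag wins.
print(resolve("http://localhost:8000", "GATEWAY_API_BASE", env))

# No flag given: fall back to GATEWAY_API_BASE.
print(resolve(None, "GATEWAY_API_BASE", env))
```

The same lookup would apply to each flag and variable pair, for example --api-key with `GATEWAY_API_KEY`.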

Usage

# Show help and the available commands
otari --help

# Check that the configured gateway is reachable
otari --api-base http://localhost:8000 health

# List the models the gateway can route to
otari models

# Create a chat completion
otari completion -m openai:gpt-4o-mini "Write a haiku about gateways."

# Stream the response token by token
otari completion -m openai:gpt-4o-mini --stream "Tell me a short story."

# Emit machine-readable JSON instead of formatted output
otari --json models

Generation commands

otari completion -m openai:gpt-4o-mini "Hello"        # chat completions (+ --stream)
otari message -m anthropic:claude-3-5-sonnet "Hello"  # Anthropic-style messages (+ --stream)
otari response -m openai:gpt-4o-mini "Hello"          # Responses API (+ --stream)
otari embedding -m openai:text-embedding-3-small "a sentence"
otari moderation -m openai:omni-moderation-latest "some text"
otari rerank -m cohere:rerank-v3.5 -q "query" "doc one" "doc two"
otari models
otari batches create -m openai:gpt-4o-mini --input requests.jsonl
otari batches list --provider openai
otari batches results <batch-id> --provider openai

The --json and --stream flags compose: with both set, streaming commands emit one JSON event object per chunk (newline-delimited) rather than a single document.
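
Newline-delimited JSON is easy to consume from a script because each line parses on its own. The sketch below shows one way to do that in Python. The `iter_events` helper is hypothetical, and the sample payloads are made up for the illustration: the fields of a real event are not documented here, so the code treats each event as an opaque JSON value.

```python
import json
from collections.abc import Iterable, Iterator
from typing import Any


def iter_events(lines: Iterable[str]) -> Iterator[Any]:
    """Yield one parsed JSON value per non-blank line of newline-delimited JSON."""
    for line in lines:
        line = line.strip()
        if line:  # skip blank lines between events
            yield json.loads(line)


# Made-up sample input standing in for the CLI's streamed output.
sample = ['{"index": 0}\n', "\n", '{"index": 1}\n']
for event in iter_events(sample):
    print(event)
```

To process real output, pass `sys.stdin` to `iter_events` and pipe the CLI into the script. Because the helper is a generator, each event is handled as soon as its line arrives rather than after the stream ends.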

Control-plane commands (self-hosted / admin)

These commands require an admin credential (GATEWAY_ADMIN_KEY or --admin-key) and a self-hosted gateway:

# Keys
otari keys list
otari keys create --name prod --user u_123 --metadata '{"team": "ml"}'
otari keys update <key-id> --inactive
otari keys delete <key-id>

# Users, budgets, pricing
otari users create u_123 --alias "ML team" --budget b_1
otari budgets create --max-budget 100 --duration-sec 86400
otari pricing set openai:gpt-4o-mini --input-price 0.15 --output-price 0.60

# Usage
otari usage list --user u_123 --start 2026-01-01 --end 2026-01-31
otari users usage u_123
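
When these commands are driven from a script, the --metadata value has to be valid JSON and survive shell quoting. The following sketch builds the argument list for the `otari keys create` example above and lets `json.dumps` and `shlex.join` handle both concerns. The `keys_create_argv` helper is hypothetical and not part of otari-cli or the SDK.

```python
from __future__ import annotations

import json
import shlex


def keys_create_argv(name: str, user: str, metadata: dict) -> list[str]:
    """Build the argument list for `otari keys create`, serialising metadata as JSON."""
    return [
        "otari", "keys", "create",
        "--name", name,
        "--user", user,
        "--metadata", json.dumps(metadata),
    ]


argv = keys_create_argv("prod", "u_123", {"team": "ml"})

# Render the command as a single shell-safe string for logging or copy-paste.
print(shlex.join(argv))
# prints: otari keys create --name prod --user u_123 --metadata '{"team": "ml"}'
```

Passing `argv` to `subprocess.run(argv, check=True)` runs the command without a shell, so no quoting is needed at all in that case.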

Development

otari-cli uses uv for dependency management.

uv sync --extra dev      # install with dev dependencies
uv run otari --help      # run the CLI from source
uv run ruff check .      # lint
uv run mypy src/         # type check (strict)
uv run pytest            # tests

See CONTRIBUTING.md and AGENTS.md for the full workflow and conventions.

Commands

Group	Commands
Generation	`completion`, `message`, `response` (each with `--stream`), `embedding`, `moderation`, `rerank`, `models`
Batches	`batches create`, `batches retrieve`, `batches list`, `batches cancel`, `batches results`
Control plane	`keys`, `users`, `budgets`, `pricing` (CRUD), `usage list`, `users usage`
Diagnostics	`health`

Run otari <command> --help for the full options of any command.

License

otari-cli is licensed under the Apache License 2.0. See the LICENSE file for details.