otari-cli

June 10, 2026

Command-line interface for otari, the OpenAI-compatible LLM gateway you own and run yourself.

otari gateway | Python SDK | Documentation

otari-cli is a thin command-line wrapper over the otari Python client SDK. It talks to a self-hosted otari gateway or the hosted platform at otari.ai.

Installation

Requirements

  • Python 3.11 or newer

Install

pip install otari-cli

This installs the otari console command.

Authentication

otari-cli reads the same environment variables as the otari SDK, so it works in two modes: against the hosted platform or against a self-hosted gateway. Command-line flags always override the environment.

Variable          | Mode        | Purpose
OTARI_AI_TOKEN    | Platform    | Bearer token; base URL defaults to https://api.otari.ai.
GATEWAY_API_BASE  | Self-hosted | Gateway base URL (required for self-hosted).
GATEWAY_API_KEY   | Self-hosted | Virtual API key (sent via the Otari-Key header).
GATEWAY_ADMIN_KEY | Either      | Admin key for control-plane commands (keys, usage).

Equivalent flags: --token, --api-base, --api-key, --admin-key.

Usage

# Show help and the available commands
otari --help

# Check that the configured gateway is reachable
otari --api-base http://localhost:8000 health

# List the models the gateway can route to
otari models

# Create a chat completion
otari completion -m openai:gpt-4o-mini "Write a haiku about gateways."

# Stream the response token by token
otari completion -m openai:gpt-4o-mini --stream "Tell me a short story."

# Emit machine-readable JSON instead of formatted output
otari --json models

Generation commands

otari completion -m openai:gpt-4o-mini "Hello"        # chat completions (+ --stream)
otari message -m anthropic:claude-3-5-sonnet "Hello"  # Anthropic-style messages (+ --stream)
otari response -m openai:gpt-4o-mini "Hello"          # Responses API (+ --stream)
otari embedding -m openai:text-embedding-3-small "a sentence"
otari moderation -m openai:omni-moderation-latest "some text"
otari rerank -m cohere:rerank-v3.5 -q "query" "doc one" "doc two"
otari models
otari batches create -m openai:gpt-4o-mini --input requests.jsonl
otari batches list --provider openai
otari batches results <batch-id> --provider openai

The --json and --stream flags compose: with both set, streaming commands emit one JSON event object per chunk (newline-delimited) rather than a single document.
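
Newline-delimited output of this kind can be consumed one line at a time. A minimal Python sketch, assuming each non-empty line is one complete JSON object; the iter_json_events helper is hypothetical, and the "delta" field in the sample is made up for illustration rather than taken from otari's event schema.

```python
import json
from collections.abc import Iterable, Iterator
from typing import Any


def iter_json_events(lines: Iterable[str]) -> Iterator[dict[str, Any]]:
    """Yield one parsed object per non-empty line of newline-delimited JSON."""
    for line in lines:
        line = line.strip()
        if line:  # skip blank lines between events
            yield json.loads(line)


# Sample input with a made-up event shape; real otari events may differ.
sample = '{"delta": "Once"}\n{"delta": " upon"}\n\n{"delta": " a time"}\n'
events = list(iter_json_events(sample.splitlines()))
print("".join(event["delta"] for event in events))
```

To read from a pipe instead of a string, sys.stdin can be passed as the lines argument, since iterating over it also yields one line at a time.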

Control-plane commands (self-hosted / admin)

These require an admin credential and a self-hosted gateway:

# Keys
otari keys list
otari keys create --name prod --user u_123 --metadata '{"team": "ml"}'
otari keys update <key-id> --inactive
otari keys delete <key-id>

# Users, budgets, pricing
otari users create u_123 --alias "ML team" --budget b_1
otari budgets create --max-budget 100 --duration-sec 86400
otari pricing set openai:gpt-4o-mini --input-price 0.15 --output-price 0.60

# Usage
otari usage list --user u_123 --start 2026-01-01 --end 2026-01-31
otari users usage u_123
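
The --metadata value in the keys create example is a JSON object passed as a single shell argument, so quoting matters when such commands are generated from a script. A small Python sketch, using only the flags shown above; the keys_create_command helper is hypothetical and only builds the command string, it does not run otari.

```python
import json
import shlex


def keys_create_command(name: str, user: str, metadata: dict[str, str]) -> str:
    """Build a shell-safe `otari keys create` command line as a string."""
    argv = [
        "otari", "keys", "create",
        "--name", name,
        "--user", user,
        # Serialize the metadata as JSON so it travels as one argument.
        "--metadata", json.dumps(metadata),
    ]
    # shlex.join quotes each argument that contains shell-special characters.
    return shlex.join(argv)


print(keys_create_command("prod", "u_123", {"team": "ml"}))
```

For these inputs the result matches the keys create line shown above, with the JSON wrapped in single quotes.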

Development

otari-cli uses uv for dependency management and for running development tasks.

uv sync --extra dev      # install with dev dependencies
uv run otari --help      # run the CLI from source
uv run ruff check .      # lint
uv run mypy src/         # type check (strict)
uv run pytest            # tests

See CONTRIBUTING.md and AGENTS.md for the full workflow and conventions.

Commands

Group         | Commands
Generation    | completion, message, response (each with --stream), embedding, moderation, rerank, models
Batches       | batches create, batches retrieve, batches list, batches cancel, batches results
Control plane | keys, users, budgets, pricing (CRUD), usage list, users usage
Diagnostics   | health

Run otari <command> --help for the full options of any command.

License

otari-cli is licensed under the Apache License 2.0. See the LICENSE file for details.