Python

July 1, 2026

Universal LLM API client for Python. Access 143 LLM providers through a single unified interface, with native async/await support, streaming responses, tool calling, and a type-safe API.

What This Package Provides

  • One provider surface — chat, streaming, embeddings, images, audio, search, OCR, tools, and structured output across the provider registry.
  • Provider/model routing — call models with the provider/model convention and keep provider-specific request code out of application paths.
  • Production controls — retries, fallback, rate limits, cache layers, budgets, health checks, OpenTelemetry spans, and redacted secrets.
  • Same core as every binding — Rust, Python, Node.js, Go, Java, PHP, Ruby, .NET, Elixir, WASM, Kotlin Android, Swift, Dart, Zig, and C FFI use the same Rust implementation.
  • Python package — native async/await, streaming, and typed request/response objects.

Installation

Package Installation

Install via pip:

pip install liter-llm

System Requirements

  • Python 3.10+ required
  • API keys via environment variables (e.g. OPENAI_API_KEY, ANTHROPIC_API_KEY)

Quick Start

Basic Chat

Send a message to any provider using the provider/model prefix:

import asyncio
import os

from liter_llm import create_client
from liter_llm._internal_bindings import ChatCompletionRequest

async def main() -> None:
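    # The API key is read from the environment and passed explicitly to the client.
    client = create_client(api_key=os.environ["OPENAI_API_KEY"])
    # The request is built from a JSON payload; the model name carries the provider prefix.
    request = ChatCompletionRequest.from_json(
        '{"model":"openai/gpt-4o","messages":[{"role":"user","content":"Hello!"}]}'
    )
    response = await client.chat(request)
    # Print the text of the first choice.
    print(response.choices[0].message.content)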

asyncio.run(main())
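
Hand-written JSON strings break as soon as the message contains quotes or newlines. One option, sketched here with the standard library only, is to build the payload as a dict and serialize it with `json.dumps`. The helper name `chat_request_json` is hypothetical; only `ChatCompletionRequest.from_json` comes from the example above.

```python
import json


def chat_request_json(model: str, user_message: str, **extra: object) -> str:
    """Serialize a single-message chat payload to a JSON string.

    Hypothetical helper: extra keyword arguments (for example stream=True)
    are merged into the top level of the payload.
    """
    payload: dict[str, object] = {
        "model": model,
        "messages": [{"role": "user", "content": user_message}],
    }
    payload.update(extra)
    return json.dumps(payload)


# Intended use with the API shown above:
#   request = ChatCompletionRequest.from_json(
#       chat_request_json("openai/gpt-4o", 'She said "hello"')
#   )
```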

Common Use Cases

Streaming Responses

Stream tokens in real time:

import asyncio
import os

from liter_llm import create_client
from liter_llm._internal_bindings import ChatCompletionRequest

async def main() -> None:
    client = create_client(api_key=os.environ["OPENAI_API_KEY"])
    request = ChatCompletionRequest.from_json(
        '{"model":"openai/gpt-4o","messages":[{"role":"user","content":"Tell me a story"}],"stream":true}'
    )
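    async for chunk in client.chat_stream(request):
        # Some chunks carry no choices or an empty delta, so guard before printing.
        if chunk.choices and chunk.choices[0].delta.content:
            print(chunk.choices[0].delta.content, end="", flush=True)
    # Finish the line once the stream ends.
    print()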

asyncio.run(main())
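
When the full text is needed after streaming finishes, the deltas can be accumulated instead of printed. The sketch below assumes only the chunk shape used in the example above (`chunk.choices[0].delta.content`); `collect_stream` and `fake_stream` are hypothetical names, and the fake stream stands in for `client.chat_stream(request)` so the snippet runs without network access.

```python
import asyncio
from collections.abc import AsyncIterator
from types import SimpleNamespace


async def collect_stream(chunks: AsyncIterator) -> str:
    """Join the text deltas of a chunk stream into one string."""
    parts: list[str] = []
    async for chunk in chunks:
        # Skip chunks with no choices or with an empty/None delta.
        if chunk.choices and chunk.choices[0].delta.content:
            parts.append(chunk.choices[0].delta.content)
    return "".join(parts)


async def fake_stream() -> AsyncIterator:
    """Stand-in for client.chat_stream(request), for illustration only."""
    for text in ["Once ", None, "upon ", "a time"]:
        delta = SimpleNamespace(content=text)
        yield SimpleNamespace(choices=[SimpleNamespace(delta=delta)])
    # A trailing chunk without choices, which the guard above ignores.
    yield SimpleNamespace(choices=[])


if __name__ == "__main__":
    print(asyncio.run(collect_stream(fake_stream())))
```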

Tool Calling

Define and invoke tools:

import asyncio
import json
import os

from liter_llm import create_client
from liter_llm._internal_bindings import ChatCompletionRequest

REQUEST = {
    "model": "openai/gpt-4o",
    "messages": [{"role": "user", "content": "What is the weather in Berlin?"}],
    "tools": [
        {
            "type": "function",
            "function": {
                "name": "get_weather",
                "description": "Get the current weather for a location",
                "parameters": {
                    "type": "object",
                    "properties": {"location": {"type": "string"}},
                    "required": ["location"],
                },
            },
        }
    ],
    "tool_choice": "auto",
}

async def main() -> None:
    client = create_client(api_key=os.environ["OPENAI_API_KEY"])
    request = ChatCompletionRequest.from_json(json.dumps(REQUEST))
    response = await client.chat(request)
    # tool_calls may be None when the model answers directly, hence the `or []`.
    for call in response.choices[0].message.tool_calls or []:
        print(f"Tool: {call.function.name}, Args: {call.function.arguments}")

asyncio.run(main())
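
The example above only prints the requested calls. Executing them typically means mapping each tool name to a local function. The following is a rough sketch under two assumptions: that `call.function.arguments` is a JSON-encoded string (as in the OpenAI wire format), and that the `TOOLS` registry, `get_weather` body, and `run_tool_calls` helper are all hypothetical application code, not part of liter-llm.

```python
import json
from collections.abc import Callable
from types import SimpleNamespace


def get_weather(location: str) -> str:
    # Placeholder implementation; a real tool would query a weather service.
    return f"Sunny in {location}"


# Hypothetical registry mapping tool names to local callables.
TOOLS: dict[str, Callable[..., str]] = {"get_weather": get_weather}


def run_tool_calls(tool_calls) -> list[tuple[str, str]]:
    """Run each requested tool and return (tool name, result) pairs."""
    results: list[tuple[str, str]] = []
    for call in tool_calls or []:
        name = call.function.name
        handler = TOOLS.get(name)
        if handler is None:
            results.append((name, "error: unknown tool"))
            continue
        # Assumption: arguments arrive as a JSON object encoded in a string.
        kwargs = json.loads(call.function.arguments)
        results.append((name, handler(**kwargs)))
    return results


if __name__ == "__main__":
    demo = SimpleNamespace(
        function=SimpleNamespace(name="get_weather", arguments='{"location": "Berlin"}')
    )
    print(run_tool_calls([demo]))
```

In a full loop, each result would be sent back to the model as a tool message so it can produce the final answer.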

Features

Supported Providers (143)

Route to any provider using the provider/model prefix convention:

| Provider | Example Model |
| --- | --- |
| OpenAI | openai/gpt-4o, openai/gpt-4o-mini |
| Anthropic | anthropic/claude-3-5-sonnet-20241022 |
| Groq | groq/llama-3.1-70b-versatile |
| Mistral | mistral/mistral-large-latest |
| Cohere | cohere/command-r-plus |
| Together AI | together/meta-llama/Meta-Llama-3.1-70B-Instruct-Turbo |
| Fireworks | fireworks/accounts/fireworks/models/llama-v3p1-70b-instruct |
| Google Vertex | vertexai/gemini-1.5-pro |
| Amazon Bedrock | bedrock/anthropic.claude-3-5-sonnet-20241022-v2:0 |

Complete Provider List

Key Capabilities

  • Provider Routing -- Single client for 143 LLM providers via provider/model prefix
  • Local LLMs -- Connect to locally hosted models via Ollama, LM Studio, vLLM, llama.cpp, and other local inference servers
  • Unified API -- Consistent chat, chat_stream, embeddings, list_models interface
  • Streaming -- Real-time token streaming via chat_stream
  • Tool Calling -- Function calling and tool use across all providers that support it
  • Type Safe -- Schema-driven types compiled from JSON schemas
  • Secure -- API keys never logged or serialized, managed via environment variables
  • Observability -- Built-in OpenTelemetry with GenAI semantic conventions
  • Error Handling -- Structured errors with provider context and retry hints
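
Because every provider sits behind the same interface, reacting to an error can be as simple as trying the next provider/model identifier. The sketch below shows that idea at the application level; it does not use or describe the library's own retry or fallback configuration. `first_successful` and the `call` parameter are hypothetical names: `call` is any coroutine function that takes a model identifier and returns the response text.

```python
import asyncio
from collections.abc import Awaitable, Callable


async def first_successful(
    models: list[str],
    call: Callable[[str], Awaitable[str]],
) -> tuple[str, str]:
    """Try each model in order and return (model, result) for the first success."""
    errors: list[str] = []
    for model in models:
        try:
            return model, await call(model)
        except Exception as exc:
            # Broad catch for illustration; real code should catch the
            # library's specific error types once they are known.
            errors.append(f"{model}: {exc}")
    raise RuntimeError("all models failed: " + "; ".join(errors))
```

A caller would pass something like `["openai/gpt-4o", "anthropic/claude-3-5-sonnet-20241022"]` together with a small coroutine that builds the request for the given model and awaits `client.chat`.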

Performance

Built on a compiled Rust core for speed and safety:

  • Provider resolution at client construction -- no provider lookup is repeated per request
  • Configurable timeouts and connection pooling
  • Zero-copy streaming with SSE and AWS EventStream support
  • API keys wrapped in secure memory, zeroed on drop

Provider Routing

Route to 143 providers using the provider/model prefix convention:

openai/gpt-4o
anthropic/claude-3-5-sonnet-20241022
groq/llama-3.1-70b-versatile
mistral/mistral-large-latest

See the provider registry for the full list.
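
Some model names contain slashes themselves (for example the Together AI and Fireworks entries), so the prefix is most naturally read as everything before the first slash. The function below illustrates that reading of the convention; it is an assumption for explanatory purposes, not the library's actual parser, and `split_model_id` is a hypothetical name.

```python
def split_model_id(model_id: str) -> tuple[str, str]:
    """Split 'provider/model' on the first slash only."""
    provider, sep, model = model_id.partition("/")
    if not sep or not provider or not model:
        raise ValueError(f"expected 'provider/model', got {model_id!r}")
    return provider, model


if __name__ == "__main__":
    print(split_model_id("openai/gpt-4o"))
    print(split_model_id("together/meta-llama/Meta-Llama-3.1-70B-Instruct-Turbo"))
```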

Proxy, MCP Server & Plugin

Run the OpenAI-compatible proxy or the MCP server

Beyond the SDK, the liter-llm CLI ships an OpenAI-compatible proxy and a Model Context Protocol (MCP) server:

brew install xberg-io/tap/liter-llm   # or: cargo install liter-llm-cli
liter-llm api --config liter-llm-proxy.toml   # OpenAI-compatible proxy
liter-llm mcp --transport stdio               # MCP tool server

# or run the proxy without installing:
docker run -p 4000:4000 -e LITER_LLM_MASTER_KEY=sk-your-key ghcr.io/xberg-io/liter-llm

To use the MCP server inside a coding agent, install the liter-llm plugin from the xberg-io/plugins marketplace — it auto-registers the server. See the MCP server and proxy server guides for configuration, CLI usage, and agent integration.
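
Once the proxy is running, any OpenAI-style HTTP client should be able to talk to it. The sketch below builds such a request with the standard library. It rests on assumptions not confirmed by this README: that the proxy serves the conventional `/v1/chat/completions` path and that the master key is sent as a Bearer token. The port 4000 matches the `docker run` command above; `build_proxy_request` is a hypothetical helper.

```python
import json
import urllib.request


def build_proxy_request(
    base_url: str, master_key: str, model: str, prompt: str
) -> urllib.request.Request:
    """Build a POST request for an OpenAI-style chat completions endpoint."""
    body = json.dumps(
        {"model": model, "messages": [{"role": "user", "content": prompt}]}
    ).encode("utf-8")
    return urllib.request.Request(
        # Assumption: the proxy follows the OpenAI path convention.
        url=base_url.rstrip("/") + "/v1/chat/completions",
        data=body,
        headers={
            # Assumption: the master key is accepted as a Bearer token.
            "Authorization": f"Bearer {master_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


# Sending the request requires a running proxy:
#   req = build_proxy_request("http://localhost:4000", "sk-your-key", "openai/gpt-4o", "Hello!")
#   with urllib.request.urlopen(req) as resp:
#       print(json.load(resp))
```

Check the proxy server guide for the actual routes and authentication scheme before relying on these defaults.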

Part of Xberg.io

  • Xberg — document intelligence: text, tables, metadata from 91+ formats with optional OCR.
  • Xberg Enterprise — managed extraction API with SDKs, dashboards, and observability.
  • crawlberg — web crawling and scraping with HTML→Markdown and headless-Chrome fallback.
  • html-to-markdown — fast, lossless HTML→Markdown engine.
  • liter-llm — universal LLM API client with native bindings for 14 languages and 143 providers.
  • tree-sitter-language-pack — tree-sitter grammars and code-intelligence primitives.
  • alef — the polyglot binding generator that produces every per-language binding across the 5 polyglot repos.
  • Discord — community, roadmap, announcements.

Contributing

Contributions are welcome! See CONTRIBUTING.md for guidelines.

Join our Discord community for questions and discussion.

License

MIT -- see LICENSE for details.