LLM4Free Architecture Overview

June 18, 2026 ยท View on GitHub

Last updated: 2025-12-20
Relates to: llm4free/cli.py, llm4free/client.py, llm4free/server/, llm4free/search/, llm4free/Provider/, llm4free/Extra/

LLM4Free bundles multiple user-facing entry points (CLI, Python client, and an OpenAI-compatible API server) on top of a shared set of engines, providers, and utilities. This document maps how these layers interact so you can reason about changes confidently.

๐Ÿงฑ Layered View

flowchart TD
    subgraph EntryPoints
        CLI[CLI (llm4free/cli.py)]
        Client[Python Client (llm4free/client.py)]
        Server[OpenAI-Compatible Server (llm4free/server)]
    end

    subgraph Core
        Search[Search Engines (llm4free/search)]
        Providers[Providers (llm4free/Provider)]
        Extras[Extras & Toolkits (llm4free/Extra)]
        Utilities[Utilities (sanitize.py, AIutel.py, etc.)]
        Models[Model Registry (llm4free/models.py)]
    end

    CLI --> Search
    CLI --> Providers
    Client --> Providers
    Client --> Extras
    Client --> Models
    Server --> Providers
    Server --> Utilities
    Search --> Providers
    Extras --> Providers
  • Entry Points convert user intent (commands/API calls) into provider requests.
  • Core Modules encapsulate the heavy lifting: crawling websites, calling remote LLMs, handling audio/image generation, sanitizing streams, and enumerating models.

๐Ÿ”Œ Entry Points

Command Line Interface (llm4free/cli.py)

  • Built on swiftcli with separate command groups for DuckDuckGo, Yep, Bing, Yahoo, and weather utilities.
  • Uses _print_data / _print_weather helpers to keep terminal output consistent.
  • Relies on the same search/provider classes exported in llm4free/__init__.py, so CLI behavior matches the Python API.

Unified Python Client (llm4free/client.py)

  • Provides auto-failover chat and image APIs through Client.chat.completions.create() and Client.images.generate().
  • Dynamically discovers OpenAI-compatible providers (llm4free/Provider/OPENAI) and TTI providers, caches instances, and performs fuzzy model resolution.
  • Shares provider cache with the server, so runtime cost of imports stays low.

OpenAI-Compatible Server (llm4free/server/)

  • FastAPI app that exposes /v1/* routes mirroring OpenAI's schema.
  • Uses providers.py to map model names like ProviderName/model-id back to actual provider classes.
  • Pulls configuration from config.py plus environment variables documented in docs/openai-api-server.md and docs/DOCKER.md.

๐Ÿ” Core Modules

Search Stack (llm4free/search/)

  • Houses protocol-specific engines (See llm4free/search/engines/*) plus shared HTTP client and result serializers.
  • DuckDuckGo/Yep/Bing/Yahoo commands import from here, so adding new CLI options usually starts with an engine update.

Providers (llm4free/Provider/)

  • Normal providers live alongside OpenAI-compatible wrappers (llm4free/Provider/OPENAI).
  • Specialty directories: AISEARCH, TTI, TTS, STT, UNFINISHED.
  • The matrix in Provider.md maps every provider to its implementation file.

Extras (llm4free/Extra/)

  • Optional toolkits packaged with LLM4Free (GGUF converter, weather clients, temp mail, YT toolkit, Git API helper, etc.).
  • Exported through llm4free/Extra/__init__.py so they become part of the public API when you import llm4free.

Utilities

  • llm4free/sanitize.py โ€“ SSE/stream sanitization for server + client streaming paths.
  • llm4free/AIutel.py โ€“ Decorators for retry/timing (documented in docs/decorators.md).
  • llm4free/update_checker.py โ€“ Optional PyPI update notifier executed in llm4free/__init__.py.

Models Registry (llm4free/models.py)

  • Enumerates LLM, TTS, and TTI models exposed by providers.
  • Used by documentation examples (README, docs/models.md) and can power custom tooling (e.g., provider dashboards).

๐Ÿ”„ Typical Data Flows

  1. CLI โžœ Search Engine โžœ Provider
    • llm4free images -k "python" โ†’ DuckDuckGoSearch.images() (HTTP scraping) โ†’ results printed via _print_data.
  2. Client โžœ Provider Failover
    • Client().chat.completions.create(model="gpt-4o") โ†’ resolves provider & model โ†’ tries preferred provider โ†’ falls back through fuzzily-matched providers if necessary.
  3. Server โžœ Provider โžœ sanitize_stream
    • /v1/chat/completions request hits FastAPI โ†’ provider resolved โ†’ streaming responses run through sanitize_stream() before being sent to clients.
  4. Extras โžœ Providers
    • GGUF converter uses huggingface + llama.cpp builders and is fully independent, but still exported to users alongside the main modules.

๐Ÿงฉ When Adding New Functionality

TaskTouch Points
Add a CLI commandllm4free/cli.py + corresponding engine/provider + update docs/cli.md
Add a providerImplement in llm4free/Provider/ (and optionally OPENAI/), update Provider.md, consider models.py exposure
Add server capabilityUpdate llm4free/server/*, document in docs/openai-api-server.md, ensure CLI/Client can hit the new route if needed
Extend ExtrasImplement under llm4free/Extra/, export in __init__.py, add documentation entry under docs/README.md
Add new registry infoUpdate llm4free/models.py or referencing docs (docs/models.md)

๐Ÿงช Testing & Debugging Hooks

  • CLI commands can be run locally with uv run llm4free ... to ensure option parsing remains correct.
  • Client failover prints last provider when print_provider_info=True โ€“ useful when debugging provider availability.
  • The server exposes /health (see Docker docs) to monitor deployments.
  • sanitize_stream and decorators have dedicated docs you can reference when debugging streaming issues or retries.