LLM4Free Architecture Overview
June 18, 2026 ยท View on GitHub
Last updated: 2025-12-20
Relates to:llm4free/cli.py,llm4free/client.py,llm4free/server/,llm4free/search/,llm4free/Provider/,llm4free/Extra/
LLM4Free bundles multiple user-facing entry points (CLI, Python client, and an OpenAI-compatible API server) on top of a shared set of engines, providers, and utilities. This document maps how these layers interact so you can reason about changes confidently.
๐งฑ Layered View
flowchart TD
subgraph EntryPoints
CLI[CLI (llm4free/cli.py)]
Client[Python Client (llm4free/client.py)]
Server[OpenAI-Compatible Server (llm4free/server)]
end
subgraph Core
Search[Search Engines (llm4free/search)]
Providers[Providers (llm4free/Provider)]
Extras[Extras & Toolkits (llm4free/Extra)]
Utilities[Utilities (sanitize.py, AIutel.py, etc.)]
Models[Model Registry (llm4free/models.py)]
end
CLI --> Search
CLI --> Providers
Client --> Providers
Client --> Extras
Client --> Models
Server --> Providers
Server --> Utilities
Search --> Providers
Extras --> Providers
- Entry Points convert user intent (commands/API calls) into provider requests.
- Core Modules encapsulate the heavy lifting: crawling websites, calling remote LLMs, handling audio/image generation, sanitizing streams, and enumerating models.
๐ Entry Points
Command Line Interface (llm4free/cli.py)
- Built on
swiftcliwith separate command groups for DuckDuckGo, Yep, Bing, Yahoo, and weather utilities. - Uses
_print_data/_print_weatherhelpers to keep terminal output consistent. - Relies on the same search/provider classes exported in
llm4free/__init__.py, so CLI behavior matches the Python API.
Unified Python Client (llm4free/client.py)
- Provides auto-failover chat and image APIs through
Client.chat.completions.create()andClient.images.generate(). - Dynamically discovers OpenAI-compatible providers (
llm4free/Provider/OPENAI) and TTI providers, caches instances, and performs fuzzy model resolution. - Shares provider cache with the server, so runtime cost of imports stays low.
OpenAI-Compatible Server (llm4free/server/)
- FastAPI app that exposes
/v1/*routes mirroring OpenAI's schema. - Uses
providers.pyto map model names likeProviderName/model-idback to actual provider classes. - Pulls configuration from
config.pyplus environment variables documented indocs/openai-api-server.mdanddocs/DOCKER.md.
๐ Core Modules
Search Stack (llm4free/search/)
- Houses protocol-specific engines (See
llm4free/search/engines/*) plus shared HTTP client and result serializers. - DuckDuckGo/Yep/Bing/Yahoo commands import from here, so adding new CLI options usually starts with an engine update.
Providers (llm4free/Provider/)
- Normal providers live alongside OpenAI-compatible wrappers (
llm4free/Provider/OPENAI). - Specialty directories:
AISEARCH,TTI,TTS,STT,UNFINISHED. - The matrix in
Provider.mdmaps every provider to its implementation file.
Extras (llm4free/Extra/)
- Optional toolkits packaged with LLM4Free (GGUF converter, weather clients, temp mail, YT toolkit, Git API helper, etc.).
- Exported through
llm4free/Extra/__init__.pyso they become part of the public API when youimport llm4free.
Utilities
llm4free/sanitize.pyโ SSE/stream sanitization for server + client streaming paths.llm4free/AIutel.pyโ Decorators for retry/timing (documented indocs/decorators.md).llm4free/update_checker.pyโ Optional PyPI update notifier executed inllm4free/__init__.py.
Models Registry (llm4free/models.py)
- Enumerates LLM, TTS, and TTI models exposed by providers.
- Used by documentation examples (README, docs/models.md) and can power custom tooling (e.g., provider dashboards).
๐ Typical Data Flows
- CLI โ Search Engine โ Provider
llm4free images -k "python"โDuckDuckGoSearch.images()(HTTP scraping) โ results printed via_print_data.
- Client โ Provider Failover
Client().chat.completions.create(model="gpt-4o")โ resolves provider & model โ tries preferred provider โ falls back through fuzzily-matched providers if necessary.
- Server โ Provider โ sanitize_stream
/v1/chat/completionsrequest hits FastAPI โ provider resolved โ streaming responses run throughsanitize_stream()before being sent to clients.
- Extras โ Providers
- GGUF converter uses huggingface + llama.cpp builders and is fully independent, but still exported to users alongside the main modules.
๐งฉ When Adding New Functionality
| Task | Touch Points |
|---|---|
| Add a CLI command | llm4free/cli.py + corresponding engine/provider + update docs/cli.md |
| Add a provider | Implement in llm4free/Provider/ (and optionally OPENAI/), update Provider.md, consider models.py exposure |
| Add server capability | Update llm4free/server/*, document in docs/openai-api-server.md, ensure CLI/Client can hit the new route if needed |
| Extend Extras | Implement under llm4free/Extra/, export in __init__.py, add documentation entry under docs/README.md |
| Add new registry info | Update llm4free/models.py or referencing docs (docs/models.md) |
๐งช Testing & Debugging Hooks
- CLI commands can be run locally with
uv run llm4free ...to ensure option parsing remains correct. - Client failover prints last provider when
print_provider_info=Trueโ useful when debugging provider availability. - The server exposes
/health(see Docker docs) to monitor deployments. sanitize_streamand decorators have dedicated docs you can reference when debugging streaming issues or retries.
๐ Related Documents
- docs/cli.md โ exhaustive CLI reference.
- docs/client.md โ deep dive into the unified client.
- docs/models.md โ using the model registry helpers.
- docs/openai-api-server.md โ server configuration & endpoints.
- Provider.md โ provider matrix you can cross-reference while navigating the codebase.