AI Runner

June 6, 2026

Edge AI inference engine with a web GUI: LLMs, image generation, voice chat, and agents running entirely on your hardware, at the edge.

License: MIT · Python 3.13+

๐Ÿž Report Bug ยท โœจ Request Feature ยท ๐Ÿ›ก๏ธ Report Vulnerability ยท ๐Ÿ“– Wiki


What is AI Runner?

AI Runner is a privacy-first edge AI platform: all inference runs locally on your own hardware, not in the cloud. A Python backend handles model inference and exposes a REST API, and a React web frontend gives you the interface in your browser. By default, your prompts, images, and voice data never leave your machine.

Architecture at a glance:

client/                ← React + Vite frontend (port 5173)
server/src/            ← Python inference backend (port 8080)

✨ Features

| Feature | Description |
| --- | --- |
| 🤖 LLM Chat | Local LLMs via llama.cpp (GGUF), with optional OpenRouter/OpenAI backends |
| 🗣️ Voice Chat | Real-time speech-to-text and text-to-speech for hands-free conversations |
| 🎨 Image Generation | Stable Diffusion (SD 1.5, SDXL) and FLUX with LoRA and inpainting |
| 🧠 AI Agents | Configurable personalities, moods, RAG-enhanced memory, and tool use |
| 🔒 Privacy First | Runs fully offline by default; no data leaves your machine |
| 🌐 Web UI | React frontend, accessible from any browser on your local network |
| ⚡ Optimized | GGUF quantization, attention slicing, and VRAM offloading for lower-end hardware |

โš™๏ธ System Requirements

MinimumRecommended
OSUbuntu 22.04Ubuntu 24.04
CPURyzen 2700K / i7-8700KRyzen 5800X / i7-11700K
RAM16 GB32 GB
GPUNVIDIA RTX 3060NVIDIA RTX 4080+
Storage22 GB SSD100 GB+ SSD
Python3.13.3+3.13.3+
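
Before installing, it can help to confirm that the machine meets the Python and storage minimums listed above. The following is a minimal sketch, not part of AI Runner itself; the function names `python_ok` and `disk_ok` are hypothetical, and the thresholds are simply copied from the requirements table.

```python
import shutil
import sys

MIN_PYTHON = (3, 13, 3)  # minimum Python version from the requirements table
MIN_FREE_GB = 22         # minimum storage from the requirements table


def python_ok(version=None, minimum=MIN_PYTHON):
    """Return True if the given (or running) Python version meets the minimum."""
    if version is None:
        version = sys.version_info[:3]
    return tuple(version) >= tuple(minimum)


def disk_ok(free_bytes, minimum_gb=MIN_FREE_GB):
    """Return True if free_bytes covers the minimum storage requirement."""
    return free_bytes >= minimum_gb * 1024**3


if __name__ == "__main__":
    # Check the interpreter running this script and the current directory's disk.
    free = shutil.disk_usage(".").free
    print("Python version OK:", python_ok())
    print("Free disk space OK:", disk_ok(free))
```

Run it with the interpreter you plan to use for the backend, from the directory where you intend to clone the repository.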

🚀 Quick Start

Install

Clone the repo and run the install script:

git clone https://github.com/Capsize-Games/airunner.git
cd airunner
./scripts/install.sh

This installs the Python backend and all frontend dependencies.

Run

./scripts/run_web.sh

Then open your browser at http://localhost:5173.

The backend API is available at http://localhost:8080.

Logs

All server and runtime (art, TTS, STT, LLM) logs go to a single file:

tail -f build/logs/server.log

📦 End-User Bundle (Desktop Application)

For non-developer users, AI Runner provides a self-contained desktop application via Electron. The bundle includes an embedded Python runtime, all Python dependencies, CUDA-accelerated llama.cpp and whisper.cpp binaries, and the compiled React frontend, all in a single installable package.

No Python, Node.js, CMake, C++ compiler, or CUDA toolkit is required. Only an NVIDIA GPU driver (525+) is needed.

Platforms

| Platform | Installer Format | GPU |
| --- | --- | --- |
| Linux | .AppImage, .deb | NVIDIA (CUDA, Ampere+) |
| Windows | .exe (NSIS) | NVIDIA (CUDA, Ampere+) |

Download

Pre-built installers are attached to each GitHub Release tagged with a v* version. Look for artifacts named:

  • airunner-bundle-linux-*.AppImage or airunner-bundle-linux-*.deb
  • airunner-bundle-win32-*.exe

How it works

Electron app
├── main process (Node.js)
│   ├── Spawns the embedded Python backend as a child process
│   ├── Polls GET /health until the backend is ready
│   ├── Loads the React frontend once the backend is healthy
│   └── Kills the backend on app quit
└── renderer process
    └── Loads http://localhost:8080 (served by the Python backend)

electron/resources/
├── python/     ← embedded CPython 3.13 + all pip dependencies + CUDA native libs
└── web/        ← compiled React frontend (client/dist/)
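
The "poll GET /health until ready" step can be illustrated with a short script, which is also handy for checking a running backend by hand. This is a sketch in Python rather than the Electron main process's actual Node.js code. It assumes that an HTTP 200 from /health means "ready" (the response format is not documented here), and the names `wait_for_backend` and `_http_probe` are hypothetical.

```python
import http.client
import time
import urllib.error
import urllib.request


def _http_probe(url):
    """Return True if url answers with HTTP 200 (assumed to mean 'ready')."""
    try:
        with urllib.request.urlopen(url, timeout=2) as resp:
            return resp.status == 200
    except (urllib.error.URLError, http.client.HTTPException, OSError):
        # Connection refused, timeout, or non-2xx status: not ready yet.
        return False


def wait_for_backend(url="http://localhost:8080/health",
                     timeout=60.0, interval=0.5, probe=None):
    """Poll url until the probe succeeds or timeout seconds elapse."""
    if probe is None:
        probe = _http_probe
    deadline = time.monotonic() + timeout
    while True:
        if probe(url):
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)


if __name__ == "__main__":
    print("backend ready:", wait_for_backend())
```

The `probe` parameter exists only so the loop can be exercised without a live server; in normal use the default HTTP probe is enough.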

Building from source (for maintainers)

# Linux
./package/build_bundle.sh

# Windows (PowerShell)
.\package\build_bundle.ps1

Prerequisites on the build host: CUDA toolkit 12.x, CMake ≥ 3.24, Node.js ≥ 20, and a C++ compiler. These are not required on the end user's machine.


💾 Manual Installation (Advanced)

If you need fine-grained control, three installation modes are available, each with its own script:

# Developer mode: installs from source (default for contributors)
./scripts/install.sh

# Distributed mode: for server/multi-machine deployments
./deployment/install_distributed.sh

# Single-package mode: builds the self-contained bundle described above
./package/build_bundle.sh

Python dependencies

Python 3.13.3+ is required. We recommend pyenv + venv.

Install PyTorch first:

pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu128

Then install the backend package:

pip install -e "server/src/.[core,llm-native,stt-native,art-python,tts-python]"

llama-cpp-python (CUDA build)

CMAKE_ARGS="-DGGML_CUDA=on -DGGML_CUDA_ARCHITECTURES=90" FORCE_CMAKE=1 \
  pip install --no-binary=:all: --no-cache-dir "llama-cpp-python==0.3.21"

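The value 90 selects CUDA compute capability 9.0. Check your own GPU's compute capability before building (RTX 40-series cards, for example, report 8.9), or drop -DGGML_CUDA_ARCHITECTURES to auto-detect your GPU.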
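
To find the architecture number for your own card, you can ask PyTorch (installed in the step above) for the device's compute capability. The sketch below uses `torch.cuda.get_device_capability`, which returns a `(major, minor)` tuple; the helper name `cuda_arch_value` is hypothetical and only joins the two numbers into the digit form used by the architectures flag.

```python
def cuda_arch_value(capability):
    """Turn a (major, minor) compute capability into digits, e.g. (8, 6) -> '86'."""
    major, minor = capability
    return f"{major}{minor}"


if __name__ == "__main__":
    import torch  # imported here so the helper above works without PyTorch

    if torch.cuda.is_available():
        cap = torch.cuda.get_device_capability(0)  # first visible GPU
        print(torch.cuda.get_device_name(0), "->", cuda_arch_value(cap))
    else:
        print("No CUDA device visible to PyTorch")
```

If the printed value differs from the one in the build command, substitute it or omit the flag entirely.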


🤖 Models

Essential TTS/STT models download automatically on first run. LLM and image models must be configured manually.

| Category | Model | Size |
| --- | --- | --- |
| LLM (default) | Ministral-8B-Instruct (GGUF) | ~4 GB |
| Image | Stable Diffusion 1.5 | ~2 GB |
| Image | SDXL 1.0 | ~6 GB |
| Image | FLUX.1 Dev/Schnell (GGUF) | 8–12 GB |
| TTS | OpenVoice | 654 MB |
| STT | Whisper Tiny | 155 MB |

Place art models in ~/.local/share/airunner/art/models/.
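
To confirm that model files landed where AI Runner looks for them, a small script can list what is in that directory. This is an illustrative sketch only: the set of file extensions is an assumption (AI Runner may accept others), and `find_models` is a hypothetical helper, not part of the project.

```python
from pathlib import Path

ART_MODEL_DIR = Path.home() / ".local/share/airunner/art/models"
# Assumed extensions for image model weights; adjust to match your files.
MODEL_SUFFIXES = {".safetensors", ".ckpt", ".gguf"}


def find_models(root=ART_MODEL_DIR, suffixes=MODEL_SUFFIXES):
    """Return a sorted list of model files found anywhere under root."""
    root = Path(root)
    if not root.is_dir():
        return []
    return sorted(
        p for p in root.rglob("*")
        if p.is_file() and p.suffix.lower() in suffixes
    )


if __name__ == "__main__":
    models = find_models()
    if not models:
        print(f"No model files found under {ART_MODEL_DIR}")
    for path in models:
        size_gb = path.stat().st_size / 1024**3
        print(f"{size_gb:6.2f} GB  {path.relative_to(ART_MODEL_DIR)}")
```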


🔒 HTTPS

The local server uses HTTPS by default. Certificates are auto-generated at ~/.local/share/airunner/certs/.

For browser-trusted certificates, install mkcert:

sudo apt install libnss3-tools
mkcert -install
airunner-generate-cert

🧪 Testing

# Run the full test suite
airunner-tests

# Run daemon-safe tests directly
pytest server/src/

# With coverage
airunner-test-coverage-report

โš–๏ธ Colorado AI Act Notice

Effective February 1, 2026, the Colorado AI Act (SB 24-205) regulates high-risk AI systems. If you use AI Runner to make decisions with legal or significant effects on individuals (employment screening, loan eligibility, housing, etc.), you may be classified as a deployer of a high-risk AI system and subject to compliance obligations.

AI Runner is designed to run fully locally with no external data transmission by default. Optional features that do connect externally: model downloads (HuggingFace/CivitAI), web search (DuckDuckGo), weather prompts (Open-Meteo), and external LLM providers (OpenRouter/OpenAI) if configured. We recommend using a VPN when using these features.


🤝 Contributing

See CONTRIBUTING.md and the Development Wiki.

📚 Documentation


License

MIT License. See LICENSE for details.
