KaiROS AI
March 9, 2026 · View on GitHub
A powerful local AI assistant for Windows & Android
Run LLMs locally on your device • No cloud required • Privacy-first
📥 Download
- Download Latest Release - Windows Installer & Android APK
- Microsoft Store - Get it now
- Play Store - 🚀 Coming Soon!
- No .NET installation required
- Supports Windows 10/11 (x64) & Android 7.0+
📊 Feature Comparison
| Feature | Windows Desktop | Android Mobile |
|---|---|---|
| Local LLM Inference | ✅ | ✅ |
| Model Manager | ✅ | ✅ |
| Chat Interface | ✅ | ✅ |
| Chat History | ✅ | ✅ |
| System Prompt Editing | ✅ | ✅ |
| Custom Model Import | ✅ | ✅ |
| Markdown Rendering | ✅ | ✅ |
| Vision Models Support | ✅ | ✅ |
| RAG (Document Chat) | ✅ | ❌ |
| Local REST API | ✅ | ❌ |
| System Tray Support | ✅ | ❌ |
| DirectML & Vulkan | ✅ | ❌ |
🖥️ Desktop Version (Windows)
The Desktop version is the full-featured powerhouse, designed for productivity and integration.
Key Features
- RAG (Retrieval Augmented Generation): Chat with your PDF, DOCX, and TXT files locally with a redesigned management interface.
- Local REST API Server: Integrate your local models with VS Code (Continue), LM Studio, or your own apps.
- System Tray Integration: Keep your AI assistant running in the background.
- Advanced GPU Support: Full support for CUDA, DirectML, and Vulkan backends.
- Modern User Experience: Rebuilt with WinUI 3, with proper dark/light themes, flexible hotkeys (e.g. Shift+Enter vs. Enter to send), and quick links to the Microsoft Store and GitHub.
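RAG pipelines like the one above typically split documents into overlapping chunks before indexing them for retrieval. A minimal sketch of that chunking step (an illustration only, not KaiROS AI's actual implementation; the function name and parameters are hypothetical):

```python
def chunk_text(text: str, chunk_size: int = 500, overlap: int = 100) -> list[str]:
    """Split text into overlapping chunks for retrieval indexing.

    Overlap keeps sentences that straddle a chunk boundary retrievable
    from both neighboring chunks.
    """
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    chunks = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        start += chunk_size - overlap
    return chunks
```

A real pipeline would then embed each chunk and store the vectors for similarity search at query time.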
Desktop Screenshots
| Model Catalog | Chat Interface |
|---|---|
| ![]() | ![]() |

| RAG (Document Chat) | Settings |
|---|---|
| ![]() | ![]() |
📱 Mobile Version (Android)
The Mobile version brings the power of local AI to your pocket, optimized for touch and on-the-go usage.
Key Features
- Offline Capable: Run LLMs anywhere, even without an internet connection (after model download).
- Battery Efficient: Optimized for mobile processors.
- Clean UI: A simplified interface focused on chat and quick interactions.
- Chat History: Save and resume your conversations anytime.
Mobile Screenshots
| Chat Interface | Model Selection |
|---|---|
| ![]() | ![]() |

| Chat History | System Prompt |
|---|---|
| ![]() | ![]() |

| Settings |
|---|
| ![]() |
✨ Shared Features
Core Capabilities
- 🤖 Run LLMs Locally - No internet required after model download
- 🖼️ Vision Models - Support for multimodal models to chat about images
- 📦 Model Catalog - 31 pre-configured models from 9 organizations
- ⬇️ Download Manager - Pause, resume, and manage model downloads
- 💬 Streaming Responses - Real-time text generation
- 📊 Performance Stats - Real-time tokens/sec and memory usage
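The tokens/sec figure in the performance stats is simply the number of generated tokens divided by elapsed generation time; a trivial sketch (illustrative only, not the app's code):

```python
def tokens_per_second(token_count: int, elapsed_seconds: float) -> float:
    """Compute generation throughput as shown in the performance stats."""
    if elapsed_seconds <= 0:
        raise ValueError("elapsed time must be positive")
    return token_count / elapsed_seconds

# e.g. 120 tokens generated in 4 seconds -> 30.0 tok/s
```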
Model Catalog
- 🏢 Organization Sections - Collapsible groups for Qwen, Google, Meta, Microsoft, and more
- 🔍 Advanced Filtering - Filter by Organization, Family, Variant (CPU-Only, GPU-Recommended)
- 🏷️ Visual Badges - Category, family, variant, and download status indicators
- ➕ Custom Models - Add your own GGUF models from local files or URLs
Advanced
- 🎨 Modern Dark Theme - Beautiful gradient-based UI design
- 💬 Feedback Hub - Send feedback directly from Settings
🔌 Local REST API (Desktop Only)
Build AI-powered applications without cloud dependencies!
KaiROS AI includes a fully local REST API server - perfect for developers who want to integrate local LLMs into their applications.
Quick Start
```bash
# Check status
curl http://localhost:5000/health

# Chat (non-streaming)
curl -X POST http://localhost:5000/chat \
  -H "Content-Type: application/json" \
  -d '{"messages":[{"role":"user","content":"Hello!"}]}'
```

Enable the API server in Settings → API Server.
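The same request can be issued from any language; a Python sketch that builds the request shown in the curl example above (the port and JSON shape follow the Quick Start; only run the commented send step with the API server enabled):

```python
import json
import urllib.request

API_BASE = "http://localhost:5000"  # default port from the Quick Start

def build_chat_request(user_message: str) -> urllib.request.Request:
    """Build a non-streaming POST /chat request matching the curl example."""
    body = json.dumps({"messages": [{"role": "user", "content": user_message}]})
    return urllib.request.Request(
        f"{API_BASE}/chat",
        data=body.encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )

# To actually send it (requires the server to be running):
# with urllib.request.urlopen(build_chat_request("Hello!")) as resp:
#     print(resp.read().decode())
```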
🚀 Getting Started
Prerequisites
- Windows 10/11 (x64)
- Android 7.0+ (API 24+)
- .NET 9 SDK - Download
- CUDA Toolkit 12 (optional, for GPU acceleration) - Download
Installation
1. Clone the repository

   ```bash
   git clone https://github.com/yourusername/KaiROS.AI.git
   cd KaiROS.AI
   ```

2. Restore packages and build

   ```bash
   dotnet restore
   dotnet build --configuration Release
   ```

3. Run the application

   ```bash
   dotnet run --project KaiROS.AI
   ```
📦 Model Catalog Overview
Supported Organizations
| Organization | Highlights |
|---|---|
| Qwen | Qwen 2.5/3 series (0.5B - 14B) - Excellent multilingual |
| Google | Gemma 2/3 models (270M - 27B) - High quality |
| Meta | LLaMA 3.1/3.2 + TinyLlama |
| Microsoft | Phi-2, Phi-3, BitNet b1.58 |
| MistralAI | Mistral 7B, Mistral Small 24B |
| Open Source | GPT-oss 20B ⚠️ Experimental |
Recommended Models ⭐
- Phi-3 Mini 3.8B - Best for general conversations (4 GB RAM)
- Qwen 2.5 3B - Excellent multilingual and coding (4 GB RAM)
- Mistral 7B - Complex reasoning tasks (8 GB RAM)
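The RAM guidance above can be turned into a quick compatibility check; a sketch using only the names and figures from this list (not an official API):

```python
# Model names and RAM figures taken from the Recommended Models list above
RECOMMENDED = [
    ("Phi-3 Mini 3.8B", 4),
    ("Qwen 2.5 3B", 4),
    ("Mistral 7B", 8),
]

def models_for_ram(available_gb: int) -> list[str]:
    """Return the recommended models that fit in the given amount of RAM."""
    return [name for name, needed_gb in RECOMMENDED if needed_gb <= available_gb]
```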
🛠️ Tech Stack
- Framework: .NET 9 + WinUI 3 / Windows App SDK (Windows) / MAUI (Android)
- LLM Runtime: LLamaSharp 0.25.0
- MVVM: CommunityToolkit.Mvvm 8.4.0
- GPU Support: CUDA 12, DirectML, Vulkan
- Model Format: GGUF (llama.cpp compatible)
- Database: SQLite (for custom models)
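GGUF, the model format listed above, begins with a 4-byte magic (`GGUF`) followed by a little-endian uint32 version field, per the llama.cpp GGUF specification. A minimal validity check one could run before importing a custom model (an illustration, not part of KaiROS AI's code):

```python
import struct

GGUF_MAGIC = b"GGUF"  # per the llama.cpp GGUF specification

def read_gguf_header(data: bytes) -> int:
    """Validate the GGUF magic bytes and return the format version."""
    if data[:4] != GGUF_MAGIC:
        raise ValueError("not a GGUF file")
    (version,) = struct.unpack_from("<I", data, 4)  # little-endian uint32
    return version
```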
📁 Project Structure
KaiROS.AI/
├── Assets/          # App icons and images
├── Converters/      # XAML value converters
├── Models/          # Data models
├── Services/        # Business logic
├── Themes/          # UI styling
├── ViewModels/      # MVVM ViewModels
├── Views/           # XAML views
└── appsettings.json # Model catalog config
🤝 Contributing & License
Contributions are welcome! Please feel free to submit a Pull Request. This project is licensed under the MIT License - see the LICENSE file for details.
🙏 Acknowledgments
- LLamaSharp - Excellent .NET bindings for llama.cpp - This project wouldn't be possible without LLamaSharp!
- llama.cpp - High-performance LLM inference in C/C++
- Hugging Face - Model hosting and community
Made with ❤️ for local AI enthusiasts