🚀 Velesio AI Server

January 22, 2026

High-performance, microservice-based AI inference server with Unity integration support.

Deploy on RunPod

✨ Features

  • 🎯 Unity Ready: Seamless integration with Unity!
  • 📈 Scalable: Redis queue-based worker architecture
  • ๐Ÿณ Easy Deploy: Docker Compose setup for inference setup, api wrapper, nginx & monitoring
  • 📊 Monitoring: Grafana template for system, GPU, and application observability

⚡ Quick Start

🎮 Unity Integration

Built specifically for Unity developers:

📚 Documentation

📖 Complete Documentation - Full guides, API reference, and examples

🏗️ Architecture

Distributed microservice design for maximum flexibility:

┌─────────────┐    ┌─────────┐    ┌─────────────┐
│    API      │────│  Redis  │────│ GPU Workers │
│  (FastAPI)  │    │ Queue   │    │ (LLM + SD)  │
└─────────────┘    └─────────┘    └─────────────┘
       │                                  │
       │           ┌─────────────┐        │
       └───────────│ Monitoring  │────────┘
                   │(Grafana+Prom)│
                   └─────────────┘

  • API Service: FastAPI with token auth and job queuing
  • GPU Workers: Custom llama.cpp + Stable Diffusion inference engines
  • Redis Queue: Decoupled job processing for scalability
  • Monitoring: Pre-configured Grafana dashboards
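
The decoupling described in the list above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code. It uses Python's standard-library `queue.Queue` as an in-memory stand-in for the Redis queue so that it runs without any services. `API_TOKEN`, `submit_job`, `run_inference`, and `worker_step` are all hypothetical names chosen for the example.

```python
import hmac
import json
import queue
import uuid

# Hypothetical shared secret; the real server's token configuration may differ.
API_TOKEN = "change-me"

# In-memory stand-in for the Redis queue (assumption: a real deployment
# would push and pop jobs through Redis instead of this object).
job_queue: "queue.Queue[str]" = queue.Queue()
results: dict[str, str] = {}


def submit_job(token: str, prompt: str) -> str:
    """API side: check the token, enqueue the job, return its id."""
    # Constant-time comparison avoids leaking token contents via timing.
    if not hmac.compare_digest(token.encode(), API_TOKEN.encode()):
        raise PermissionError("invalid token")
    job_id = uuid.uuid4().hex
    job_queue.put(json.dumps({"id": job_id, "prompt": prompt}))
    return job_id


def run_inference(prompt: str) -> str:
    """Placeholder for the GPU-backed model call."""
    return prompt.upper()


def worker_step(timeout: float = 0.1) -> bool:
    """Worker side: take one job off the queue and store its result.

    Returns False if no job arrived within `timeout` seconds.
    """
    try:
        raw = job_queue.get(timeout=timeout)
    except queue.Empty:
        return False
    job = json.loads(raw)
    results[job["id"]] = run_inference(job["prompt"])
    return True
```

Because the API only writes to the queue and each worker only reads from it, more workers can be added without changing the API side; in this sketch that would simply mean calling `worker_step` from several processes against a shared queue.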

📖 Learn more: Architecture Documentation

🔌 Open Source References

  • Automatic1111 SD Web server
  • llama.cpp

Questions? Check the Documentation or open an issue!