AutoEnv - AI Game Skin Generator

December 14, 2025

Part of the VR-Bench project

Generate themed 2D game skins for maze-type games using AI image generation with automatic visual description.

✨ Features

🎮 4 Game Types: maze, pathfinder, sokoban, trapfield
🎨 Custom Themes: Any visual style you can describe
🤖 AI Visual Description: Automatic skin description using vision models
⚡ Async Pipeline: Parallel asset generation with DAG architecture
🎯 Style Consistency: Style Anchor mechanism ensures unified look
💰 Cost Tracking: Real-time API usage and cost monitoring
🖼️ Auto Processing: Background removal and transparent PNG output

🚀 Quick Start

1. Install Dependencies

All dependencies are managed in VR-Bench's root requirements.txt:

cd /data/yc/VR-Bench
pip install -r requirements.txt

2. Configure API

Edit the /data/yc/VR-Bench/.env file:

# Image Generation API (for AutoEnv skin generation)
IMAGE_GEN_API_KEY=your_api_key_here
IMAGE_GEN_BASE_URL=https://api.openai.com/v1
IMAGE_GEN_MODEL=gemini-2.5-flash-image

💡 Note: AutoEnv uses VR-Bench's unified .env configuration. Environment variables are automatically loaded.

3. Generate Skins

cd /data/yc/VR-Bench/AutoEnv
python run_skin_generation.py \
  --maze-type maze \
  --theme "cyberpunk neon city"

🎮 Game Types

Type	Count	Components
`maze`	4	player, goal, wall, floor
`pathfinder`	3	start, end, road
`sokoban`	5	player, goal, box, wall, floor
`trapfield`	4	player, goal, trap, floor
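
The table above can be mirrored as a small lookup, which is handy for validating a `--maze-type` value before starting a run. This is a minimal sketch: `GAME_COMPONENTS` and `components_for` are hypothetical names used for illustration, and the actual definitions in maze_assets_config.py may be shaped differently.

```python
# Hypothetical mapping that mirrors the Game Types table; the real
# configuration in maze_assets_config.py may use a different structure.
GAME_COMPONENTS = {
    "maze": ["player", "goal", "wall", "floor"],
    "pathfinder": ["start", "end", "road"],
    "sokoban": ["player", "goal", "box", "wall", "floor"],
    "trapfield": ["player", "goal", "trap", "floor"],
}


def components_for(game_type: str) -> list[str]:
    """Return the component names for a game type, or raise on unknown types."""
    try:
        return GAME_COMPONENTS[game_type]
    except KeyError:
        known = ", ".join(sorted(GAME_COMPONENTS))
        raise ValueError(f"unknown game type {game_type!r}; expected one of: {known}") from None
```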

📦 Output Structure

workspace/envs/<game_type>_<timestamp>/
├── analysis.json          # Game configuration analysis
├── strategy.json          # Asset generation strategy
└── skins/                 # Generated skin assets
    ├── wall.png          # Transparent PNG (auto-cropped)
    ├── floor.png
    ├── player.png
    ├── target.png
    └── description.json  # AI-generated visual descriptions

description.json Example

{
  "game_type": "maze",
  "skin_id": "20231214_130944",
  "visual_description": {
    "player": "white rabbit",
    "goal": "orange carrots",
    "wall": "gray rock",
    "floor": "green grass tiles"
  }
}

🤖 AI-Powered: Visual descriptions are automatically generated by analyzing the actual skin images using vision models.
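
Downstream code can read these descriptions with the standard library alone. The sketch below assumes the file layout and JSON keys shown above; `load_descriptions` and `describe_scene` are hypothetical helpers, not part of AutoEnv.

```python
import json
from pathlib import Path


def load_descriptions(skin_dir):
    """Read description.json from a skins/ directory and return {component: description}."""
    path = Path(skin_dir) / "description.json"
    data = json.loads(path.read_text(encoding="utf-8"))
    return data["visual_description"]


def describe_scene(descriptions):
    """Join the per-component descriptions into one line, e.g. for a log or a prompt."""
    return "; ".join(f"{name}: {text}" for name, text in descriptions.items())
```

For example, `describe_scene(load_descriptions("workspace/envs/<game_type>_<timestamp>/skins"))` would summarize one generated skin set, with the placeholder path replaced by a real run directory.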

💡 Examples

# Medieval castle maze
python run_skin_generation.py --maze-type maze --theme "medieval stone castle"

# Candy land pathfinder
python run_skin_generation.py --maze-type pathfinder --theme "candy land with lollipops"

# Industrial warehouse sokoban
python run_skin_generation.py --maze-type sokoban --theme "industrial warehouse"

# Lava dungeon trapfield
python run_skin_generation.py --maze-type trapfield --theme "lava dungeon with fire"

🏗️ Architecture

Pipeline (DAG)

Analyzer → Strategist → AssetGenerator → BackgroundRemoval

Node	Function	Output
AnalyzerNode	Load game configuration	`analysis.json`
StrategistNode	Generate asset prompts	`strategy.json`
AssetGeneratorNode	Create images (Style Anchor + I2I)	PNG files
BackgroundRemovalNode	Process & analyze images	Transparent PNGs + `description.json`
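
To illustrate how nodes hand results to one another, here is a minimal sketch of a linear async pipeline. It is not the code in base_node.py or base_pipeline.py: the `Node` class, the shared context dict, and the two toy nodes are assumptions made for illustration, and a full DAG executor would also handle branching dependencies.

```python
import asyncio


class Node:
    """Hypothetical node base: reads a shared context dict and returns new keys."""

    async def run(self, ctx: dict) -> dict:
        raise NotImplementedError


class Analyzer(Node):
    async def run(self, ctx):
        # Stand-in for loading the game configuration.
        return {"components": ["wall", "floor", "player", "goal"]}


class Strategist(Node):
    async def run(self, ctx):
        # Stand-in for building one generation prompt per component.
        theme = ctx["theme"]
        return {"prompts": {c: f"{theme}, {c} tile" for c in ctx["components"]}}


async def run_pipeline(nodes, ctx):
    """Run nodes in order; each node sees everything earlier nodes produced."""
    for node in nodes:
        ctx = {**ctx, **await node.run(ctx)}
    return ctx


if __name__ == "__main__":
    result = asyncio.run(run_pipeline([Analyzer(), Strategist()], {"theme": "cyberpunk neon city"}))
    print(result["prompts"]["wall"])
```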

Key Mechanisms

Style Anchor: First asset (wall/road) generated via text-to-image, others use image-to-image for consistency
Parallel Generation: All non-anchor assets generated concurrently using asyncio.gather()
Vision Analysis: Each skin analyzed by multimodal LLM to generate visual descriptions
Cost Tracking: Real-time monitoring using contextvars, so cost aggregation stays consistent across concurrent async tasks
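
The Style Anchor, parallel generation, and cost tracking mechanisms can be sketched together as follows. This is an illustration under stated assumptions, not AutoEnv's implementation: the `fake_*` functions stand in for real image API calls, the per-call prices are invented, and the anchor is assumed to be `wall`.

```python
import asyncio
import contextvars

# Cost entries for the current run. A ContextVar keeps separate runs apart,
# while tasks spawned by gather() copy the context and share the same list.
_run_cost = contextvars.ContextVar("run_cost")


def add_cost(amount):
    _run_cost.get().append(amount)


async def fake_text_to_image(prompt):
    add_cost(0.04)  # invented price, for illustration only
    return f"img<{prompt}>"


async def fake_image_to_image(anchor, prompt):
    add_cost(0.02)  # invented price, for illustration only
    return f"img<{prompt} | style of {anchor}>"


async def generate_skins(prompts, anchor_name="wall"):
    """Generate the anchor first, then all other assets concurrently against it."""
    _run_cost.set([])
    anchor = await fake_text_to_image(prompts[anchor_name])
    others = [name for name in prompts if name != anchor_name]
    images = await asyncio.gather(
        *(fake_image_to_image(anchor, prompts[name]) for name in others)
    )
    assets = {anchor_name: anchor, **dict(zip(others, images))}
    return assets, round(sum(_run_cost.get()), 2)
```

Generating the anchor before anything else is what makes the remaining calls independent of each other, so they can all be awaited in a single `asyncio.gather()`.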

Project Structure

AutoEnv/
├── run_skin_generation.py          # Entry point (164 lines)
├── base/
│   ├── engine/
│   │   ├── async_llm.py           # LLM client (449 lines)
│   │   └── cost_monitor.py        # Cost tracking (116 lines)
│   ├── pipeline/
│   │   ├── base_node.py           # Node abstraction (45 lines)
│   │   └── base_pipeline.py       # DAG executor (55 lines)
│   └── utils/
│       └── image.py               # Image utilities (13 lines)
├── autoenv/
│   └── pipeline/
│       └── visual/
│           ├── nodes.py           # 4 pipeline nodes (365 lines)
│           ├── pipeline.py        # Visual pipeline (80 lines)
│           ├── prompt.py          # Prompt templates (12 lines)
│           └── maze_assets_config.py  # Asset configs (200 lines)
└── config/
    └── env_skin_gen.yaml          # Default config

Total: ~1650 lines of clean, focused code

⚙️ Configuration

Environment Variables

AutoEnv integrates with VR-Bench's unified .env configuration:

# Image Generation API (for skin generation)
IMAGE_GEN_API_KEY=your_api_key_here
IMAGE_GEN_BASE_URL=https://api.openai.com/v1
IMAGE_GEN_MODEL=gemini-2.5-flash-image

Configuration Priority

1. Environment Variables (highest) - from VR-Bench/.env
2. YAML Config (fallback) - from config/model_config.yaml
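
The priority order can be sketched as a lookup that checks the environment first and falls back to YAML values. `resolve_setting` and `YAML_DEFAULTS` are hypothetical names; the dict stands in for whatever the YAML file actually contains, and AutoEnv's own loading code may differ.

```python
import os

# Stand-in for values parsed from the YAML config; contents are illustrative only.
YAML_DEFAULTS = {
    "IMAGE_GEN_BASE_URL": "https://api.openai.com/v1",
    "IMAGE_GEN_MODEL": "gemini-2.5-flash-image",
}


def resolve_setting(name, env=None, yaml_values=None):
    """Return the environment value if set and non-empty, else the YAML fallback."""
    env = os.environ if env is None else env
    yaml_values = YAML_DEFAULTS if yaml_values is None else yaml_values
    value = env.get(name) or yaml_values.get(name)
    if value is None:
        raise KeyError(f"{name} is set neither in the environment nor in the YAML config")
    return value
```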

Supported Models

gemini-2.5-flash-image (recommended, fast & cheap)
dall-e-3 (high quality)
Any OpenAI-compatible image generation API

📋 Requirements

Python 3.10+
Dependencies: openai, pydantic, rembg, pillow, python-dotenv
Image generation API with vision capability
All dependencies managed in VR-Bench's root requirements.txt

🔗 Integration with VR-Bench

AutoEnv is part of the VR-Bench project:

✅ Unified Configuration: Shares .env file with VR-Bench
✅ Dependency Management: Single requirements.txt at VR-Bench root
✅ Output Format: Compatible with VR-Bench's skin format
✅ Cost Tracking: Integrated cost monitoring system
✅ Secure: Credentials not in version control

📊 Code Statistics

Total Lines: 1,648
Python Files: 16
Core Nodes: 4 (Analyzer, Strategist, Generator, BackgroundRemoval)
Architecture: Clean DAG pipeline with Pydantic validation

📄 License

MIT License