December 9, 2025
The First Framework for LLM-Driven Machine Design in Besiege
Paper: Agentic Design of Compositional Machines
Installation • Quick Start • Training • Citation
Table of Contents
- Overview
- Installation
- Quick Start
- Fine-tuning
- Performance Leaderboard
- RL Fine-tuning Results
- License
- Acknowledgement
- Citation
Overview
BesiegeField is a framework that lets Large Language Models (LLMs) autonomously design and build complex machines in Besiege, a physics-based construction game. The project connects LLM reasoning with creative engineering tasks.
Installation
1. Besiege Environment Setup
System Requirements
| Component | Version |
|---|---|
| Besiege | Linux v1.60-22044 |
| Ubuntu | 22.04 |
| GLIBC | 2.33 – 2.35 |
| Mono | ≥ 6.8.0.105 |
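The GLIBC range in the table can be checked from Python before installing anything else. This is a minimal sketch, not part of the framework: the helper names `parse_version` and `glibc_supported` are hypothetical, and the check relies on the standard-library `platform.libc_ver()` call.

```python
import platform


def parse_version(text):
    """Turn a dotted version string such as '2.35' into a tuple of ints."""
    return tuple(int(part) for part in text.strip().split("."))


def glibc_supported(version, low="2.33", high="2.35"):
    """Return True if `version` lies within the inclusive range [low, high]."""
    return parse_version(low) <= parse_version(version) <= parse_version(high)


if __name__ == "__main__":
    # platform.libc_ver() returns a (name, version) pair, e.g. ('glibc', '2.35');
    # both fields are empty strings when the C library cannot be determined.
    lib, version = platform.libc_ver()
    if lib == "glibc" and version:
        print(f"glibc {version}: supported = {glibc_supported(version)}")
    else:
        print("Could not detect glibc on this system.")
```

Comparing tuples of integers rather than raw strings avoids ordering mistakes such as "2.9" sorting after "2.35".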
Obtain the Game
Step 1: Purchase the official copy on Steam
Step 2: Download DepotDownloader
Step 3: Download Besiege v1.60-22044
./DepotDownloader -app 346010 -depot 346016 -manifest 2732248020700221971 \
-username <steam_user> -password <password>
Step 4: Download v1.20-17395 executables (required for headless operation)
./DepotDownloader -app 346010 -depot 346016 -manifest 5506301120812842666 \
-username <steam_user> -password <password>
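The two download steps differ only in the manifest id, so they can be generated from one helper. The sketch below only builds and prints the argument lists; the app, depot, and manifest ids are copied from the commands above, while the function name `depot_command` and the version labels used as dictionary keys are illustrative assumptions.

```python
import shlex

APP_ID = "346010"    # app id, from the commands above
DEPOT_ID = "346016"  # depot id, from the commands above

# Manifest ids copied from Step 3 and Step 4.
MANIFESTS = {
    "v1.60-22044": "2732248020700221971",
    "v1.20-17395": "5506301120812842666",
}


def depot_command(version, username, password):
    """Build the DepotDownloader argument list for one Besiege version."""
    return [
        "./DepotDownloader",
        "-app", APP_ID,
        "-depot", DEPOT_ID,
        "-manifest", MANIFESTS[version],
        "-username", username,
        "-password", password,
    ]


if __name__ == "__main__":
    for version in MANIFESTS:
        # Printed for inspection only; pass the list to subprocess.run(...)
        # once real credentials replace the placeholders.
        print(shlex.join(depot_command(version, "<steam_user>", "<password>")))
```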
Tip: Find other manifests on SteamDB if needed.
Download the Plugin
BesiegeField Plugin (Google Drive)
Install Dependencies
Standard Installation:
sudo apt install mono-complete xvfb # xvfb only for headless workstation
mono --version # Verify ≥ 6.8.0.105
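The version check can also be scripted. The sketch below assumes that `mono --version` prints a first line of the form `Mono JIT compiler version 6.8.0.105 (...)`; the helper names `mono_version` and `mono_is_supported` are hypothetical.

```python
import re
import subprocess

MIN_MONO = (6, 8, 0, 105)  # minimum version from the requirements table


def mono_version(output):
    """Extract the first dotted version number from `mono --version` output."""
    match = re.search(r"version\s+(\d+(?:\.\d+)+)", output)
    if match is None:
        raise ValueError("no version number found in mono output")
    return tuple(int(part) for part in match.group(1).split("."))


def mono_is_supported(output, minimum=MIN_MONO):
    """Return True if the reported version is at least `minimum`."""
    return mono_version(output) >= minimum


if __name__ == "__main__":
    try:
        result = subprocess.run(
            ["mono", "--version"], capture_output=True, text=True, check=True
        )
        print("Mono supported:", mono_is_supported(result.stdout))
    except (FileNotFoundError, subprocess.CalledProcessError, ValueError) as exc:
        print("Could not determine the mono version:", exc)
```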
Offline/Manual Installation
If apt is unavailable, use manual installation:
# Install mono
cd /path/to/tar
tar -xzf mono-complete-offline.tar.gz
for deb in *.deb; do dpkg -x "$deb" .; done
export PATH="/path/to/mono/usr/bin:$PATH"
export LD_LIBRARY_PATH="/path/to/mono/usr/lib:$LD_LIBRARY_PATH"
export PKG_CONFIG_PATH="/path/to/mono/usr/lib/pkgconfig:$PKG_CONFIG_PATH"
# Make permanent
cat >> ~/.bashrc <<EOF
export PATH="/path/to/mono/usr/bin:\$PATH"
export LD_LIBRARY_PATH="/path/to/mono/usr/lib:\$LD_LIBRARY_PATH"
export PKG_CONFIG_PATH="/path/to/mono/usr/lib/pkgconfig:\$PKG_CONFIG_PATH"
EOF
source ~/.bashrc
# Install xvfb
cd /path/to/xvfb
tar -xzf xvfb-offline.tar.gz
dpkg -i *.deb
Install BesiegeField Plugin
Step 1: Extract the plugin archive and copy all files into the v1.60-22044 game folder
Step 2: Copy Besiege.x86 & Besiege.x86_64 from v1.20-17395 into v1.60-22044, overwriting the originals
Warning: This enables headless/code control but makes a normal GUI start unstable. Keep a backup if you want to launch v1.60 visually.
Step 3: Set permissions
chmod -R 777 /path/to/Besiege
Step 4: Test the vanilla game (use backup copy)
cd /path/to/backup/Besiege && ./run.sh
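Step 2 and the backup advice can be combined into one script. This is a minimal sketch under stated assumptions: `swap_executables` and its three directory arguments are hypothetical, and it backs up only the two overwritten executables, whereas Step 4 expects a backup copy of the whole game folder.

```python
import shutil
from pathlib import Path

EXECUTABLES = ("Besiege.x86", "Besiege.x86_64")


def swap_executables(old_dir, new_dir, backup_dir):
    """Back up the v1.60 executables, then overwrite them with the v1.20 ones.

    Returns the list of files written into `new_dir`.
    """
    old_dir, new_dir, backup_dir = Path(old_dir), Path(new_dir), Path(backup_dir)
    backup_dir.mkdir(parents=True, exist_ok=True)
    written = []
    for name in EXECUTABLES:
        source = old_dir / name
        target = new_dir / name
        if not source.is_file():
            raise FileNotFoundError(f"missing v1.20 executable: {source}")
        if target.is_file():
            # copy2 keeps the permission bits and timestamps of the original
            shutil.copy2(target, backup_dir / name)
        shutil.copy2(source, target)
        written.append(target)
    return written
```

A call such as `swap_executables("/path/to/v1.20", "/path/to/Besiege", "/path/to/backup/executables")` would perform the swap; all three paths are placeholders.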
2. AgenticFlow Installation
Create Conda Environment
conda env create -f environment_inferenceonly.yaml
conda activate <env_name>
Path Configuration
Folder Structure:
your-project/
├── Besiege/ # Game installation
└── AgenticCodes/ # Framework code
Edit AgenticCodes/config.py:
| Parameter | Description |
|---|---|
| APIPATH | Path to the file storing the LLM type, API key, etc. Fill it in yourself. |
| DEFAULT_SAVE_ROOT | Root directory for LLM outputs |
| SCRIPT_PATH | Must point to Besiege/run_besiegefield.sh |
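The three parameters might look like the sketch below once filled in. Only the parameter names and the `run_besiegefield.sh` requirement come from the table; the concrete paths, the `api_keys.json` file name, the use of `pathlib.Path`, and the `config_problems` helper are assumptions for illustration, and the real `config.py` may be organized differently.

```python
from pathlib import Path

# Hypothetical values; replace each with the real location on your machine.
PROJECT_ROOT = Path("/path/to/your-project")

APIPATH = PROJECT_ROOT / "api_keys.json"      # hypothetical file name
DEFAULT_SAVE_ROOT = PROJECT_ROOT / "outputs"  # hypothetical directory
SCRIPT_PATH = PROJECT_ROOT / "Besiege" / "run_besiegefield.sh"


def config_problems(apipath, save_root, script_path):
    """Return human-readable problems found in the three path settings."""
    problems = []
    if not Path(apipath).is_file():
        problems.append(f"APIPATH does not exist: {apipath}")
    if Path(script_path).name != "run_besiegefield.sh":
        problems.append("SCRIPT_PATH must point to Besiege/run_besiegefield.sh")
    elif not Path(script_path).is_file():
        problems.append(f"SCRIPT_PATH does not exist: {script_path}")
    if Path(save_root).exists() and not Path(save_root).is_dir():
        problems.append(f"DEFAULT_SAVE_ROOT is not a directory: {save_root}")
    return problems


if __name__ == "__main__":
    for problem in config_problems(APIPATH, DEFAULT_SAVE_ROOT, SCRIPT_PATH):
        print("config problem:", problem)
```

Running a check like this before `main.py` surfaces path mistakes early instead of midway through a simulation run.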
Quick Start
Catapult Task
Design a machine to throw projectiles:
python main.py \
-use_model deepseek-chat \
-task catapult/catapult_level1 \
-env_num 2 \
-user_input "Design a machine to throw a boulder (type id 36) in a parabolic trajectory."
Car Task
Design a machine to move forward:
python main.py \
-use_model deepseek-chat \
-task car/car_level1 \
-env_num 2 \
-user_input "Design a machine to move forward on a straight road."
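Both Quick Start commands share the same flags, so several tasks can be queued from one script. The sketch below only builds and prints the commands; the flags, task ids, and prompts are copied from the examples above, while `main_command` and the `TASKS` mapping are hypothetical names.

```python
import shlex

# Task ids and prompts copied from the two Quick Start commands.
TASKS = {
    "catapult/catapult_level1":
        "Design a machine to throw a boulder (type id 36) in a parabolic trajectory.",
    "car/car_level1":
        "Design a machine to move forward on a straight road.",
}


def main_command(task, user_input, model="deepseek-chat", env_num=2):
    """Build the `python main.py` argument list for one task."""
    return [
        "python", "main.py",
        "-use_model", model,
        "-task", task,
        "-env_num", str(env_num),
        "-user_input", user_input,
    ]


if __name__ == "__main__":
    for task, prompt in TASKS.items():
        # Printed for inspection; subprocess.run(cmd, check=True) would execute it.
        print(shlex.join(main_command(task, prompt)))
```

Passing the arguments as a list keeps the quoted prompt intact without manual shell escaping.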
Available Tasks
Explore all available tasks in environments/env_files/level_menus.json
Testing Your Designs
- Generated .bsgmachine files appear in DEFAULT_SAVE_ROOT
- Copy them to Besiege/Besiege_Data/SavedMachines
- Run ./run.sh to launch the game
- Inspect and test your AI-designed machines in-game!
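The copy step above can be automated. This is a minimal sketch: `install_machines` is a hypothetical helper, and it assumes the generated `.bsgmachine` files may sit in subfolders of `DEFAULT_SAVE_ROOT`, which the source does not specify.

```python
import shutil
from pathlib import Path


def install_machines(save_root, besiege_dir):
    """Copy every .bsgmachine file under `save_root` into SavedMachines.

    Returns the sorted names of the copied files.
    """
    target_dir = Path(besiege_dir) / "Besiege_Data" / "SavedMachines"
    target_dir.mkdir(parents=True, exist_ok=True)
    copied = []
    # rglob also searches subfolders; that nesting is an assumption.
    for machine in sorted(Path(save_root).rglob("*.bsgmachine")):
        shutil.copy2(machine, target_dir / machine.name)
        copied.append(machine.name)
    return sorted(copied)
```

Files that share a name overwrite each other in the target folder, so rename them first if several runs produce identically named machines.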
LLM Fine-tuning
Install Training Environment
Add training-related packages:
conda activate <env_name>
pip install -r requirements_rl.txt
Cold Start Training
Step 1: Run cold start with orthogonal finetuning (the dataset is downloaded from Hugging Face)
cd PostTraining/ColdStart
./run_cold_start.sh <model_path>
To try cold start with the human dataset instead (not recommended), run:
cd PostTraining/ColdStart
./run_cold_start.sh <model_path> true
Step 2: Merge Checkpoints
Fill in the paths in merge_ckpts.py before running:
python merge_ckpts.py
Reinforcement Learning
Configure rl_config.yaml with your settings (important!), then run:
cd PostTraining/RL
./rl_single_agent_light.sh
Performance Leaderboard
Catapult Task
Performance metrics across different models and methods:
| Models | Single-agent Mean | Single-agent Max | Single-agent Std | Iterative Editing Mean | Iterative Editing Max | Iterative Editing Std | Hierarchical Design Mean | Hierarchical Design Max | Hierarchical Design Std |
|---|---|---|---|---|---|---|---|---|---|
| Gemini 2.5 Pro | 2.30 | 9.00 | 3.86 | 4.67 | 21.95 | 8.68 | 9.83 | 18.19 | 8.35 |
| OpenAI o3 | 2.87 | 5.22 | 1.96 | 9.14 | 14.01 | 3.71 | 2.00 | 11.11 | 3.98 |
| Qwen3-Coder-480B-A35B | 1.75 | 9.24 | 3.17 | 5.10 | 12.02 | 5.54 | 3.90 | 6.52 | 2.54 |
| Doubao Seed 1.6-250615 | 3.18 | 8.20 | 2.99 | 4.82 | 9.10 | 3.41 | 1.73 | 4.76 | 2.39 |
| Claude Opus 4-20250514 | 1.19 | 4.82 | 2.21 | 1.18 | 4.91 | 2.18 | 2.27 | 9.32 | 4.22 |
| DeepSeek-V3 | 3.50 | 4.86 | 2.17 | 3.07 | 5.24 | 2.55 | 2.41 | 4.93 | 2.58 |
| Kimi K2-0711-preview | 2.57 | 9.05 | 3.72 | 2.82 | 11.39 | 5.23 | 5.39 | 12.02 | 5.16 |
| Llama 4 Scout 17B 16E | 3.18 | 5.64 | 1.95 | 1.28 | 5.94 | 2.41 | 3.59 | 11.83 | 4.15 |
Car Task
Performance metrics across different models and methods:
| Models | Single-agent Mean | Single-agent Max | Single-agent Std | Iterative Editing Mean | Iterative Editing Max | Iterative Editing Std | Hierarchical Design Mean | Hierarchical Design Max | Hierarchical Design Std |
|---|---|---|---|---|---|---|---|---|---|
| Gemini 2.5 Pro | 33.96 | 40.85 | 6.73 | 34.34 | 41.66 | 13.96 | 29.96 | 41.52 | 7.78 |
| OpenAI o3 | 15.28 | 32.08 | 8.97 | 14.34 | 35.08 | 11.79 | 28.39 | 36.18 | 11.01 |
| Qwen3-Coder-480B-A35B | 8.87 | 11.50 | 4.46 | 15.24 | 28.95 | 13.12 | 12.59 | 34.05 | 10.78 |
| Doubao Seed 1.6-250615 | 3.51 | 9.40 | 4.85 | 8.11 | 10.04 | 3.58 | 18.75 | 26.02 | 4.38 |
| Claude Opus 4-20250514 | 9.83 | 12.98 | 1.28 | 8.07 | 28.04 | 12.48 | 14.56 | 38.67 | 20.69 |
| DeepSeek-V3 | 9.06 | 10.53 | 3.68 | 8.23 | 18.84 | 7.12 | 17.92 | 31.94 | 12.85 |
| Kimi K2-0711-preview | 1.75 | 8.09 | 2.80 | 14.36 | 28.34 | 9.47 | 1.94 | 14.99 | 5.48 |
| Llama 4 Scout 17B 16E | 0.02 | 0.03 | 0.01 | 3.04 | 12.76 | 5.23 | 1.55 | 2.00 | 0.32 |
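The Mean, Max, and Std columns in the two tables above are summary statistics for each model and method. The sketch below shows how such a summary could be computed from a list of per-run scores. The scores are made-up placeholders, not results from the paper, and because the source does not say whether Std is the population or the sample standard deviation, the hypothetical `summarize` helper reports both.

```python
import statistics


def summarize(scores):
    """Return mean, max, and both standard deviations for a list of run scores."""
    return {
        "mean": statistics.fmean(scores),
        "max": max(scores),
        "std_population": statistics.pstdev(scores),
        "std_sample": statistics.stdev(scores),
    }


if __name__ == "__main__":
    placeholder_scores = [0.0, 2.0, 4.0, 6.0]  # hypothetical, not from the paper
    for name, value in summarize(placeholder_scores).items():
        print(f"{name}: {value:.2f}")
```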
RL-Finetuned LLM Results
Performance of the Qwen2.5-14B-Instruct model under different training strategies:
| Models | Catapult Validity Ratio | Catapult Mean Score | Catapult Max Score | Car Validity Ratio | Car Mean Score | Car Max Score |
|---|---|---|---|---|---|---|
| Qwen2.5-14B-Instruct | 11/50 | 0.06 | 2.41 | 46/50 | 4.97 | 19.10 |
| Qwen2.5-14B-Instruct + Cold-Start | 9/50 | 0.11 | 5.54 | 40/50 | 4.67 | 20.23 |
| Qwen2.5-14B-Instruct + RL | 12/50 | 0.13 | 5.92 | 41/50 | 3.72 | 24.08 |
| Qwen2.5-14B-Instruct + Cold-Start + RL | 11/50 | 0.14 | 7.14 | 42/50 | 5.05 | 45.72 |
Citation
If you find this repository useful for your research or projects, please consider citing our work:
@article{zhang2025besiegefield,
title={Agentic Design of Compositional Machines},
author={Zhang, Wenqian and Liu, Weiyang and Liu, Zhen},
journal={arXiv preprint arXiv:2510.14980},
year={2025}
}
License
This project is licensed under the MIT License - see the LICENSE file for details.
Acknowledgement
We'd like to thank the developers of Besiege for creating such an inspiring game and for nurturing such a vibrant player community; without them, this project wouldn't exist.
Big thanks also to the BepInEx team for their amazing modding framework, which made it possible for us to push the boundaries of what's possible in Besiege.
Star History
If you find this project helpful, please consider giving it a star!
License
This project is licensed under the Creative Commons Attribution-NonCommercial 4.0 International License - see the LICENSE file for details.