SWE-MiniSandbox

March 4, 2026

SWE-MiniSandbox is a lightweight, container-free framework for isolating batched SWE agent interactions and computing execution-based rewards. It also supports multi-node, on-policy reinforcement learning training without relying on Kubernetes or other container orchestration systems.

The framework relies on Linux namespaces, chroot, and bind mounts to create secure and efficient environments where SWE agents can interact with the system while remaining isolated from one another.
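As a rough illustration of how these three primitives can be combined, the sketch below assembles a single command line that enters a new mount namespace, bind-mounts host directories into a root filesystem, and then chroots into it. This is not SWE-MiniSandbox's actual implementation: the function name `build_sandbox_command` and the example paths are hypothetical, and the command is only constructed, not executed, because running it normally requires root or equivalent privileges.

```python
import shlex
from pathlib import PurePosixPath


def build_sandbox_command(rootfs, binds, command):
    """Build an argv list that runs `command` chrooted into `rootfs`.

    Hypothetical helper for illustration only.
    rootfs:  path of the directory that becomes the sandbox root
    binds:   list of (host_path, path_inside_sandbox) pairs
    command: argv list to execute inside the sandbox
    """
    steps = []
    for src, dst in binds:
        # Mount point is the in-sandbox path re-rooted under rootfs.
        target = PurePosixPath(rootfs) / dst.lstrip("/")
        steps.append(
            f"mount --bind {shlex.quote(src)} {shlex.quote(str(target))}"
        )
    # chroot last, so the bind mounts are already visible under rootfs.
    steps.append(f"exec chroot {shlex.quote(rootfs)} {shlex.join(command)}")
    # `unshare --mount` gives the shell its own mount namespace, so the
    # bind mounts disappear when the sandboxed process exits.
    return ["unshare", "--mount", "sh", "-c", " && ".join(steps)]


if __name__ == "__main__":
    argv = build_sandbox_command(
        "/srv/sandbox/agent-0",            # assumed example path
        [("/data/repo", "/workspace")],    # assumed example bind
        ["python", "-V"],
    )
    print(shlex.join(argv))
```

Because the mounts are made inside a private mount namespace, one agent's bind mounts are not visible to other agents or to the host, which is the isolation property described above.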

Python environments are isolated with venv, so each agent has its own dependencies and cannot interfere with another agent's. The venvs are managed with different versions of Conda, which gives flexibility in the software stack available to each agent.
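A minimal sketch of the per-agent venv idea, using only Python's standard-library `venv` module. The helper name `create_agent_venv` and the `agent-<id>` directory layout are assumptions made for illustration; the sketch does not cover how SWE-MiniSandbox combines venvs with Conda.

```python
import tempfile
import venv
from pathlib import Path


def create_agent_venv(base_dir, agent_id):
    """Create an isolated virtual environment for one agent.

    Hypothetical helper: each agent gets its own directory under
    base_dir, so packages installed for one agent never touch another.
    """
    env_dir = Path(base_dir) / f"agent-{agent_id}"
    # with_pip=False keeps creation fast; pip can be bootstrapped later.
    venv.create(env_dir, with_pip=False)
    return env_dir


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as base:
        envs = [create_agent_venv(base, i) for i in range(2)]
        for env in envs:
            # Every venv has its own pyvenv.cfg at its root.
            print(env.name, (env / "pyvenv.cfg").is_file())
```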

SWE-MiniSandbox achieves training performance comparable to traditional Docker-based setups. We provide a WandB training demo comparing the two:

👉 https://wandb.ai/open_source_blank/SWE-MiniSandbox

Using SWE-bench/SWE-agent-LM-7B, we trained on 1,600 SWE-smith samples for 200 steps under both the MiniSandbox and Docker implementations and observed comparable performance.

Additional experiments and detailed methodology can be found in our paper:

👉 https://arxiv.org/abs/2602.11210


🚀 Get Started

Read our documentation for detailed instructions on installing and using SWE-MiniSandbox. Before you begin, ensure your machine meets the system requirements below.

System Requirements

  • Linux OS (Ubuntu 20.04+ recommended)
  • Namespace isolation support (the unshare --mount command must be available)
  • Bind mount support (the mount --bind command must be available)
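The requirements above can be pre-checked with a short script. The following is a sketch, not part of SWE-MiniSandbox: the function name `check_requirements` is hypothetical, and it only confirms that the platform is Linux and that the `unshare` and `mount` binaries are on `PATH`. It does not verify that the current user has the privileges needed to actually create namespaces or bind mounts.

```python
import platform
import shutil


def check_requirements(which=shutil.which, system=platform.system):
    """Report whether the basic prerequisites appear to be present.

    `which` and `system` are injectable so the check can be tested
    without depending on the host it runs on.
    """
    return {
        "linux": system() == "Linux",
        "unshare": which("unshare") is not None,
        "mount": which("mount") is not None,
    }


if __name__ == "__main__":
    for name, ok in check_requirements().items():
        print(f"{name}: {'ok' if ok else 'MISSING'}")
```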

Installation

We provide two installation options:

  • Docker image for quick setup
  • Manual installation for full control

See the Installation Guide for step-by-step instructions.

Quick Start

To get up and running quickly, follow the steps in our Quick Start Guide.


Additional Resources


Acknowledgements

SWE-MiniSandbox is built upon:

Some components are adapted from R2E-Gym.


Citation

@misc{yuan2026sweminisandboxcontainerfreereinforcementlearning,
      title={SWE-MiniSandbox: Container-Free Reinforcement Learning for Building Software Engineering Agents}, 
      author={Danlong Yuan and Wei Wu and Zhengren Wang and Xueliang Zhao and Huishuai Zhang and Dongyan Zhao},
      year={2026},
      eprint={2602.11210},
      archivePrefix={arXiv},
      primaryClass={cs.SE},
      url={https://arxiv.org/abs/2602.11210}, 
}