Uni-Agent: Build, Run, and Train Agents at Scale

June 15, 2026

Uni-Agent is a unified framework for general agents at scale.

  • All-in-one stack: one framework for building, running, and training agents.
  • Unified agent interface: unified abstractions for diverse and complex real-world agent scenarios.

The long-term vision is to build backend infrastructure that spans both inference and training, enabling agents to perceive, act in, and explore complex real-world tasks.

Highlights ✨

Unified yet decoupled agent stack: Uni-Agent organizes agents around model, tool, and env, so each layer can be swapped independently while still composing into one unified interaction framework.

Large-scale parallel interaction: Uni-Agent supports high-throughput, stable parallel inference, execution, and verification for 1000+ concurrent agent tasks.

One stack from inference to training: Uni-Agent reuses the same interaction stack from large-scale agent execution to RL training, with support for advanced paradigms such as fully-async and partial rollout.
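As a rough illustration of the parallel-interaction highlight, the sketch below bounds the number of agent tasks in flight with an `asyncio.Semaphore`. It is not Uni-Agent's scheduler: `run_task`, `run_all`, and the concurrency limit of 128 are hypothetical names and values chosen for this example.

```python
import asyncio


async def run_task(task_id: int) -> dict:
    """Hypothetical stand-in for one agent episode (inference, execution, verification)."""
    await asyncio.sleep(0)  # placeholder for real model and sandbox I/O
    return {"task_id": task_id, "resolved": task_id % 2 == 0}


async def run_all(num_tasks: int, max_concurrency: int) -> list[dict]:
    """Run all tasks, allowing at most `max_concurrency` of them in flight at once."""
    semaphore = asyncio.Semaphore(max_concurrency)

    async def guarded(task_id: int) -> dict:
        async with semaphore:
            return await run_task(task_id)

    # gather preserves input order, so results[i] belongs to task i.
    return await asyncio.gather(*(guarded(i) for i in range(num_tasks)))


results = asyncio.run(run_all(num_tasks=1000, max_concurrency=128))
print(len(results), "tasks finished")
```

Capping concurrency this way keeps a large batch from overwhelming the model server or sandbox backend while still keeping every slot busy.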

Quickstart 🚀

Start with the docs below:

Architecture 🧩

[Figure: Uni-Agent architecture overview]

Uni-Agent is built around a unified interaction loop with three parts: model, tool, and env.

  • model is the reasoning backend that decides what to do next.
  • tool is the interface through which the model perceives and acts on the env.
  • env is the runtime environment where actions are executed and state is preserved.
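The three roles above can be sketched as a minimal loop. This is a toy illustration of the model/tool/env split, not Uni-Agent's actual interface: every class and function name here (`Env`, `Model`, `EchoEnv`, `ScriptedModel`, `shell_tool`, `run_episode`) is hypothetical.

```python
from typing import Protocol


class Env(Protocol):
    def execute(self, command: str) -> str: ...


class Model(Protocol):
    def next_action(self, history: list[str]) -> str: ...


class EchoEnv:
    """Toy env: records every command, so state persists across steps."""

    def __init__(self) -> None:
        self.log: list[str] = []

    def execute(self, command: str) -> str:
        self.log.append(command)
        return f"ran: {command}"


class ScriptedModel:
    """Toy model: replays a fixed plan, then signals that it is done."""

    def __init__(self, plan: list[str]) -> None:
        self.plan = list(plan)

    def next_action(self, history: list[str]) -> str:
        return self.plan.pop(0) if self.plan else "DONE"


def shell_tool(env: Env, command: str) -> str:
    """Tool: the only path through which the model touches the env."""
    return env.execute(command)


def run_episode(model: Model, env: Env, max_turns: int = 10) -> list[str]:
    """Alternate model decisions and tool calls until the model stops or turns run out."""
    history: list[str] = []
    for _ in range(max_turns):
        action = model.next_action(history)
        if action == "DONE":
            break
        history.append(shell_tool(env, action))
    return history


env = EchoEnv()
history = run_episode(ScriptedModel(["ls", "pytest"]), env)
print(history)
```

Because the loop depends only on the two protocols, either side can be swapped (a different model backend, a remote sandbox env) without touching `run_episode`.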

This interaction stack is used for large-scale agent execution and can be connected to verl for scalable RL training.

Installation 📦

Uni-Agent builds on top of the latest verl release and can use it as a regular Python package.

git submodule update --init --recursive
pip install --no-deps -e ./verl

# Other Dependencies
pip install swe-rex loguru pydantic pydantic_settings aiohttp

See the Installation guide in the docs for full instructions.
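After installing, a quick check like the one below can confirm that the packages are importable. The import names are assumptions derived from the pip package names above (in particular, swe-rex is assumed to import as `swerex`), and `missing_modules` is a helper written for this example.

```python
from importlib.util import find_spec

# Import names assumed for the packages installed above;
# the swe-rex distribution is assumed to import as "swerex".
REQUIRED_MODULES = ["verl", "swerex", "loguru", "pydantic", "pydantic_settings", "aiohttp"]


def missing_modules(names: list[str]) -> list[str]:
    """Return the names that cannot be found on the current import path."""
    return [name for name in names if find_spec(name) is None]


missing = missing_modules(REQUIRED_MODULES)
print("missing:", missing or "none")
```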

Live Dashboard 👀

[Figure: Uni-Agent Dashboard overview]

Uni-Agent includes a lightweight dashboard for monitoring large parallel runs in real time. It is designed for workloads such as parallel inference and reinforcement learning.

Start the dashboard from the repository root:

python -m dashboard.server --log-dir /tmp/swebench_qwen3_coder --port 8765

See dashboard/README.md for more details.
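To confirm the dashboard process is listening before opening it in a browser, a plain TCP check is enough. The sketch assumes the server binds to the local machine; port 8765 comes from the command above, while the host `127.0.0.1` and the helper `port_is_open` are assumptions for this example.

```python
import socket


def port_is_open(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to (host, port) succeeds within `timeout` seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


# 8765 matches the --port value used above; the host is an assumption.
print("dashboard reachable:", port_is_open("127.0.0.1", 8765))
```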

Results 📊

Parallel Inference & Verification

We compare Uni-Agent with existing agent systems on parallel inference and verification workloads.

| Model | Benchmark | OpenHands | Uni-Agent | Setting |
|---|---|---|---|---|
| Qwen3-Coder-30B | SWE-Bench Verified | - | 49.2 | Avg@4, 100 turns, 128K |
| Qwen3-Coder-480B | SWE-Bench Verified | 62.4 | 64.2 | Avg@4, 500 turns, 256K |
| Qwen3-Coder-Next | SWE-Bench Verified | 66.6 | 67.6 | Avg@4, 300 turns, 128K |
| Qwen3.5-35B-A3B | SWE-Bench Verified | 62.0 | 68.4 | Avg@1, 200 turns, 128K |
| Qwen3.6-35B-A3B | Terminal-Bench v2 | - | 42.5 | Avg@1, 200K |
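The Setting column reports scores as Avg@k. Assuming this means the resolved rate averaged over k independent runs of the benchmark (a common reading, not something the table itself spells out), the computation can be sketched as follows. The outcomes below are made up for illustration and `avg_at_k` is a hypothetical helper.

```python
def avg_at_k(runs: list[list[bool]]) -> float:
    """Mean resolved rate (in percent) across k independent runs.

    Each inner list holds one run's per-task outcomes (True = resolved).
    """
    if not runs:
        raise ValueError("need at least one run")
    per_run = [100.0 * sum(run) / len(run) for run in runs]
    return sum(per_run) / len(per_run)


# Made-up outcomes for 4 runs over 5 tasks, for illustration only.
runs = [
    [True, True, False, True, False],   # 3 of 5 resolved
    [True, False, False, True, False],  # 2 of 5 resolved
    [True, True, True, True, False],    # 4 of 5 resolved
    [True, True, False, True, False],   # 3 of 5 resolved
]
score = avg_at_k(runs)
print(f"Avg@{len(runs)} = {score:.1f}")
```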

Agent Reinforcement Learning

Uni-Agent supports agent RL training with the same interaction stack used at inference time. We provide fully async training recipes across multiple tasks, models and datasets, with GRPO/GSPO-style objectives and partial rollout support. Example scripts are available in examples/agent_train.

| Model | Dataset | Method | Setting | Base | RL |
|---|---|---|---|---|---|
| Qwen3-30B-A3B-Instruct | R2E-Gym | GSPO | Fully Async, 100 turns, 128K | 22.2 | 36.8 |
| Qwen3-Coder-30B-A3B-Instruct | R2E-Gym | GSPO | Fully Async, 100 turns, 128K | 46.2 | 52.0 |
| Qwen3.5-9B | SWE-reBench | GRPO | Fully Async, 100 turns, 128K | 53.8 | 59.2 |

More training dynamics, including reward, validation score, and average-turn curves, are available in the agent training guide.
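For readers new to GRPO-style objectives: they are usually described as scoring each rollout relative to the other rollouts sampled for the same task. The sketch below shows that group normalization in its textbook form. It is not taken from the Uni-Agent recipes; the function name, the use of the population standard deviation, and the `eps` value are assumptions for this example.

```python
from statistics import mean, pstdev


def group_relative_advantages(rewards: list[float], eps: float = 1e-6) -> list[float]:
    """Normalize rewards within one group of rollouts sampled for the same task."""
    mu = mean(rewards)
    sigma = pstdev(rewards)  # population std; implementations differ on this choice
    # eps keeps the division finite when every rollout gets the same reward.
    return [(r - mu) / (sigma + eps) for r in rewards]


# Made-up binary rewards for 4 rollouts of one task (1.0 = verification passed).
advantages = group_relative_advantages([1.0, 0.0, 0.0, 1.0])
print([round(a, 3) for a in advantages])
```

Rollouts that beat their group's average get a positive advantage and the rest get a negative one, so no separate value model is needed to supply a baseline.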

Roadmap 🗺️

The roadmap below highlights the next major directions for Uni-Agent.

Environment Support

  • Local deployment support.
  • Modal deployment support.
  • More cloud deployment backends (e.g., Yuanrong Sandbox Management System).

Tool and Task Support

  • GUI tool support.
  • Integration of Skills.
  • More built-in tools and task patterns.

Model Support

  • DeepSeek model support.
  • Multimodal model support.

Agent Integration

  • Black-box integration of additional third-party agents (Ref: RFC #5790).

Performance Optimization

  • Optimize Agentic RL rollout performance (Ref: Issue #6383).

Acknowledgement 🙏

Uni-Agent's large-scale parallel interaction and verification rely on remote sandbox backends. We gratefully acknowledge:

  • veFaaS: Volcengine Function-as-a-Service, used as a serverless backend for elastically launching agent sandboxes at scale.
  • Modal: serverless cloud compute used to spin up isolated, reproducible sandbox environments for agent execution and evaluation.

Citation 📚

If you find the project helpful, please cite:

@misc{uniagent_github,
  author       = {Yuyang Ding and Bo Wen and Xubo Cao and Zhiqiang Zhai and Guangming Sheng and Xibin Wu and Juntao Li and Min Zhang and Uni-Agent Contributors},
  title        = {Uni-Agent: Build, Run, and Train Agents at Scale},
  year         = {2026},
  howpublished = {\url{https://github.com/verl-project/uni-agent}},
  note         = {GitHub repository. Supervisor: Xibin Wu and Juntao Li},
  urldate      = {2026-03-27}
}

Contributing 🤝

Community contributions are welcome. See CONTRIBUTING.md for guidelines on how to get started.