verltool_v0.6.0_upgrade.md
November 16, 2025
Upgrade Notes for v0.6.0
To support the latest open-source models and verl features, VerlTool has reorganized its codebase around verl 0.6.0 and vLLM 0.11.0. Below are the key changes, along with instructions for upgrading an existing VerlTool setup to work with these versions.
- `verl-tool`'s codebase has been completely reorganized. Thanks to verl's agent loop abstraction, all of `verl-tool`'s agentic logic now lives in a single file, `verl_tool/agent_loop/verltool_agent_loop.py`, with the main agent loop logic in fewer than 200 lines of code. This greatly improves the modularity and maintainability of the codebase. Please refer to the new code structure when making any custom modifications.
- `verl-tool` keeps its support for training both text-only LLMs and multi-modal models, with `math_tir` and `pixel_reasoner` as the respective examples.
- We strictly enforce the "tokens-in" and "tokens-out" design to avoid potential off-policy issues introduced by tokenization.
- We keep all of `verl-tool`'s custom replacements of verl classes and functions in `verl_tool/trainer/ppo/ray_trainer.py` for better maintainability. If you are trying to understand how `verl-tool` replaces verl's default implementations, please refer to this file.
- Step records are saved via verl's native `trainer.rollout_data_dir` argument (e.g. `trainer.rollout_data_dir=$(pwd)/verl_step_records/$run_name`). You need to set it in your training scripts to save the rollout data.
- `verl-tool` now supports hybrid training with and without tools. When preparing the data, simply set the `use_tool` field in the data samples to indicate whether the sample requires tool usage. The agent loop will automatically decide whether to call the tool server based on this field.
- The old `verl_tool` with verl `0.4.1.dev` is archived in the `verl-0.4.1` branch for backward compatibility.
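
To illustrate why the "tokens-in" and "tokens-out" design matters, here is a minimal sketch of how decoding sampled tokens to text and tokenizing that text again can produce a different token sequence. The vocabulary and the `encode`/`decode` helpers are hypothetical toys written for this example; they are not part of verl, vLLM, or `verl-tool`.

```python
# Hypothetical toy vocabulary, chosen so that two different token
# sequences decode to the same text.
VOCAB = {"a": 0, "b": 1, "ab": 2}
ID_TO_TEXT = {token_id: piece for piece, token_id in VOCAB.items()}


def decode(token_ids):
    """Concatenate the text piece of every token id."""
    return "".join(ID_TO_TEXT[i] for i in token_ids)


def encode(text):
    """Greedy longest-match tokenization over VOCAB (toy stand-in for a real tokenizer)."""
    ids, pos = [], 0
    pieces = sorted(VOCAB, key=len, reverse=True)  # try longer pieces first
    while pos < len(text):
        for piece in pieces:
            if text.startswith(piece, pos):
                ids.append(VOCAB[piece])
                pos += len(piece)
                break
        else:
            raise ValueError(f"cannot tokenize {text[pos:]!r}")
    return ids


# Tokens the policy actually sampled during rollout: "a" followed by "b".
sampled_ids = [0, 1]

# Text round trip: the greedy tokenizer merges "a" + "b" into the single token "ab".
text = decode(sampled_ids)
retokenized_ids = encode(text)

# Same text, different tokens: training on retokenized_ids would score a
# sequence the policy never sampled.
print(sampled_ids, retokenized_ids, sampled_ids == retokenized_ids)
```

Both sequences decode to the same string, yet only `sampled_ids` matches what the policy generated. Passing the sampled token ids through the loop unchanged, rather than round-tripping through text, keeps the training data consistent with the rollout.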