March 10, 2026
ARM: Adaptive Reasoning Model
ARM (Adaptive Reasoning Model) is a reasoning model that adaptively selects an appropriate reasoning format for the task at hand.
Updates
- 2026/03/10: We further propose CODA: A difficulty-aware compute allocation method for adaptive reasoning, enabling models to spend fewer tokens on easy problems and more on hard ones.
- 2025/05/27: Thrilled to release ARM: A reasoning model capable of adaptively selecting reasoning formats based on the task, achieving a better trade-off between effectiveness and efficiency!
Data & Model
You can download our dataset and model from 🤗HuggingFace.
Environments
This repository contains the codebase for SFT and RL, built on LLaMA-Factory and VeRL respectively. Each stage uses its own conda environment:
# SFT
conda env create -f environment/llama_factory_env.yaml
conda activate arm_llama_factory
# RL
conda env create -f environment/verl_env.yaml
conda activate arm_verl
pip3 install --force-reinstall torch==2.4.0 --index-url https://download.pytorch.org/whl/cu124
pip3 install flash-attn --no-build-isolation
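Before starting a long run, it can help to confirm that the key packages import cleanly in the active environment. The following is a minimal sketch and is not part of the repository; `check_module` is a hypothetical helper name, and it only tests that a module can be imported, not that CUDA works.

```shell
# Report whether a Python module can be imported in the current environment.
# Illustrative helper (hypothetical name), not shipped with this repository.
check_module() {
  # $1: Python module name to import
  if python3 -c "import $1" >/dev/null 2>&1; then
    echo "$1: ok"
  else
    echo "$1: MISSING"
  fi
}

# Run inside the arm_verl environment after the pip3 installs above.
check_module torch
check_module flash_attn
```

If either line reports MISSING, re-running the corresponding `pip3 install` command in the activated environment is a reasonable first step.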
Stage1: SFT
conda activate arm_llama_factory
cd LLaMA-Factory
Make sure to specify the correct model path in the .yaml file.
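To double-check which model path a config will use before launching, a small helper can print it. This is a sketch under assumptions: it assumes the path is stored under the `model_name_or_path` key that LLaMA-Factory configs normally use, and `yaml_value` is a hypothetical helper that only handles flat, unquoted `key: value` lines without trailing comments.

```shell
# Print the value of a top-level "key: value" entry in a simple YAML file.
# Illustrative helper (hypothetical name), not part of the repository.
yaml_value() {
  # $1: key, $2: file
  sed -n "s/^$1:[[:space:]]*//p" "$2" | head -n 1
}

cfg=stage1_scripts/qwen2.5_7b/train.yaml
if [ -f "$cfg" ]; then
  # Assumes the model path lives under model_name_or_path.
  echo "model path: $(yaml_value model_name_or_path "$cfg")"
else
  echo "config not found: $cfg"
fi
```

The same helper can be pointed at merge.yaml to confirm that the merge step reads the intended checkpoint.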
Train
CUDA_VISIBLE_DEVICES=0,1,2,3 llamafactory-cli train stage1_scripts/qwen2.5_7b/train.yaml
Merge
llamafactory-cli export stage1_scripts/qwen2.5_7b/merge.yaml
Stage2: RL
conda activate arm_verl
cd verl
Make sure to specify the correct model and data paths in the .sh file.
Data Process
# The training data is located in arm/verl/data/parquet.
# Alternatively, you can prepare your own training data, e.g.:
python3 stage2_scripts/data_preprocess/gsm8k.py
# You can also prepare data for the instruction-guided mode used in evaluation, e.g.:
python3 stage2_scripts/data_preprocess/instruction_guided/gsm8k.py
Train
bash stage2_scripts/trainer/run.sh
Generate
# Adaptive Mode
bash stage2_scripts/generation/adaptive_run.sh
# Instruction-Guided Mode. Specify the reasoning format in the .sh file:
bash stage2_scripts/generation/instruction_guided_run.sh
Evaluate
bash stage2_scripts/evaluation/run.sh
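The Stage 2 steps above can also be chained so that a failure in one step stops the rest. The wrapper below is a sketch, not part of the repository: `run_step`, `run_stage2`, and the `DRY_RUN` variable are hypothetical names, and it assumes each .sh file has already been edited so that its model and data paths point at the outputs of the previous step.

```shell
# Run the Stage 2 steps in order, stopping at the first failure.
# Illustrative wrapper (hypothetical names), not part of the repository.
run_step() {
  # Echo the command, then execute it unless DRY_RUN=1 is set.
  echo "+ $*"
  if [ "${DRY_RUN:-0}" != "1" ]; then
    "$@"
  fi
}

run_stage2() {
  run_step bash stage2_scripts/trainer/run.sh &&
  run_step bash stage2_scripts/generation/adaptive_run.sh &&
  run_step bash stage2_scripts/evaluation/run.sh
}

# Print the planned commands without executing anything.
DRY_RUN=1 run_stage2
```

Calling `run_stage2` without `DRY_RUN=1` from the verl directory executes the three scripts for real; swapping in instruction_guided_run.sh gives the instruction-guided variant.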
🔍Roadmap
[Work in Progress] Stay tuned!
Contact
If you run into any problems, please contact Siye Wu and Jian Xie.
Citation Information
If our paper or related resources prove valuable to your research, we kindly ask for a citation.
@inproceedings{
wu2025arm,
title={{ARM}: Adaptive Reasoning Model},
author={Siye Wu and Jian Xie and Yikai Zhang and Aili Chen and Kai Zhang and Yu Su and Yanghua Xiao},
booktitle={The Thirty-ninth Annual Conference on Neural Information Processing Systems},
year={2025},
url={https://openreview.net/forum?id=z9oeQrcNh9}
}