ARM: Adaptive Reasoning Model

ARM (Adaptive Reasoning Model) is a reasoning model that adaptively selects an appropriate reasoning format for the task at hand.

Updates

2026/03/10: We further propose CODA: A difficulty-aware compute allocation method for adaptive reasoning, enabling models to spend fewer tokens on easy problems and more on hard ones.
2025/05/27: Thrilled to release ARM: A reasoning model capable of adaptively selecting reasoning formats based on the task, achieving a better trade-off between effectiveness and efficiency!

Data & Model

You can download our dataset and model from 🤗HuggingFace.

Environments

This repository contains the code for SFT and RL, built on LLaMA-Factory and VeRL respectively. Each stage uses its own conda environment:

# SFT
conda env create -f environment/llama_factory_env.yaml
conda activate arm_llama_factory

# RL
conda env create -f environment/verl_env.yaml
conda activate arm_verl
pip3 install --force-reinstall torch==2.4.0 --index-url https://download.pytorch.org/whl/cu124
pip3 install flash-attn --no-build-isolation

Stage1: SFT

conda activate arm_llama_factory
cd LLaMA-Factory

Make sure to specify the correct model path in the .yaml file.
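Before launching training, it can help to confirm what the config actually points to. The snippet below is a minimal sketch, not part of this repository: it assumes the config sets LLaMA-Factory's `model_name_or_path` key on a single line, and the helper name `yaml_model_path` is hypothetical.

```shell
# Hypothetical helper: print the value of `model_name_or_path` from a YAML file.
# Assumes the key sits on one line as `model_name_or_path: <value>`.
yaml_model_path() {
  sed -n 's/^[[:space:]]*model_name_or_path:[[:space:]]*//p' "$1" \
    | sed -e 's/[[:space:]][[:space:]]*#.*$//' -e 's/[[:space:]]*$//' \
    | tr -d "\"'" \
    | head -n 1
}

# Example: warn if the configured value is not a local directory.
cfg=stage1_scripts/qwen2.5_7b/train.yaml
if [ -f "$cfg" ]; then
  model_path=$(yaml_model_path "$cfg")
  [ -d "$model_path" ] || echo "Note: '$model_path' is not a local directory (it may be a Hub ID)." >&2
fi
```

The check only warns, because the value may be a Hugging Face Hub ID rather than a local path. The same helper can be pointed at merge.yaml, assuming it uses the same key.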

Train

CUDA_VISIBLE_DEVICES=0,1,2,3 llamafactory-cli train stage1_scripts/qwen2.5_7b/train.yaml

Merge

llamafactory-cli export stage1_scripts/qwen2.5_7b/merge.yaml

Stage2: RL

conda activate arm_verl
cd verl

Make sure to specify the correct model and data path in the .sh file.

Data Processing

# The training data is located in arm/verl/data/parquet.  
# Alternatively, you can prepare your own training data, e.g.:
python3 stage2_scripts/data_preprocess/gsm8k.py

# You can also prepare data for the instruction-guided mode used in evaluation, e.g.:
python3 stage2_scripts/data_preprocess/instruction_guided/gsm8k.py

Train

bash stage2_scripts/trainer/run.sh

Generate

# Adaptive Mode
bash stage2_scripts/generation/adaptive_run.sh

# Instruction-Guided Mode. Specify the reasoning format in the .sh file:
bash stage2_scripts/generation/instruction_guided_run.sh

Evaluate

bash stage2_scripts/evaluation/run.sh
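The Stage 2 steps above run in order (train, generate, evaluate), so a small wrapper that stops at the first failure can save a wasted run. This is a sketch under assumptions: the helper names `run_step` and `stage2_pipeline` are hypothetical, and it presumes the model and data paths in each .sh file have already been set so that each step picks up the previous step's output.

```shell
# Hypothetical helper: run one step, log start/end, and pass its exit status on.
run_step() {
  local name=$1; shift
  echo "[$(date +%H:%M:%S)] start: $name"
  if "$@"; then
    echo "[$(date +%H:%M:%S)] done:  $name"
  else
    local status=$?
    echo "[$(date +%H:%M:%S)] FAILED: $name (exit $status)" >&2
    return "$status"
  fi
}

# Run from the verl directory; `&&` stops the chain at the first failing step.
stage2_pipeline() {
  run_step "train"    bash stage2_scripts/trainer/run.sh &&
  run_step "generate" bash stage2_scripts/generation/adaptive_run.sh &&
  run_step "evaluate" bash stage2_scripts/evaluation/run.sh
}
```

Calling `stage2_pipeline` after `conda activate arm_verl` and `cd verl` runs the three scripts in sequence. Swap in instruction_guided_run.sh for the generation step to evaluate the instruction-guided mode.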

🔍Roadmap

[Work in Progress] Stay tuned!

Contact

If you have any questions or run into problems, please contact Siye Wu or Jian Xie.

Citation Information

If our paper or related resources prove valuable to your research, we kindly ask for a citation.

@inproceedings{
wu2025arm,
title={{ARM}: Adaptive Reasoning Model},
author={Siye Wu and Jian Xie and Yikai Zhang and Aili Chen and Kai Zhang and Yu Su and Yanghua Xiao},
booktitle={The Thirty-ninth Annual Conference on Neural Information Processing Systems},
year={2025},
url={https://openreview.net/forum?id=z9oeQrcNh9}
}