README.md

June 7, 2026

CLARO: Controlled Attribute-Driven Reasoning Optimization for Efficient Chain-of-Thought

How to Use

Installation

git clone git@github.com:odedsc/CLARO.git
cd CLARO
pyenv virtualenv 3.11.7 CLARO
pyenv activate CLARO
pip install -r requirements.txt
pip install flash-attn==2.7.4.post1
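
After installation, a quick sanity check can confirm that the interpreter and the pinned flash-attn version match the commands above. This is a minimal sketch and not part of the repository; the helper name `check_environment` is hypothetical, and it only inspects installed package metadata.

```python
import sys
from importlib import metadata

EXPECTED_PYTHON = (3, 11)            # from the pyenv step above (3.11.7)
EXPECTED_FLASH_ATTN = "2.7.4.post1"  # from the pip step above


def check_environment() -> list[str]:
    """Return a list of human-readable problems; an empty list means all checks passed."""
    problems = []
    if sys.version_info[:2] != EXPECTED_PYTHON:
        problems.append(
            f"Python {sys.version_info.major}.{sys.version_info.minor} found, "
            f"expected {EXPECTED_PYTHON[0]}.{EXPECTED_PYTHON[1]}"
        )
    try:
        installed = metadata.version("flash-attn")
    except metadata.PackageNotFoundError:
        problems.append("flash-attn is not installed")
    else:
        if installed != EXPECTED_FLASH_ATTN:
            problems.append(
                f"flash-attn {installed} found, expected {EXPECTED_FLASH_ATTN}"
            )
    return problems


if __name__ == "__main__":
    for problem in check_environment():
        print("WARNING:", problem)
```

An empty result only means the versions match; it does not verify that flash-attn was built against the local CUDA toolkit.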

Replicate Results

To replicate the CLARO results, use the scripts in scripts/replicate.

Prepare data:

./scripts/replicate/prepare_data.sh

Evaluate models:

./scripts/replicate/eval_model.sh odedsc/CLARO-1.5B
./scripts/replicate/eval_model.sh odedsc/CLARO-7B

Train Models

You can skip this step if you want to use our pre-trained models.

You can run scripts in scripts/train to train your own models. Make sure to specify the correct data path.

Evaluate Models

Use one of the scripts in scripts/eval to evaluate your models. Make sure to specify the correct model path.

For example, evaluate CLARO on the AIME2025 dataset:

./scripts/eval/eval_model.sh --model path/to/your/model --num-tokens <num_tokens> --datasets aime2025
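
To compare several token budgets, the command above can be wrapped in a small driver. The sketch below is an illustration, not part of the repository: it uses only the flags shown above (`--model`, `--num-tokens`, `--datasets`), the helper names are hypothetical, and the budget values in the usage line are arbitrary placeholders. It defaults to a dry run that prints each command instead of executing it.

```python
import subprocess

EVAL_SCRIPT = "./scripts/eval/eval_model.sh"  # path taken from the example above


def build_eval_command(model: str, num_tokens: int, dataset: str) -> list[str]:
    """Assemble one eval_model.sh invocation from the flags shown in this README."""
    return [
        EVAL_SCRIPT,
        "--model", model,
        "--num-tokens", str(num_tokens),
        "--datasets", dataset,
    ]


def run_sweep(model, budgets, dataset="aime2025", dry_run=True):
    """Run (or, by default, just print) one evaluation per token budget."""
    commands = [build_eval_command(model, budget, dataset) for budget in budgets]
    for cmd in commands:
        if dry_run:
            print(" ".join(cmd))
        else:
            # check=True stops the sweep at the first failing evaluation.
            subprocess.run(cmd, check=True)
    return commands


if __name__ == "__main__":
    # Hypothetical budgets; replace with the values you want to compare.
    run_sweep("path/to/your/model", budgets=[512, 1024, 2048])
```

Pass `dry_run=False` from the repository root to actually launch the evaluations.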

Prepare Your Own Dataset

You can use scripts in scripts/data to prepare your own dataset.

For CLARO:

python scripts/data/deepscaler_dataset.py --use_both_both

To prepare the evaluation data for AIME2025, GPQA, LSAT, and MMLU, use the corresponding scripts in scripts/data:

python scripts/data/generate_aime.py
python scripts/data/generate_gpqa.py
python scripts/data/generate_lsat.py
python scripts/data/generate_mmlu.py
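
The four generation scripts above can also be run in one pass. The following is a minimal sketch, assuming the scripts take no required arguments (as in the commands above) and are run from the repository root; the helper names are hypothetical.

```python
import subprocess
import sys
from pathlib import Path

# Script names copied from the commands above.
GENERATORS = [
    "generate_aime.py",
    "generate_gpqa.py",
    "generate_lsat.py",
    "generate_mmlu.py",
]


def generator_commands(script_dir: str = "scripts/data") -> list[list[str]]:
    """Build one command per generator, using the current Python interpreter."""
    return [[sys.executable, str(Path(script_dir) / name)] for name in GENERATORS]


def run_generators(dry_run: bool = True) -> None:
    """Run each generator in order, stopping at the first failure."""
    for cmd in generator_commands():
        if dry_run:
            print(" ".join(cmd))
        else:
            subprocess.run(cmd, check=True)


if __name__ == "__main__":
    run_generators()  # prints the commands; pass dry_run=False to execute them
```

Using `sys.executable` keeps the scripts inside the active virtualenv rather than whichever `python` happens to be first on the PATH.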

Models

We release the CLARO-optimized models via Hugging Face.

Model	Size	Link
CLARO-1.5B	1.5B	🤗 Hugging Face
CLARO-7B	7B	🤗 Hugging Face
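
For quick experiments outside the evaluation scripts, the released checkpoints can probably be loaded with the Hugging Face `transformers` library. The sketch below assumes the repository ids match the ones passed to the replication scripts above (`odedsc/CLARO-1.5B`, `odedsc/CLARO-7B`) and that the checkpoints are standard causal language models; neither is confirmed here. The helper names are hypothetical, and the sketch does not apply any prompt formatting or token-budget control that the evaluation scripts may use.

```python
# Repository ids as used in the replication commands; assumed to be Hub ids.
MODEL_IDS = {
    "1.5B": "odedsc/CLARO-1.5B",
    "7B": "odedsc/CLARO-7B",
}


def load_claro(size: str = "1.5B"):
    """Load the tokenizer and model for one size; requires transformers and torch."""
    # Imported lazily so this module can be inspected without the heavy dependencies.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    model_id = MODEL_IDS[size]
    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype="auto")
    return tokenizer, model


def generate(prompt: str, size: str = "1.5B", max_new_tokens: int = 512) -> str:
    """Generate a completion for a plain-text prompt (no chat template applied)."""
    tokenizer, model = load_claro(size)
    inputs = tokenizer(prompt, return_tensors="pt")
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

Calling `generate("What is 12 * 13?")` downloads the 1.5B checkpoint on first use, so expect a delay and several gigabytes of disk usage.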

Acknowledgments

We thank rLLM and L3 Lab for their codebases and for open-sourcing their models. This codebase builds on their work.