QFFT, Question-Free Fine-Tuning for Adaptive Reasoning

November 4, 2025 · View on GitHub

📃 Paper | 🤗 QFFT-7B | 🤗 QFFT-32B | 📚 QFFT Datasets


Paper

[2025/9/21] Our paper was accepted at NeurIPS 2025 as a Spotlight! We will release the revised paper and complete code soon!


⚡ Introduction

Welcome to the official repository for QFFT, Question-Free Fine-Tuning for Adaptive Reasoning!

QFFT is a novel, efficient fine-tuning method that equips large language models with adaptive reasoning ability. Instead of training on (Question, Reasoning) pairs as in traditional Supervised Fine-Tuning (SFT), QFFT discards the question input and learns solely from the reasoning response, especially Long CoT outputs.
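Concretely, the difference in training data can be sketched like this (the field names and the example pair below are illustrative, not the released data format):

```python
# Illustrative only: field names and the example pair are made up.
pair = {
    "question": "What is the capital of France?",
    "response": "<think>Recall geography: France's capital is Paris.</think>\n\nThe answer is Paris.",
}

# Traditional SFT trains on the concatenated (question, response) pair.
sft_text = pair["question"] + "\n" + pair["response"]

# QFFT discards the question and trains on the reasoning response alone.
qfft_text = pair["response"]
```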

QFFT enables models to:

  • Preserve Short CoT for simple tasks (efficiency)
  • Trigger Long CoT only when needed (effectiveness)
  • Reduce overthinking by minimizing unnecessary reasoning
  • Improve robustness in noisy, low-resource, and out-of-domain scenarios

We open-source our models, data, and code in this repository.


💭 Environment

Training Environment (LLaMA-Factory)

```bash
# Create training environment
conda create --name train python=3.10
conda activate train
cd LLaMA-Factory
pip install -e ".[torch,metrics]" --no-build-isolation
pip install datasets
pip install deepspeed
```

Evaluation Environment (vLLM)

```bash
# Create evaluation environment
conda create --name eval python=3.10
conda activate eval
pip install vllm bitsandbytes flashinfer-python==0.2.2.post1
pip install latex2sympy2 word2number
```

💻 Models

| Model Name | Base LLM | Link |
|---|---|---|
| QFFT-S1-7B | Qwen2.5-7B-Instruct | HF Link |
| QFFT-S1-32B | Qwen2.5-32B-Instruct | HF Link |
| QFFT-LIMO-7B | Qwen2.5-7B-Instruct | HF Link |
| QFFT-LIMO-32B | Qwen2.5-32B-Instruct | HF Link |

📚 Datasets

QFFT uses distilled responses from strong Long CoT models (e.g., DeepSeek-R1). During QFFT, the input questions are removed entirely.

| Dataset | Size | Link |
|---|---|---|
| S1.1 | 1k | HF Link |
| LIMO | 871 | HF Link |

🛠️ Training

Getting Started

⚠️ Important: Before training, please modify the paths in the YAML configuration files to match your local setup.

To train a model using QFFT, follow these steps:

```bash
# Navigate to the project directory and activate the training environment
cd /path/to/your/Question-Free-Fine-Tuning/
cd LLaMA-Factory

conda activate train

# Train on the S1 dataset
llamafactory-cli train examples/train_qfft/train_s1_qfft.yaml

# Train on the LIMO dataset
llamafactory-cli train examples/train_qfft/train_limo_qfft.yaml
```
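For orientation, such a config typically looks roughly like the sketch below. The key names follow LLaMA-Factory's standard SFT examples, but every value here is a placeholder; consult the actual files under examples/train_qfft/ for the real settings.

```yaml
# Placeholder sketch, NOT the shipped config: see examples/train_qfft/.
model_name_or_path: Qwen/Qwen2.5-7B-Instruct  # path to your base model
stage: sft
do_train: true
finetuning_type: full
template: qfft            # the question-free template added by this repo (assumed name)
dataset: s1_qfft          # placeholder dataset key
output_dir: saves/qfft-s1-7b
learning_rate: 1.0e-5
num_train_epochs: 3
bf16: true
```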

Our Modifications

This codebase is built on LLaMA-Factory.
Our key modification lies in the template system: we implement a new QFFT template in

`/src/llamafactory/data/template.py`

For details, please refer to line 1569 of that file.
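The idea behind the template change can be sketched as follows (a hypothetical illustration with placeholder special tokens and function names, not the repository's actual template code): where a conventional SFT template renders both a user turn and an assistant turn, a question-free template renders only the assistant turn.

```python
# Hypothetical sketch; special tokens and function names are placeholders,
# not the repository's actual implementation.

def standard_sft_template(question: str, response: str) -> str:
    # Conventional SFT: the training text conditions on the question.
    return f"<|user|>\n{question}\n<|assistant|>\n{response}"

def qfft_template(question: str, response: str) -> str:
    # QFFT: the question is dropped entirely; the model learns to
    # produce the reasoning response without seeing a question.
    return f"<|assistant|>\n{response}"

print(qfft_template("What is 2 + 2?", "The answer is 4."))
```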


🧪 Evaluation

⚠️ Important: Before evaluation, please modify the paths in the eval.sh script to match your local setup.

To evaluate QFFT models, follow these steps:

```bash
# Navigate to evaluation directory and activate evaluation environment
cd /path/to/your/Question-Free-Fine-Tuning
cd eval

conda activate eval

# Run evaluation script
bash eval.sh
```
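The evaluation dependencies (latex2sympy2, word2number) point to answer normalization for math benchmarks. As a flavor of what such scoring involves, here is a small self-contained helper (an illustrative assumption, not the actual logic in eval.sh) that extracts the final \boxed{...} answer from a model completion:

```python
import re

# Illustrative helper, not the repository's eval.sh logic: pull the final
# \boxed{...} answer out of a model completion before comparing it to the
# reference answer.

def extract_boxed(text):
    """Return the contents of the last \\boxed{...} in `text`,
    tolerating one level of nested braces; None if absent."""
    matches = re.findall(r"\\boxed\{([^{}]*(?:\{[^{}]*\}[^{}]*)*)\}", text)
    return matches[-1] if matches else None

print(extract_boxed("So the total is $\\boxed{42}$."))  # -> 42
```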

📊 Results

Here are the main results comparing SFT and QFFT on three mathematical reasoning benchmarks (GSM8K, MATH, and AIME25):

📌 7B Models (Qwen2.5-7B-Instruct)

| Dataset | Method | GSM8K Acc | GSM8K Tokens | MATH Acc | MATH Tokens | AIME25 Acc | AIME25 Tokens | Avg Acc | Avg Tokens |
|---|---|---|---|---|---|---|---|---|---|
| S1.1 | SFT | 90.6 | 1.7K | 80.8 | 5.3K | 18.2 | 17.7K | 63.2 | 8.2K |
| S1.1 | QFFT | 91.0 | 0.4K | 80.2 | 2.8K | 17.2 | 12.8K | 62.8 | 5.3K |
| S1.1 | Δ | +0.4 | -76.5% | -0.6 | -47.2% | -1.0 | -27.7% | -0.4 | -50.5% |
| LIMO | SFT | 88.2 | 1.8K | 80.4 | 5.8K | 16.8 | 17.1K | 61.8 | 8.2K |
| LIMO | QFFT | 88.0 | 0.7K | 80.6 | 4.1K | 17.2 | 15.6K | 61.9 | 6.8K |
| LIMO | Δ | -0.2 | -61.1% | +0.2 | -29.3% | +0.4 | -8.8% | +0.1 | -33.1% |

📌 32B Models (Qwen2.5-32B-Instruct)

| Dataset | Method | GSM8K Acc | GSM8K Tokens | MATH Acc | MATH Tokens | AIME25 Acc | AIME25 Tokens | Avg Acc | Avg Tokens |
|---|---|---|---|---|---|---|---|---|---|
| S1.1 | SFT | 92.8 | 2.1K | 93.1 | 4.1K | 48.6 | 16.2K | 78.2 | 7.5K |
| S1.1 | QFFT | 93.6 | 0.6K | 92.2 | 2.4K | 46.8 | 12.9K | 77.5 | 5.3K |
| S1.1 | Δ | +0.8 | -71.4% | -0.9 | -41.5% | -1.8 | -20.4% | -0.6 | -44.4% |
| LIMO | SFT | 91.2 | 1.9K | 93.0 | 3.9K | 45.8 | 13.2K | 76.6 | 6.3K |
| LIMO | QFFT | 92.6 | 0.8K | 92.6 | 2.9K | 45.0 | 12.5K | 76.7 | 5.4K |
| LIMO | Δ | +1.4 | -57.9% | -0.4 | -25.6% | -0.8 | -5.3% | +0.1 | -29.6% |
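A quick sanity check on the table arithmetic, using numbers copied from the 7B LIMO rows: each percentage in the Δ row is the per-benchmark token reduction, and the Avg Δ is the mean of the three per-benchmark reductions (not the reduction of the averaged token counts), matching the table up to rounding.

```python
# Token counts (in thousands) copied from the 7B LIMO rows of the table.
sft_tokens  = {"GSM8K": 1.8, "MATH": 5.8, "AIME25": 17.1}
qfft_tokens = {"GSM8K": 0.7, "MATH": 4.1, "AIME25": 15.6}

# Per-benchmark token reduction, as reported in the Delta row.
reductions = {
    k: 100 * (sft_tokens[k] - qfft_tokens[k]) / sft_tokens[k]
    for k in sft_tokens
}

# The Avg column's -33.1% is the mean of the three reductions.
avg_reduction = sum(reductions.values()) / len(reductions)

print({k: round(v, 1) for k, v in reductions.items()})  # -> {'GSM8K': 61.1, 'MATH': 29.3, 'AIME25': 8.8}
print(round(avg_reduction, 1))                          # -> 33.1
```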

📖 Citation

```bibtex
@misc{liu2025qfft,
  title={QFFT, Question-Free Fine-Tuning for Adaptive Reasoning},
  author={Wanlong Liu and Junxiao Xu and Fei Yu and Yukang Lin and Ke Ji and Wenyu Chen and Yan Xu and Yasheng Wang and Lifeng Shang and Benyou Wang},
  year={2025},
  eprint={2506.12860},
  archivePrefix={arXiv},
  primaryClass={cs.CL},
  url={https://arxiv.org/abs/2506.12860},
}
```