Token Signature: Predicting Chain-of-Thought Gains with Token Decoding Feature in Large Language Models
July 18, 2025
This repository contains the core implementation of our ICML 2025 paper:
"Token Signature: Predicting Chain-of-Thought Gains with Token Decoding Feature in Large Language Models."
## Overview
Our work introduces a novel method to predict Chain-of-Thought (CoT) reasoning gains using token-level decoding features from large language models (LLMs). This repository includes all code for inference, answer extraction, and evaluation used in the paper.
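To make the idea concrete, here is a minimal sketch of one plausible token-level decoding feature: the rank correlation between a token's position in the decoded sequence and its top-1 decoding probability, which captures whether decoding confidence trends up or down. The function name `token_signature` and the choice of feature are assumptions for illustration, not the paper's exact definition:

```python
import math

def _ranks(values):
    """Rank each value (0 = smallest); ties are ignored for simplicity."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    for rank, idx in enumerate(order):
        ranks[idx] = float(rank)
    return ranks

def token_signature(top1_probs):
    """Spearman rank correlation between decoding position and top-1 token
    probability -- one candidate 'token decoding feature' (assumption:
    the paper's actual feature may be defined differently)."""
    n = len(top1_probs)
    pos = list(range(n))
    pr = _ranks(top1_probs)
    mean_x, mean_y = sum(pos) / n, sum(pr) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(pos, pr))
    var_x = sum((x - mean_x) ** 2 for x in pos)
    var_y = sum((y - mean_y) ** 2 for y in pr)
    return cov / math.sqrt(var_x * var_y)
```

A value near +1 means top-1 probabilities rise over the generation, near -1 means they fall; a scalar like this can then be fed to a simple classifier that predicts whether CoT prompting will help on a given instance.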
## File Structure

### Core Inference

- `main.py`, `solve.py`, `task1.py`: Main scripts to run inference using LLMs.
- `extract_answer.py`: Extracts answers from model outputs via `vllm` and character-level matching.
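As a rough illustration of the character-level matching step, the sketch below scans a model's raw output for the last occurrence of a candidate choice letter. This is a hypothetical simplification, not the actual logic in `extract_answer.py`:

```python
import re

def extract_choice(model_output, choices=("A", "B", "C", "D")):
    """Toy character-level answer extraction: return the last standalone
    occurrence of a candidate choice letter in the model output.
    (Illustrative sketch only; extract_answer.py may use different rules.)"""
    pattern = r"\b(" + "|".join(map(re.escape, choices)) + r")\b"
    matches = re.findall(pattern, model_output)
    return matches[-1] if matches else None
```

In practice the last match is preferred because models often restate the final answer after intermediate reasoning.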
### Evaluation Scripts

- `cal_aggregated_sc.py`: Compute the aggregated score.
- `cal_instance_sc.py`: Compute per-instance scores.
- `cal_token_use.py`: Calculate token consumption.
- `cal_cot_gain.py`: Compute the Chain-of-Thought (CoT) gain.
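Under one common definition, the CoT gain is simply the accuracy difference between CoT prompting and direct answering, and the aggregated score is a mean over per-instance scores. The sketch below assumes those definitions; `cal_cot_gain.py` and `cal_aggregated_sc.py` may compute them differently:

```python
def cot_gain(acc_with_cot, acc_direct):
    """CoT gain as the accuracy delta between CoT prompting and direct
    answering (a common definition; assumed here for illustration)."""
    return acc_with_cot - acc_direct

def aggregated_score(per_instance_scores):
    """Aggregated score as the mean per-instance score (assumption)."""
    return sum(per_instance_scores) / len(per_instance_scores)
```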
### Execution Scripts

- `run_main_program.sh`: Run the full inference pipeline.
- `run_extract.sh`: Extract answers from model outputs.
- `run_cal.sh`: Run the evaluation scripts to compute scores and CoT gain.
### Directory Overview

- `benchmark/`: Question-answer pairs for the various benchmarks.
- `dynamic_cot/`: Key implementation of dynamic Chain-of-Thought prompting.
- `model transfer/`: Core code for the model transfer experiments.
## Citation
If you find this code useful for your research, please consider citing our paper:
```bibtex
@article{liu2025token,
  title={Token Signature: Predicting Chain-of-Thought Gains with Token Decoding Feature in Large Language Models},
  author={Liu, Peijie and Xu, Fengli and Li, Yong},
  journal={arXiv preprint arXiv:2506.06008},
  year={2025}
}
```