Value Residual Learning
March 20, 2025
This is the official repository for the paper Value Residual Learning. It contains instructions for running ResFormer and SVFormer, the two models introduced in that paper.
Requirement
pip install transformers==4.44.2
Data
- Download the tokenizer and place it in "data/tokenizer/RedPajama-INCITE-Base-7B".
- Follow the instructions in "src_data/README.md" to prepare "processed_slimpajama_20B", then place it in "data/".
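Before training, it can help to confirm that both data paths are in place. The following is a minimal sketch, assuming it is run from the repository root; the helper name `missing_data_paths` is hypothetical and not part of this repo, and the two paths are the ones named in the steps above.

```python
from pathlib import Path

# Paths taken from the Data steps above; the helper itself is hypothetical.
EXPECTED = (
    "data/tokenizer/RedPajama-INCITE-Base-7B",
    "data/processed_slimpajama_20B",
)


def missing_data_paths(repo_root="."):
    """Return the expected data paths that do not exist under repo_root."""
    root = Path(repo_root)
    return [p for p in EXPECTED if not (root / p).exists()]


if __name__ == "__main__":
    missing = missing_data_paths()
    if missing:
        print("Missing:", ", ".join(missing))
    else:
        print("Data layout looks complete.")
```

The check only tests that the paths exist; it does not validate the tokenizer files or the processed dataset contents.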
Analysis
The code for the entropy analysis is in "analyze/get_entropy.py", and the code for the token similarity analysis is in "analyze/get_simlarity.py".
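The analysis scripts are not reproduced here. As a rough illustration of the kind of quantities involved, the sketch below assumes that "entropy" means the Shannon entropy of a probability distribution (for example, one row of attention weights) and that "token similarity" means cosine similarity between two token vectors. Both readings are assumptions, and the function names are hypothetical; the repo's scripts may compute these differently.

```python
import math


def shannon_entropy(probs):
    """Shannon entropy in nats of a probability distribution (a sequence of floats)."""
    # Zero-probability entries contribute nothing, so skip them to avoid log(0).
    return -sum(p * math.log(p) for p in probs if p > 0)


def cosine_similarity(u, v):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)
```

Under this definition, a uniform distribution over n tokens has the maximum entropy log(n), and a distribution concentrated on a single token has entropy 0.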
Train
mkdir logs output
Set "CACHE" and "CODE_DIR" in each "*.sh" file you plan to run, then run bash scripts/run_llama_baseline_82M.sh and bash scripts/run_llama_resformer_82M.sh.
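The training steps can also be driven from a small Python launcher. This is a sketch only, assuming it is run from the repository root and that "CACHE" and "CODE_DIR" have already been set inside the scripts; the function names `prepare_runs` and `run_all` are hypothetical and not part of this repo.

```python
import subprocess
from pathlib import Path

# Script paths as named in the Train step above.
SCRIPTS = (
    "scripts/run_llama_baseline_82M.sh",
    "scripts/run_llama_resformer_82M.sh",
)


def prepare_runs(repo_root="."):
    """Create logs/ and output/ under repo_root and return the commands to run."""
    root = Path(repo_root)
    for name in ("logs", "output"):
        (root / name).mkdir(exist_ok=True)
    return [["bash", script] for script in SCRIPTS]


def run_all(repo_root="."):
    """Run the baseline and ResFormer scripts in order, stopping on the first failure."""
    for cmd in prepare_runs(repo_root):
        subprocess.run(cmd, cwd=repo_root, check=True)


if __name__ == "__main__":
    run_all()
```

Running the two scripts sequentially keeps the baseline and ResFormer runs from competing for the same devices; launch them separately if you have the hardware to run both at once.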
Relative Loss Analysis
Run analyze/plot_relative_loss.py.