QWHA: Quantization-Aware Walsh-Hadamard Adaptation for Parameter-Efficient Fine-Tuning on Large Language Models
September 17, 2025
This is the official implementation of QWHA (Quantization-Aware Walsh-Hadamard Adaptation for Parameter-Efficient Fine-Tuning).
Overview
QWHA is a parameter-efficient fine-tuning method designed for quantized large language models. It leverages Walsh-Hadamard transforms to adapt quantized models efficiently while achieving high fine-tuned accuracy.
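As background on the transform in the method's name: the Walsh-Hadamard transform is an orthogonal transform computable in O(n log n) with only additions and subtractions. A minimal pure-Python sketch of the fast Walsh-Hadamard transform (illustrative only, not code from this repository):

```python
def fwht(vec):
    """Fast Walsh-Hadamard transform (Sylvester ordering), O(n log n).

    Expects len(vec) to be a power of two. Applying the transform twice
    returns n * vec, since the Hadamard matrix H satisfies H @ H = n * I.
    """
    x = list(vec)
    n = len(x)
    h = 1
    while h < n:
        # Butterfly step: combine pairs that are h apart.
        for i in range(0, n, 2 * h):
            for j in range(i, i + h):
                a, b = x[j], x[j + h]
                x[j], x[j + h] = a + b, a - b
        h *= 2
    return x
```

For example, `fwht([1, 0, 0, 0])` yields `[1, 1, 1, 1]`, and applying `fwht` twice to a length-4 vector scales it by 4.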
Installation
First, install uv if not already installed:
curl -LsSf https://astral.sh/uv/install.sh | sh
Install dependencies:
uv sync
source .venv/bin/activate
uv pip install gptqmodel==2.2.0 --no-build-isolation
cd peft
uv pip install -e .
Initialization
Before proceeding with the initialization steps, set the QWHA_CACHE_PATH environment variable. This directory will store quantized models and QWHA-initialized checkpoints:
export QWHA_CACHE_PATH=/path/to/cache
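If the scripts below do not create the cache directory themselves (an assumption), it can be created up front. A small sketch, with a hypothetical fallback path for illustration; the `gptq_models` subdirectory matches the `--save` path used in the quantization step:

```python
import os

# Assumes QWHA_CACHE_PATH was exported as shown above; the fallback path
# here is only for illustration.
cache = os.environ.get("QWHA_CACHE_PATH", "/tmp/qwha_cache")

# Quantized models are saved under <cache>/gptq_models by the MagR command below.
os.makedirs(os.path.join(cache, "gptq_models"), exist_ok=True)
```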
Run MagR Quantization
First, run MagR quantization. The example below quantizes meta-llama/Llama-3.2-3B to 2 bits with per-group quantization (group size 64). A shell script is also provided in MagR/quant.sh:
MODEL_ID=meta-llama/Llama-3.2-3B
BITS=2
GROUPS=64
python MagR/llama.py $MODEL_ID --wbits $BITS --groupsize $GROUPS --magr --static-groups --save "${QWHA_CACHE_PATH}/gptq_models/$MODEL_ID-${BITS}bits-g${GROUPS}"
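To illustrate what per-group quantization with group size 64 means, here is a generic round-to-nearest sketch (not the MagR/GPTQ implementation): each row of a weight matrix is split into groups of 64 input channels, and each group gets its own scale and zero point.

```python
import numpy as np

def quantize_groupwise(w, bits, group_size):
    """Asymmetric round-to-nearest quantization with one (scale, zero-point)
    pair per group of `group_size` consecutive input channels."""
    qmax = 2 ** bits - 1
    d_out, d_in = w.shape
    g = w.reshape(d_out, d_in // group_size, group_size)
    lo = g.min(axis=-1, keepdims=True)
    hi = g.max(axis=-1, keepdims=True)
    scale = np.maximum(hi - lo, 1e-12) / qmax   # avoid divide-by-zero groups
    zero = np.round(-lo / scale)
    q = np.clip(np.round(g / scale) + zero, 0, qmax)
    return q, scale, zero

def dequantize_groupwise(q, scale, zero):
    """Map integer codes back to real weights: w_hat = scale * (q - zero)."""
    return (scale * (q - zero)).reshape(q.shape[0], -1)
```

With `bits=2` each weight is stored as one of 4 levels, and the per-group scales/zeros are the only full-precision metadata kept alongside the codes.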
Run QWHA Initialization
Next, run the QWHA initialization code. A shell script is also provided in src/init/init.sh:
MODEL_ID=meta-llama/Llama-3.2-3B
BITS=2
GROUPS=64
RANK=64
python src/init/initialize.py -m $MODEL_ID -q gptq -b $BITS -g $GROUPS -r $RANK --eval_ppl
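The `-r` flag sets the adapter rank. As a rough sense of why rank-64 adaptation is parameter-efficient, a back-of-envelope count (dimensions are illustrative; the exact shapes depend on which modules are adapted):

```python
# Trainable-parameter count for a rank-r adapter on a d_out x d_in linear
# layer, versus updating the full weight matrix. 3072 is Llama-3.2-3B's
# hidden size; other projections in the model have different shapes.
d_out = d_in = 3072
r = 64
full_params = d_out * d_in           # 9,437,184 frozen weights
adapter_params = r * (d_in + d_out)  # 393,216 trainable parameters
print(f"trainable fraction: {adapter_params / full_params:.4f}")
# prints: trainable fraction: 0.0417
```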
Fine-Tuning
Use the shell scripts in src/sft to fine-tune QWHA models on each dataset:
cd src/sft/
./train_gsm8k.sh # Train on GSM8K dataset
./train_alpaca.sh # Train on Alpaca dataset
Citation
If you use this code in your research, please cite our paper:
@article{qwha2025,
title={QWHA: Quantization-Aware Walsh-Hadamard Adaptation for Parameter-Efficient Fine-Tuning on Large Language Models},
author={Hyesung Jeon and Seojune Lee and BeomSeok Kang and Yulhwa Kim and Jae-Joon Kim},
journal={arXiv preprint},
year={2025}
}
License
This project is licensed under the terms specified in the LICENSE file.