QWHA: Quantization-Aware Walsh-Hadamard Adaptation for Parameter-Efficient Fine-Tuning on Large Language Models

September 17, 2025

This is the official implementation of QWHA (Quantization-Aware Walsh-Hadamard Adaptation for Parameter-Efficient Fine-Tuning).

Overview

QWHA is a parameter-efficient fine-tuning method designed for quantized large language models. It leverages Walsh-Hadamard transforms to adapt quantized models efficiently while achieving high fine-tuned accuracy.
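For intuition only, below is a minimal PyTorch sketch of a Walsh-Hadamard-domain adapter. It is not the implementation in this repository: the class name, shapes, and the simple low-rank parameterization are assumptions made for illustration, and the real code additionally handles quantized layers, initialization, and dimension padding that this sketch omits.

import torch

def fwht(x: torch.Tensor) -> torch.Tensor:
    # Fast Walsh-Hadamard transform along the last dimension.
    # Assumes the last dimension is a power of two.
    n = x.shape[-1]
    y = x.clone()
    h = 1
    while h < n:
        y = y.reshape(*x.shape[:-1], n // (2 * h), 2, h)
        a, b = y[..., 0, :], y[..., 1, :]
        y = torch.stack((a + b, a - b), dim=-2)
        h *= 2
    return y.reshape(*x.shape) / (n ** 0.5)  # orthonormal scaling

class WalshHadamardAdapter(torch.nn.Module):
    # Illustrative only: trainable coefficients live in the Walsh-Hadamard
    # domain and are mapped back to weight space as an additive update on
    # top of a frozen (e.g. quantized) base linear layer.
    def __init__(self, base_linear: torch.nn.Linear, rank: int):
        super().__init__()
        self.base = base_linear
        for p in self.base.parameters():
            p.requires_grad_(False)  # base layer stays frozen
        out_f, in_f = base_linear.out_features, base_linear.in_features
        # in_f is assumed to be a power of two here (pad in practice if not)
        self.coeff = torch.nn.Parameter(torch.zeros(rank, in_f))  # transform-domain parameters
        self.proj = torch.nn.Parameter(torch.zeros(out_f, rank))  # maps rank rows to the output dim

    def delta_weight(self) -> torch.Tensor:
        # (out_f, rank) @ (rank, in_f): weight update expressed in the WHT basis
        return self.proj @ fwht(self.coeff)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.base(x) + torch.nn.functional.linear(x, self.delta_weight())

# Example usage of the sketch (hypothetical sizes):
layer = torch.nn.Linear(1024, 1024, bias=False)
adapted = WalshHadamardAdapter(layer, rank=64)
out = adapted(torch.randn(2, 1024))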

Installation

First, install uv if not already installed:

curl -LsSf https://astral.sh/uv/install.sh | sh

Install dependencies:

uv sync
source .venv/bin/activate
uv pip install gptqmodel==2.2.0 --no-build-isolation
cd peft
uv pip install -e .

Initialization

Before proceeding with the initialization steps, set the QWHA_CACHE_PATH environment variable. This directory will store quantized models and QWHA-initialized checkpoints:

export QWHA_CACHE_PATH=/path/to/cache

Run MagR Quantization

First, run MagR quantization. Below is an example for meta-llama/Llama-3.2-3B with 2-bit per-group quantization and a group size of 64. A shell script is also provided in MagR/quant.sh:

MODEL_ID=meta-llama/Llama-3.2-3B
BITS=2
GROUPS=64
python MagR/llama.py $MODEL_ID --wbits $BITS --groupsize $GROUPS --magr --static-groups --save "${QWHA_CACHE_PATH}/gptq_models/$MODEL_ID-${BITS}bits-g${GROUPS}"

Run QWHA Initialization

Next, run the QWHA initialization code. A shell script is also provided in src/init/init.sh:

MODEL_ID=meta-llama/Llama-3.2-3B
BITS=2
GROUPS=64
RANK=64
python src/init/initialize.py -m $MODEL_ID -q gptq -b $BITS -g $GROUPS -r $RANK --eval_ppl

Fine-Tuning

Use the shell scripts in src/sft to fine-tune QWHA models on each dataset:

cd src/sft/
./train_gsm8k.sh    # Train on GSM8K dataset
./train_alpaca.sh   # Train on Alpaca dataset
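Once training completes, the resulting adapter can be loaded for inference. The snippet below is only a hedged sketch: it assumes the training scripts write a standard PEFT adapter directory and, for brevity, loads the adapter on top of the full-precision base model; the adapter path is a placeholder and must be replaced with your actual checkpoint location.

# Hedged sketch: the adapter path and the choice of base model are
# assumptions, not the exact workflow of this repository.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel  # the fork installed from ./peft

base_id = "meta-llama/Llama-3.2-3B"         # same base model used above
adapter_dir = "/path/to/finetuned/adapter"  # placeholder: output of train_*.sh

tokenizer = AutoTokenizer.from_pretrained(base_id)
base = AutoModelForCausalLM.from_pretrained(
    base_id, torch_dtype=torch.float16, device_map="auto"
)
model = PeftModel.from_pretrained(base, adapter_dir)
model.eval()

prompt = "Question: A box holds 12 pencils. How many pencils are in 7 boxes?\nAnswer:"
inputs = tokenizer(prompt, return_tensors="pt").to(next(model.parameters()).device)
with torch.no_grad():
    output = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(output[0], skip_special_tokens=True))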

Citation

If you use this code in your research, please cite our paper:

@article{qwha2025,
  title={QWHA: Quantization-Aware Walsh-Hadamard Adaptation for Parameter-Efficient Fine-Tuning on Large Language Models},
  author={Hyesung Jeon and Seojune Lee and BeomSeok Kang and Yulhwa Kim and Jae-Joon Kim},
  journal={arXiv preprint},
  year={2025}
}

License

This project is licensed under the terms specified in the LICENSE file.