Code for "DartQuant: Efficient Rotational Distribution Calibration for LLM Quantization"
November 11, 2025
1. Requirements:
- Python 3.10, PyTorch >= 2.0
- Install PyTorch with CUDA from https://pytorch.org/get-started/locally/; it is a prerequisite for the fast-hadamard-transform package.
- pip install -r requirement.txt
- Install fast-hadamard-transform:
    cd third-part
    git clone https://github.com/Dao-AILab/fast-hadamard-transform.git
    cd fast-hadamard-transform
    pip install .
- Install lm-eval:
    git clone https://github.com/EleutherAI/lm-evaluation-harness.git
    cd lm-evaluation-harness
    pip install -e .
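After installation, a quick sanity check of the environment can save a failed run later. The script below is a minimal sketch and is not part of this repository: the helper names `version_tuple` and `check_environment` are hypothetical, and it assumes the two packages are importable as `fast_hadamard_transform` and `lm_eval`.

```python
import importlib.util
import re
import sys


def version_tuple(version: str) -> tuple:
    """Turn a version string such as '2.1.0+cu118' into (2, 1, 0)."""
    # Drop the local build suffix after '+', then keep the first three numbers.
    numbers = re.findall(r"\d+", version.split("+")[0])
    return tuple(int(n) for n in numbers[:3])


def check_environment() -> list:
    """Return a list of problems found; an empty list means every check passed."""
    problems = []

    # Requirement from this README: python 3.10.
    if sys.version_info[:2] != (3, 10):
        problems.append(f"expected Python 3.10, found {sys.version.split()[0]}")

    # Requirement from this README: pytorch >= 2.0, installed with CUDA.
    if importlib.util.find_spec("torch") is None:
        problems.append("pytorch is not installed")
    else:
        import torch

        if version_tuple(torch.__version__) < (2, 0):
            problems.append(f"expected pytorch >= 2.0, found {torch.__version__}")
        if not torch.cuda.is_available():
            problems.append("pytorch cannot see a CUDA device")

    # Assumed import names for the two packages installed from source.
    for module in ("fast_hadamard_transform", "lm_eval"):
        if importlib.util.find_spec(module) is None:
            problems.append(f"{module} is not importable")

    return problems


if __name__ == "__main__":
    found = check_environment()
    for problem in found:
        print("WARNING:", problem)
    if not found:
        print("Environment looks consistent with the requirements.")
```

The checks only confirm that the packages can be located; they do not verify that the CUDA build of fast-hadamard-transform matches the installed PyTorch.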
Guidelines
- The ./fake_quant folder contains the code for fusing the calibrated rotation matrix into the model and running the quantization test. Usage is described in detail in the Readme.md file in that directory.
- The ./calibrater folder contains the code for obtaining the calibration set and the calibrated rotation matrix. Usage is described in the Readme.md file in that directory.
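To illustrate what "fusing" a rotation matrix means, here is a toy sketch in pure Python. It is not the code in ./fake_quant: it uses a fixed orthonormal Hadamard matrix as a stand-in for the calibrated rotation, and the names `hadamard`, `matmul`, `X`, `W`, and `R` are all assumptions made for illustration. The point is only that, for any orthogonal `R`, the product `(X R)(R^T W)` equals `X W`, so the rotation can be folded into the weights ahead of time without changing the layer output.

```python
import math


def hadamard(n: int) -> list:
    """Build an n x n orthonormal Hadamard matrix (n must be a power of two)."""
    if n < 1 or n & (n - 1):
        raise ValueError("n must be a power of two")
    h = [[1.0]]
    while len(h) < n:
        # Sylvester construction: [[H, H], [H, -H]]
        h = [row + row for row in h] + [row + [-v for v in row] for row in h]
    scale = 1.0 / math.sqrt(n)
    return [[v * scale for v in row] for row in h]


def matmul(a: list, b: list) -> list:
    """Plain nested-list matrix product a @ b."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(m: list) -> list:
    return [list(col) for col in zip(*m)]


# Toy activations X (2 tokens x 4 channels) with one large channel,
# and toy weights W (4 input channels x 3 output channels).
X = [[0.1, -0.2, 8.0, 0.3], [0.0, 0.4, -7.5, -0.1]]
W = [[0.5, -1.0, 0.2], [0.3, 0.8, -0.6], [-0.4, 0.1, 0.9], [0.7, -0.2, 0.05]]

R = hadamard(4)                    # stand-in for a calibrated rotation
X_rot = matmul(X, R)               # rotated activations
W_fused = matmul(transpose(R), W)  # rotation folded into the weights offline

reference = matmul(X, W)
rotated = matmul(X_rot, W_fused)
max_error = max(
    abs(a - b) for ra, rb in zip(reference, rotated) for a, b in zip(ra, rb)
)
peak_before = max(abs(v) for row in X for v in row)
peak_after = max(abs(v) for row in X_rot for v in row)

print(f"max |XW - (XR)(R^T W)| = {max_error:.2e}")
print(f"peak activation before rotation: {peak_before:.2f}, after: {peak_after:.2f}")
```

On this toy input the rotated activations have a smaller peak magnitude than the originals, which is the kind of distribution change that makes low-bit quantization easier. How the rotation is actually calibrated and fused is described in the Readme.md files of the two folders above.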
NPU
- The ./NPU_DartQuant folder contains the NPU runtime code; its usage is essentially the same as that of the GPU version.