Distilling Long-tailed Datasets
February 26, 2025
Paper: arXiv:2408.14506
Existing dataset distillation (DD) methods degrade on imbalanced datasets, and the degradation grows as the imbalance factor increases. Our method performs significantly better across different imbalanced scenarios.
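
To make the imbalance factor concrete, the sketch below computes per-class sample counts for a long-tailed split using the exponential-decay profile that is common in long-tailed CIFAR benchmarks. It is an illustration under stated assumptions, not code from this repository: the function name `long_tailed_counts` is hypothetical, the starting point of 5,000 images per class is the balanced CIFAR-10 training set, and reading `imbrate_0005` in the config paths as a tail-to-head ratio of 0.005 is a guess.

```python
from typing import List


def long_tailed_counts(n_max: int, num_classes: int, imb_ratio: float) -> List[int]:
    """Per-class counts decaying exponentially from n_max down to n_max * imb_ratio.

    Hypothetical helper for illustration; the repository may build its splits differently.
    """
    if num_classes < 2:
        raise ValueError("need at least two classes")
    if not 0 < imb_ratio <= 1:
        raise ValueError("imb_ratio must be in (0, 1]")
    return [
        round(n_max * imb_ratio ** (i / (num_classes - 1)))
        for i in range(num_classes)
    ]


# Assumed setting: CIFAR-10 (5,000 images per class) with a tail-to-head ratio of 0.005.
counts = long_tailed_counts(n_max=5000, num_classes=10, imb_ratio=0.005)
print(counts)
```

Under these assumptions the head class keeps all 5,000 images while the rarest class keeps only 25, which is the regime where balanced-data distillation methods tend to struggle.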

Getting Started
- Create the environment
conda env create -f environment.yaml
conda activate distillation
- Generate expert trajectories
cd buffer
# representation experts
python buffer_FTD.py --cfg ../configs/buffer/CIFAR10_LT/imbrate_0005/first_stage_weight_balance.yaml
# classifier experts
python buffer_FTD.py --cfg ../configs/buffer/CIFAR10_LT/imbrate_0005/second_stage_weight_balance.yaml
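- Perform the distillation (replace xxxx with the config file for your setting)
# run from the repository root (cd .. first if you are still in buffer)
cd distill
python DAM-ED_tesla.py --cfg ../configs/xxxx.yaml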
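
The steps above can be chained in a small driver script, sketched here as a dry run that only prints each command with the directory it would run from. This is not part of the repository: `STEPS` and `run_pipeline` are hypothetical names, the layout (with `buffer`, `distill`, and `configs` as siblings under the repository root) is inferred from the `../configs` paths, and `xxxx.yaml` is the README's own placeholder, left unfilled.

```python
import subprocess
from pathlib import Path

# (directory relative to the repo root, command) pairs copied from the steps above.
STEPS = [
    # representation experts
    ("buffer", ["python", "buffer_FTD.py", "--cfg",
                "../configs/buffer/CIFAR10_LT/imbrate_0005/first_stage_weight_balance.yaml"]),
    # classifier experts
    ("buffer", ["python", "buffer_FTD.py", "--cfg",
                "../configs/buffer/CIFAR10_LT/imbrate_0005/second_stage_weight_balance.yaml"]),
    # distillation; xxxx is a placeholder to be replaced with a real config name
    ("distill", ["python", "DAM-ED_tesla.py", "--cfg", "../configs/xxxx.yaml"]),
]


def run_pipeline(repo_root: str = ".", dry_run: bool = True) -> list:
    """Print (dry run) or execute each step from its own directory, in order."""
    planned = []
    for subdir, cmd in STEPS:
        cwd = Path(repo_root) / subdir
        planned.append((str(cwd), " ".join(cmd)))
        if dry_run:
            print(f"[{cwd}] {' '.join(cmd)}")
        else:
            # check=True stops the pipeline at the first failing step.
            subprocess.run(cmd, cwd=cwd, check=True)
    return planned


plan = run_pipeline(dry_run=True)
```

Setting `cwd` per step avoids depending on the shell's current directory between steps. Call `run_pipeline(dry_run=False)` only after activating the `distillation` environment and replacing the placeholder config.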
Acknowledgement
Our code is built upon MTT, FTD, and DATM.
Citation
If you find our code useful for your research, please cite our paper.
@article{zhao2024distilling,
title={Distilling Long-tailed Datasets},
author={Zhao, Zhenghao and Wang, Haoxuan and Shang, Yuzhang and Wang, Kai and Yan, Yan},
journal={arXiv preprint arXiv:2408.14506},
year={2024}
}