Hyperbolic Dataset Distillation (NeurIPS 2025)
March 20, 2026
The first dataset distillation framework that leverages hyperbolic geometry to preserve hierarchical data structures. The repository is based on geoopt, DM, and IDM. If you use this code, please consider citing them.
- Dec 2025: Code was released.
- Oct 2025: Project page was released.
- Sept 2025: Our paper has been accepted to NeurIPS 2025!
- May 2025: Preprint was released.
🎯 Key Contributions
- Hyperbolic Geometry for Dataset Distillation: Introduces the first dataset distillation framework in hyperbolic space, enabling natural modeling of hierarchical data structures.
- Centroid Alignment via Geodesic Distance: Minimizes the Lorentzian hyperbolic distance between Fréchet means of real and synthetic datasets, ensuring semantic and geometric consistency.
- Hierarchical Sample Weighting: Derives analytic weighting that prioritizes prototype-like (root-level) samples while attenuating noisy, peripheral ones.
- Efficient Hierarchical Pruning: Demonstrates that only 20% of the original data is sufficient to maintain comparable accuracy, highlighting hyperbolic redundancy reduction.
- Broad Compatibility & Superior Results: Integrates smoothly with DM and IDM, yielding consistent gains in classification accuracy and training stability across diverse datasets.
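To make the centroid-alignment idea concrete, here is a minimal, dependency-free sketch of a geodesic distance between two centroids in the Lorentz (hyperboloid) model. It is an illustration, not the code in this repository. It assumes curvature -1, lifts Euclidean feature vectors with the exponential map at the origin, and uses the normalized weighted sum as a closed-form stand-in for the Fréchet mean (that formula minimizes the squared Lorentzian distance, not the geodesic one). All function names are hypothetical.

```python
import math


def lorentz_inner(x, y):
    """Lorentzian inner product: -x0*y0 + sum_i xi*yi."""
    return -x[0] * y[0] + sum(a * b for a, b in zip(x[1:], y[1:]))


def exp_map_origin(v):
    """Lift a Euclidean vector v onto the unit hyperboloid (curvature -1)."""
    norm = math.sqrt(sum(a * a for a in v))
    if norm < 1e-12:
        return [1.0] + [0.0] * len(v)
    scale = math.sinh(norm) / norm
    return [math.cosh(norm)] + [scale * a for a in v]


def lorentz_distance(x, y):
    """Geodesic distance arccosh(-<x, y>_L); clamped for numerical safety."""
    return math.acosh(max(1.0, -lorentz_inner(x, y)))


def lorentz_centroid(points, weights=None):
    """Normalized weighted sum of hyperboloid points (weights must be positive).

    Assumption: used here as a closed-form approximation of the Fréchet mean.
    """
    if weights is None:
        weights = [1.0] * len(points)
    dim = len(points[0])
    s = [sum(w * p[i] for w, p in zip(weights, points)) for i in range(dim)]
    denom = math.sqrt(-lorentz_inner(s, s))
    return [a / denom for a in s]


# Toy "real" and "synthetic" features (made-up values, for illustration only).
real = [exp_map_origin(v) for v in [[0.1, 0.2], [0.3, -0.1], [-0.2, 0.4]]]
synthetic = [exp_map_origin(v) for v in [[0.0, 0.1], [0.2, 0.2]]]

# Centroid-alignment loss: distance between the two class centroids.
alignment_loss = lorentz_distance(lorentz_centroid(real), lorentz_centroid(synthetic))
```

In this sketch, per-sample weights passed to `lorentz_centroid` are the natural place for a sample-weighting scheme to enter. The analytic weights derived in the paper are not reproduced here.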
Usage
DM with HDD
cd DM-HDD
python main_DM.py --dataset CIFAR10 --model ConvNet --ipc 10 --dsa_strategy color_crop_cutout_flip_scale_rotate --init real --lr_img 1 --num_exp 5 --num_eval 5
IDM with HDD
cd IDM-HDD
python3 -u IDM_cifar10.py --dataset CIFAR10 --model ConvNet --ipc 10 --dsa_strategy color_crop_cutout_flip_scale_rotate --init real --lr_img 1 --num_exp 5 --num_eval 5 --net_train_real --eval_interval 100 --outer_loop 1 --mismatch_lambda 0 --net_decay --embed_last 1000 --syn_ce --ce_weight 0.5 --train_net_num 1 --aug
Citing HDD
If you find this project useful for your research, please use the following BibTeX entry.
@inproceedings{li2025hdd,
title={Hyperbolic Dataset Distillation},
author={Li, Wenyuan and Li, Guang and Maeda, Keisuke and Ogawa, Takahiro and Haseyama, Miki},
booktitle={Proceedings of the Advances in Neural Information Processing Systems (NeurIPS)},
year={2025}
}