July 1, 2025
Zheng Zhou¹, Wenquan Feng¹, Qiaosheng Zhang²˒³, Shuchang Lyu¹, Qi Zhao¹, Guangliang Cheng⁴
¹ Beihang University
² Shanghai Artificial Intelligence Laboratory
³ Shanghai Innovation Institute
⁴ University of Liverpool
International Conference on Machine Learning (ICML), 2025
This repository contains the code and implementation details for the research paper titled "ROME is Forged in Adversity: Robust Distilled Datasets via Information Bottleneck".
🎯 Overview of ROME
Abstract: Dataset Distillation (DD) compresses large datasets into smaller, synthetic subsets, enabling models trained on them to achieve performance comparable to those trained on the full data. However, these models remain vulnerable to adversarial attacks, limiting their use in safety-critical applications. While adversarial robustness has been widely studied in related fields, research on improving DD robustness is still limited. To address this, we propose ROME, a novel method that enhances the adversarial RObustness of DD by leveraging the InforMation BottlenEck (IB) principle. ROME includes two components: a performance-aligned term to preserve accuracy and a robustness-aligned term to improve robustness by aligning feature distributions between synthetic and perturbed images. Furthermore, we introduce the Improved Robustness Ratio (I-RR), a refined metric to better evaluate DD robustness. Extensive experiments on CIFAR-10 and CIFAR-100 datasets demonstrate that ROME outperforms existing DD methods in adversarial robustness, achieving maximum I-RR improvements of nearly 40% under white-box attacks and nearly 35% under black-box attacks.
🔥 Key Features and Contributions
- Theoretical Framework: Introduces the Information Bottleneck (IB) principle into dataset distillation, leveraging the Conditional Entropy Bottleneck (CEB) to incorporate adversarial robustness as a prior.
- Algorithm Design: Proposes performance-aligned and robustness-aligned terms to balance model accuracy and adversarial robustness, enhanced by robust priors from pretrained models.
- Evaluation and Validation: Introduces the Improved Robustness Ratio (I-RR) metric and reports maximum I-RR improvements of nearly 40% under white-box attacks and nearly 35% under black-box attacks on CIFAR-10 and CIFAR-100.
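For orientation, the two objectives named in the first bullet are commonly written as shown here. This is a sketch of the standard textbook formulations, not ROME's actual loss: the symbols $X$ (input), $Y$ (label), $Z$ (learned representation), and the trade-off weights $\beta, \gamma > 0$ are assumptions chosen for illustration and may differ from the paper's notation.

```latex
% Information Bottleneck: keep Z predictive of Y while compressing X
\max_{Z} \; I(Z;Y) - \beta \, I(Z;X)

% Conditional Entropy Bottleneck: penalise only the information about X
% that Z carries beyond what the label Y already explains
\min_{Z} \; \gamma \, I(X;Z \mid Y) - I(Y;Z)

% Under the Markov chain Z <- X <-> Y the two compression terms are related by
I(X;Z \mid Y) = I(X;Z) - I(Y;Z)
```

The conditional term is what lets CEB discard input information that is irrelevant to the label, which is the property the robustness prior builds on.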
📈 Experimental Results
We evaluate and compare the adversarial robustness of ROME and other DD methods against both white-box and black-box attacks, under both targeted and untargeted settings.
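The targeted and untargeted settings differ in what counts as a successful attack. The sketch below illustrates that distinction with a generic attack success rate. It is not the I-RR metric and not BEARD's evaluation code; the function name `attack_success_rate` and the toy labels are hypothetical.

```python
from typing import Optional, Sequence


def attack_success_rate(
    true_labels: Sequence[int],
    adv_predictions: Sequence[int],
    target_labels: Optional[Sequence[int]] = None,
) -> float:
    """Fraction of adversarial examples that count as successful attacks.

    Untargeted (target_labels is None): the attack succeeds when the
    prediction on the perturbed input differs from the true label.
    Targeted: the attack succeeds when the prediction equals the
    attacker's chosen target label.
    """
    if len(true_labels) != len(adv_predictions):
        raise ValueError("true_labels and adv_predictions must have equal length")
    if len(true_labels) == 0:
        raise ValueError("at least one example is required")

    if target_labels is None:
        hits = sum(p != y for p, y in zip(adv_predictions, true_labels))
    else:
        if len(target_labels) != len(adv_predictions):
            raise ValueError("target_labels and adv_predictions must have equal length")
        hits = sum(p == t for p, t in zip(adv_predictions, target_labels))
    return hits / len(true_labels)


# Hypothetical toy data, for illustration only.
labels = [0, 1, 2, 3]
preds_after_attack = [0, 2, 2, 1]
targets = [1, 2, 0, 0]

untargeted_rate = attack_success_rate(labels, preds_after_attack)
targeted_rate = attack_success_rate(labels, preds_after_attack, targets)
print(untargeted_rate, targeted_rate)
```

A lower success rate under the same attack budget indicates a more robust model; the metrics reported for ROME are defined in the paper.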

🛠 Getting Started
Follow these steps to set up the environment and run the code.
Step 1: Clone the Repository
- Run the following command to download the repository:
git clone https://github.com/zhouzhengqd/ROME.git
Step 2: Download Datasets
- Download the CIFAR-10/100 datasets from the official source, or use the shared download link provided by BEARD for quicker access. Place them in the relevant directory.
Step 3: Set Up the Conda Environment
- Run the following commands to create and activate the conda environment:
cd ROME
cd Code
conda env create -f environment.yml
conda activate rome
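Once the environment is activated, a quick sanity check can confirm that Python is running inside it. This is a minimal sketch that relies on the `CONDA_DEFAULT_ENV` variable set by `conda activate`; the helper names are hypothetical and not part of the repository.

```python
import os
from typing import Mapping, Optional

EXPECTED_ENV = "rome"  # environment name used by `conda activate rome`


def active_conda_env(environ: Optional[Mapping[str, str]] = None) -> Optional[str]:
    """Return the name of the active conda environment, or None if unset."""
    env = os.environ if environ is None else environ
    return env.get("CONDA_DEFAULT_ENV")


def is_rome_env_active(environ: Optional[Mapping[str, str]] = None) -> bool:
    """True when the active conda environment is the expected one."""
    return active_conda_env(environ) == EXPECTED_ENV


print("rome environment active:", is_rome_env_active())
```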
📁 Directory Structure
- ROME
  - Code
    - data
      - datasets
      - checkpoints
    - result
    - Files for ROME
    - command.txt
    - enviroment.yml
    - ...
  - ...
- ...
🌟 Commands for Reproducing Experimental Results and Evaluation
Training the Distilled Datasets
Use the training commands listed in command.txt. For example, to train ROME on CIFAR-10 with 50 images per class (IPC = 50), run:
python3 -u ROME_cifar10.py --dataset CIFAR10 --model ConvNet --ipc 50 --dsa_strategy color_crop_cutout_flip_scale_rotate --init real --lr_img 0.2 --num_exp 5 --num_eval 5 --net_train_real --eval_interval 500 --outer_loop 1 --mismatch_lambda 0 --net_decay --embed_last 1000 --syn_ce --ce_weight 0.1 --train_net_num 1 --aug
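When running several configurations, it can help to assemble the command programmatically. The sketch below rebuilds the exact command above as an argument list; the helper `build_train_command` is hypothetical, and only `--ipc` is parameterised. Whether other IPC values need different hyperparameters is not stated here, so treat command.txt as the authoritative source.

```python
import shlex
from typing import List


def build_train_command(ipc: int = 50) -> List[str]:
    """Return the CIFAR-10 training command from this README as an argv list.

    All flags and values are copied from the README example; only the
    images-per-class value is configurable (an assumption of this sketch).
    """
    return [
        "python3", "-u", "ROME_cifar10.py",
        "--dataset", "CIFAR10",
        "--model", "ConvNet",
        "--ipc", str(ipc),
        "--dsa_strategy", "color_crop_cutout_flip_scale_rotate",
        "--init", "real",
        "--lr_img", "0.2",
        "--num_exp", "5",
        "--num_eval", "5",
        "--net_train_real",
        "--eval_interval", "500",
        "--outer_loop", "1",
        "--mismatch_lambda", "0",
        "--net_decay",
        "--embed_last", "1000",
        "--syn_ce",
        "--ce_weight", "0.1",
        "--train_net_num", "1",
        "--aug",
    ]


command = build_train_command()
# Print the shell-ready form; to launch it, pass `command` to
# subprocess.run(command, check=True) from inside the Code directory.
print(shlex.join(command))
```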
Evaluating the Distilled Datasets
Follow the BEARD benchmark configuration:
- Step 1: Download the BEARD repository.
- Step 2: Download the distilled datasets and models, and follow the BEARD instructions for quick evaluation.
- Step 3: Replace the provided distilled datasets with the ones produced by your own training run.
🙏 Acknowledgments
We would like to thank the contributors of the following projects that inspired and supported this work: DC, DSA, DM, MTT, IDM, BACON, and BEARD.
🎓 Citation
@inproceedings{zhou2025rome,
  title     = {ROME is Forged in Adversity: Robust Distilled Datasets via Information Bottleneck},
  author    = {Zhou, Zheng and Feng, Wenquan and Zhang, Qiaosheng and Lyu, Shuchang and Zhao, Qi and Cheng, Guangliang},
  booktitle = {International Conference on Machine Learning (ICML)},
  year      = {2025}
}