🌟 Breaking Class Barriers: Efficient Dataset Distillation via Inter-Class Feature Compensator

April 6, 2025 Β· View on GitHub

πŸ”₯ ICLR 2025 Poster πŸ”₯

Breaking Class Barriers: Efficient Dataset Distillation via Inter-Class Feature Compensator.
Xin Zhang, Jiawei Du, Ping Liu, Joey Tianyi Zhou
Agency for Science, Technology, and Research (ASTAR), Singapore
University of Nevada, Reno

πŸ“– Introduction

Pre-trained Models

Left: Overview of dataset distillation paradigms. The first illustrates the traditional ``one instance for one class'' approach, where each instance is optimized exclusively for its pre-assigned label, creating implicit class barriers. The second illustrates our INFER method, designed for ``one instance for ALL classes'' distillation. Right: t-SNE visualization of the decision boundaries between the traditional approaches (i.e., SRe2L) and our INFER approach. We randomly select seven classes from CIFAR-100 dataset for the visualization. INFER forms thin and clear decision boundaries among classes, in contrast to the chaotic decision boundaries of the traditional approach.


βš™οΈ Installation

To get started, follow these instructions to set up the environment and install dependencies.

  1. Clone this repository:

    git clone https://github.com/zhangxin-xd/UFC.git
    cd UFC
    
  2. Install required packages: You don’t need to create a new environment; simply ensure that you have compatible versions of CUDA and PyTorch installed.


πŸš€ Usage

Here’s how to use this code for UFC generation and validation:

  • Preparation: For ImageNet-1K, we use the pre-trained weights available in torchvision. For CIFAR and Tiny-ImageNet, we provide the trained weights at this link. Alternatively, you can train the models yourself by following the instructions in Diversity-Driven-Synthesis.

  • Generation: Before performing distillation, please first prepare the images by randomly sampling from the original dataset and saving them as tensors. We provide the tensor-formatted initialization images at this link .

    Cifar:

    python ufc_generation/ufc_cifar.py \
        --iteration 1000 --r-bn 1 --batch-size 100 \
        --lr 0.25 --ipc 10 \
        --exp-name generated_results \
        --wandb-name cifar100-ipc10 \
        --store-best-images \
        --syn-data-path syn/ \
        --init_path init_images/c100/ \
        --dataset cifar100
    

    ImageNet-1K:

    python ufc_generation/ufc_imgnet.py \
        --iteration 2000 --r-bn 0.01 --batch-size 1000 \
        --lr 0.25 --ipc 10 \
        --exp-name generated_results \
        --wandb-name imagenet-ipc10 \
        --store-best-images \
        --syn-data-path syn/ \
        --init_path init_images/imagenet/ \
        --dataset imagenet
    
  • Evaluation:

    validation with static labeling

    python ufc_validation/val_static.py \
        --epochs 400 --batch-size 64 --ipc 10 \
        --syn-data-path syn/cifar100-ipc10/generated_results \
        --output-dir syn/cifar100-ipc10/generated_results \
        --wandb-name cifar100-ipc10 \
        --dataset cifar100 --networks resnet18
    

    validation with dynamic labeling

    Note: the number of training epochs is reduced by a factor of 1/(M + 1).

      python ufc_validation/val_dyn.py \
          --epochs 80 --batch-size 64 --ipc 10 \
          --syn-data-path syn/cifar100-ipc10/generated_results\
          --output-dir syn/cifar100-ipc10 \
          --wandb-name cifar100-ipc10 \
          --dataset cifar100 --networks resnet18
    

we also provide the .sh script in the sh directory.


πŸ“Š Results

Our experiments demonstrate the effectiveness of the proposed approach across various benchmarks.

Results 1 Results 2

For detailed experimental results and further analysis, please refer to the full paper.


πŸ“‘ Citation

If you find this code useful in your research, please consider citing our work:

@inproceedings{ufc2025iclr,
    title={Breaking Class Barriers: Efficient Dataset Distillation via Inter-Class Feature Compensator},
    author={Zhang, Xin and Du, Jiawei and Liu, Ping and Zhou, Joey Tianyi},
    booktitle={Proc. Int. Conf. Learn. Represent. (ICLR)},
    year={2025}
}

πŸŽ‰ Reference

Our code has referred to previous work:

Squeeze, Recover and Relabel: Dataset Condensation at ImageNet Scale From A New Perspective