README.md

February 11, 2026

Welcome to the official PyTorch implementation of BEARD: Benchmarking the Adversarial Robustness for Dataset Distillation.

BEARD is an open-source benchmark specifically designed to evaluate and improve the adversarial robustness of Dataset Distillation (DD) methods. It provides a comprehensive assessment across three key stages: distillation, training, and evaluation.

🔹 Explore the official leaderboard here: BEARD Leaderboard

❗Note❗: If you encounter any issues, please feel free to contact us via email.

🚀 What's New?

  • Feb. 2026: Released an automated metric computation toolkit for RR, AE, and CREI under Code/metrics/. See Code/metrics/README.md for detailed instructions.
  • Mar. 2025: We have updated our attack library with transfer-based and query-based black-box attacks along with their evaluation files.
  • Sep. 2024: The full BEARD codebase is now open-source! 🎉 Access it here: BEARD GitHub Repository.
  • Aug. 2024: The first full release of the BEARD benchmark project.

🎯 Overview of BEARD

[Figure: BEARD pipeline]

BEARD addresses a critical gap in dataset distillation research by providing a systematic framework for evaluating adversarial robustness. While significant progress has been made in DD, deep learning models trained on distilled datasets remain vulnerable to adversarial attacks, posing risks in real-world applications.

🔥 Key Features:

  • Unified Benchmark: Evaluate DD methods across multiple datasets and attack scenarios.
  • New Evaluation Metrics: Includes the Robustness Ratio (RR), Attack Efficiency Ratio (AE), and Comprehensive Robustness-Efficiency Index (CREI).
  • Open-Source Tools: Easily integrate and evaluate the robustness of your DD methods with BEARD's extensible framework.

🛠 Getting Started

Follow the steps below to set up the environment and run the BEARD benchmark.

Step 1: Clone the Repository

  • Run the following command to clone the repository:
    git clone https://github.com/zhouzhengqd/BEARD.git
    

Step 2: Download Dataset and Model Pools

Step 3: Set Up the Conda Environment

  • Run the following commands to create and activate the conda environment:
    cd BEARD
    cd Code
    conda env create -f environment.yml
    conda activate beard
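
After activating the environment, a quick sanity check can confirm that PyTorch is importable before running any BEARD script. The snippet below is a minimal sketch and not part of the BEARD codebase; the function name `environment_report` is illustrative, and it only assumes that `environment.yml` installs PyTorch, as the project description implies.

```python
import importlib.util
import sys


def environment_report() -> dict:
    """Collect basic facts about the active Python environment."""
    report = {
        "python": sys.version.split()[0],
        # find_spec returns None when the package is not installed.
        "torch_installed": importlib.util.find_spec("torch") is not None,
    }
    if report["torch_installed"]:
        # Imported lazily so the check still works without PyTorch.
        import torch

        report["torch"] = torch.__version__
        report["cuda_available"] = torch.cuda.is_available()
    return report


print(environment_report())
```

If `torch_installed` is reported as `False`, the `beard` environment is probably not the active one.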
    

📁 Directory Structure

  • BEARD
    • Code
      • data
        • datasets
      • dataset_pool
      • model_pool
      • metrics
      • evaluate_model.py
      • train_model.py
      • evaluate_model_blackbox.py
      • evaluate_config.json
      • train_config.json
      • evaluate_config_blackbox.json
      • Files for BEARD
      • environment.yml
      • ...
      • ...
      • ...
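
To verify that a local checkout matches the layout listed above, a small check along these lines may help. It is a hedged sketch: the entry names are copied from the listing, while the helper `missing_entries` and the two constants are illustrative and do not ship with BEARD. Entries elided as `...` in the listing are not checked.

```python
from pathlib import Path

# Entries taken from the directory listing, relative to BEARD/Code.
EXPECTED_DIRS = ["data/datasets", "dataset_pool", "model_pool", "metrics"]
EXPECTED_FILES = [
    "evaluate_model.py",
    "train_model.py",
    "evaluate_model_blackbox.py",
    "evaluate_config.json",
    "train_config.json",
    "evaluate_config_blackbox.json",
    "environment.yml",
]


def missing_entries(code_root):
    """Return the expected directories and files absent under code_root."""
    root = Path(code_root)
    missing = [d for d in EXPECTED_DIRS if not (root / d).is_dir()]
    missing += [f for f in EXPECTED_FILES if not (root / f).is_file()]
    return missing
```

Calling `missing_entries("BEARD/Code")` returns an empty list when everything is in place; otherwise it names what still needs to be downloaded or created.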

🚦 Quick Evaluation Command

Step 1: Download Dataset and Model Pools

  • Ensure you have downloaded the dataset and model pools from the links provided above.

Step 2: Modify Evaluation Configuration

  • Adjust the evaluation configuration by editing the evaluate_config.json file based on your requirements.
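
One way to edit the configuration without touching the original file is to write a modified copy and pass that copy to `--config`. The sketch below assumes only that `evaluate_config.json` holds a JSON object at the top level; the helper `write_config_variant` is illustrative, and the actual key names must be read from the file itself, since they are not listed here.

```python
import json
from pathlib import Path


def write_config_variant(src, dst, overrides):
    """Copy a JSON config to dst with selected top-level keys replaced."""
    config = json.loads(Path(src).read_text(encoding="utf-8"))
    # Refuse keys the original file does not define, to catch typos early.
    unknown = sorted(set(overrides) - set(config))
    if unknown:
        raise KeyError(f"keys not present in {src}: {unknown}")
    config.update(overrides)
    Path(dst).write_text(json.dumps(config, indent=2) + "\n", encoding="utf-8")
    return config
```

Rejecting unknown keys is a deliberate choice: a misspelled option would otherwise be written to the copy and might be silently ignored by the evaluation script.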

Step 3: Run the Evaluation Script

  • Execute the evaluation to assess adversarial robustness:
      python evaluate_model.py --config ./evaluate_config.json
    
  • To evaluate transfer-based black-box attacks, use:
      python evaluate_model_blackbox.py --config ./evaluate_config_blackbox.json
    
  • Note: If your model was trained with distributed training, use the corresponding distributed evaluation setup for consistency. For instance, IDM is trained with distributed training, so the model pool also provides a single-GPU version of it for evaluation.
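
The two evaluation commands above can also be driven from one Python script, for example when running them back to back. This is a minimal sketch, not a BEARD utility: the script and config names come from the commands above, while `build_command`, `run_evaluations`, and the `dry_run` flag are illustrative. It uses `sys.executable` so that the scripts run with the interpreter of the active conda environment.

```python
import subprocess
import sys

# (script, config) pairs taken from the commands above.
EVALUATIONS = [
    ("evaluate_model.py", "./evaluate_config.json"),
    ("evaluate_model_blackbox.py", "./evaluate_config_blackbox.json"),
]


def build_command(script, config):
    """Build the argument list for one evaluation run."""
    return [sys.executable, script, "--config", config]


def run_evaluations(code_dir=".", dry_run=True):
    """Return the evaluation commands; execute them unless dry_run is set."""
    commands = [build_command(script, config) for script, config in EVALUATIONS]
    if not dry_run:
        for command in commands:
            # check=True stops at the first evaluation that fails.
            subprocess.run(command, cwd=code_dir, check=True)
    return commands
```

With the default `dry_run=True` the function only returns the commands, which is useful for inspecting them first. Call `run_evaluations("BEARD/Code", dry_run=False)` to actually launch both evaluations.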

📊 Robustness Metrics Toolkit

We provide an automated metrics toolkit under Code/metrics/ for robustness evaluation in dataset distillation. It computes RR (Robustness Ratio), AE (Attack Efficiency Ratio), and CREI (Comprehensive Robustness-Efficiency Index) across white-box, query-based black-box, and transfer-based black-box attack settings, and exports results in JSON and XLSX/CSV formats.

📌 Quick Link: See detailed usage in Code/metrics/README.md.

⚠️ Note: For transfer-based black-box logs, run convert_to_white_box.py first, then evaluate with evaluate_metrics.py (see the toolkit README for commands).

➕ Adding New Datasets and Models

Step 1: Add Datasets

  • Place the newly generated distilled datasets in the dataset_pool directory.

Step 2: Modify Training Configuration

  • Adjust the training configuration by editing the train_config.json file to specify the new datasets.

Step 3: Run the Training Script

  • Train the models on the new datasets:
      python train_model.py --config ./train_config.json
    

Step 4: Evaluate the Models

  • Once the models are trained, follow the evaluation steps outlined in the "Quick Evaluation Command" section to evaluate adversarial robustness.
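
Steps 1 and 3 above can be sketched in Python as follows. This is an illustration under stated assumptions, not the authors' tooling: the helpers `stage_distilled_dataset` and `training_command` are hypothetical, and the sketch copies the dataset directly into `dataset_pool`, whereas the pool may expect a particular sub-folder layout that is not described here. Step 2, editing `train_config.json`, still has to be done by hand.

```python
import shutil
import sys
from pathlib import Path


def stage_distilled_dataset(src, code_dir="."):
    """Copy a distilled dataset (file or folder) into Code/dataset_pool."""
    pool = Path(code_dir) / "dataset_pool"
    pool.mkdir(parents=True, exist_ok=True)
    src = Path(src)
    dst = pool / src.name
    if src.is_dir():
        # dirs_exist_ok requires Python 3.8 or newer.
        shutil.copytree(src, dst, dirs_exist_ok=True)
    else:
        shutil.copy2(src, dst)
    return dst


def training_command(config="./train_config.json"):
    """Build the training command shown in Step 3."""
    return [sys.executable, "train_model.py", "--config", config]
```

The list returned by `training_command()` can be passed to `subprocess.run(..., cwd=code_dir, check=True)` once the configuration points at the staged dataset.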

🌐 Join the Community

If you're working on DD or adversarial robustness, we invite you to contribute to the BEARD benchmark, explore the leaderboard, and share your insights.

🙏 Acknowledgments

We would like to thank the contributors of the following projects that inspired and supported this work: