GDumb

June 5, 2023

This repository contains simplified code for the paper:

GDumb: A Simple Approach that Questions Our Progress in Continual Learning, ECCV 2020 (Oral: Top 2%)
Ameya Prabhu, Philip Torr, Puneet Dokania

[PDF] [Slides] [Bibtex]

Installation and Dependencies

  • Install the dependencies needed to run the code on Python 3.x:
# First, activate a new virtual environment
$ pip3 install -r requirements.txt
  • Create two additional folders in the repository, data/ and logs/, to store the datasets and experiment logs. Point --data_dir and --log_dir in src/opts.py to the locations of these folders.
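The two folders can be created from the repository root; a minimal sketch (the folder names match the defaults described above):

```shell
# From the repository root: create the dataset and log folders
mkdir -p data logs
# Then point --data_dir and --log_dir in src/opts.py to these paths.
```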

  • Select Imagenet100 from Imagenet using this link and TinyImagenet from this link, and convert them to ImageFolder format with train and test splits.
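ImageFolder format expects one sub-directory per class under each split. A minimal sketch of the expected layout, using a hypothetical dataset root and illustrative class names:

```python
import os

def make_imagefolder_skeleton(root, classes):
    """Create the train/test directory layout expected by
    torchvision.datasets.ImageFolder: root/<split>/<class>/<images>."""
    for split in ("train", "test"):
        for cls in classes:
            os.makedirs(os.path.join(root, split, cls), exist_ok=True)

# Hypothetical root and WordNet-style class ids, for illustration only;
# place the actual converted images inside these class folders.
make_imagefolder_skeleton("data/TinyImagenet", ["n01443537", "n01629819"])
```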

Usage

  • To run the GDumb model, specify the configuration via command-line arguments; an example command is below:
$ python main.py --dataset CIFAR100 --num_classes_per_task 5 --num_tasks 20 --memory_size 500 --num_passes 256 --regularization cutmix --model ResNet --depth 32 --exp_name my_experiment_name

Arguments you can freely tweak given a dataset and model:

  • Number of classes per task (--num_classes_per_task)
  • Number of tasks (--num_tasks)
  • Maximum memory size (--memory_size)
  • Number of classes to pretrain a dumb model (--num_pretrain_classes)
  • Number of passes through the memory for learning the dumb model and pretraining (--num_passes and --num_pretrain_passes)

To add your favorite dataset:

  • Convert it to ImageFolder format (as in imagenet) with train and test folders
  • Add the dataset folder name exactly to src/opts.py
  • Add dataset details to get_statistics() function in src/dataloader.py
  • Run your model with --dataset your_fav_dataset!
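The dataset details added in the previous step might look like the sketch below. This is a hypothetical illustration of the kind of entry get_statistics() in src/dataloader.py needs; the actual return format in the repository may differ, and the mean/std values here are placeholders, not measured statistics:

```python
# Hypothetical per-dataset statistics table; the real get_statistics()
# in src/dataloader.py may use a different structure and return order.
DATASET_STATS = {
    "your_fav_dataset": {
        "mean": (0.5, 0.5, 0.5),    # illustrative channel means, not measured
        "std": (0.25, 0.25, 0.25),  # illustrative channel stds, not measured
        "num_classes": 10,
        "in_channels": 3,
        "image_size": 32,
    },
}

def get_statistics(dataset):
    stats = DATASET_STATS[dataset]
    return (stats["mean"], stats["std"], stats["num_classes"],
            stats["in_channels"], stats["image_size"])
```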

Additional details and default hyperparameters can be found in src/opts.py.

  • To replicate the complete set of experiments, copy scripts/replicate.sh to src/ and run it, substituting $SEED with {0,1,2}:
$ bash replicate.sh $SEED
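The three seeds can be swept with a simple loop; the echo below makes this a dry run that only prints each command (remove it to actually launch the experiments):

```shell
# Sweep seeds 0-2; echo prints the command instead of running it.
# Remove the echo to launch the actual replication runs.
for SEED in 0 1 2; do
    echo bash replicate.sh "$SEED"
done
```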

Similarly, the other scripts can replicate results for specific formulations.

Results

After running replicate.sh you should get results similar to these:

| GDumb Model | Mem (k) | Table | Accuracy |
| --- | --- | --- | --- |
| MNIST-MLP-100 | 300 | 3,8 | 89.1 ± 0.4 |
| MNIST-MLP-100 | 500 | 3 | 90.2 ± 0.4 |
| MNIST-MLP-400 | 500 | 4 | 91.9 ± 0.5 |
| MNIST-MLP-400 | 4400 | 5,6 | 97.8 ± 0.1 |
| SVHN-ResNet18 | 4400 | 3 | 93.4 ± 0.1 |
| CIFAR10-ResNet18 | 200 | 3 | 35.0 ± 0.4 |
| CIFAR10-ResNet18 | 500 | 3,4,8 | 45.4 ± 1.9 |
| CIFAR10-ResNet18 | 1000 | 3,4 | 61.2 ± 1.0 |
| CIFAR100-ResNet32 | 2000 | 5 | 24.3 ± 0.4 |
| TinyImageNet-DenseNet-100-12-BC | 9000 | 6 | 57.32 (best of 3) |

Extensibility to other setups

  • Settings can be tweaked by adjusting the above parameters. Additionally, GDumb can be used in a wide variety of settings beyond current CL formulations:
    • GDumb is fairly robust to drastic variations in sample order, provided the same (or a similar) set of samples lands in memory; hence this implementation abstracts the sampling process out.
    • Masking can be used to handle dynamic variations in the likely subset of classes, to add class priors, and to handle scenarios like cost-sensitive classification.
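The greedy class-balanced sampling at the heart of GDumb can be sketched as follows. This is a minimal reimplementation for illustration only; the repository's actual sampler may differ in details:

```python
import random

class GreedyBalancedMemory:
    """Greedy class-balanced memory in the spirit of GDumb's sampler:
    fill the memory greedily; once full, admit a sample from an
    under-represented class by evicting from the largest class."""

    def __init__(self, capacity, seed=0):
        self.capacity = capacity
        self.memory = {}  # class label -> list of stored samples
        self.rng = random.Random(seed)

    def __len__(self):
        return sum(len(bucket) for bucket in self.memory.values())

    def add(self, sample, label):
        # Per-class budget, counting this class even if it is new.
        num_classes = len(self.memory) + (label not in self.memory)
        per_class = self.capacity // max(1, num_classes)
        bucket = self.memory.setdefault(label, [])
        if len(self) < self.capacity:
            bucket.append(sample)  # memory not full: always admit
        elif len(bucket) < per_class:
            # Memory full but this class is under-represented:
            # evict a random sample from the currently largest class.
            largest = max(self.memory, key=lambda c: len(self.memory[c]))
            self.memory[largest].pop(self.rng.randrange(len(self.memory[largest])))
            bucket.append(sample)
        # Otherwise the sample is dropped: this class already has its share.
```

Because admission depends only on class counts, not arrival order, the memory ends up balanced regardless of how the stream is shuffled, which is what makes the sampling process easy to abstract out.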
If you discover any bugs in the code, please contact me; I will cross-check them with my nightmares.

Citation

We hope GDumb serves as a strong baseline and comparison, and that the sampler and masking introduced here prove useful for your cool CL formulation! To cite our work:

@inproceedings{prabhu2020greedy,
  title={GDumb: A Simple Approach that Questions Our Progress in Continual Learning},
  author={Prabhu, Ameya and Torr, Philip and Dokania, Puneet},
  booktitle={The European Conference on Computer Vision (ECCV)},
  month={August},
  year={2020}
}