MOS: Modeling Object-Scene Associations in Generalized Category Discovery (CVPR 2025)

April 13, 2025

Welcome to the official repository for MOS: Modeling Object-Scene Associations in Generalized Category Discovery!

Running

Dependencies

pip install -r requirements.txt

We recommend using the same configuration as ours: Python 3.8, CUDA > 12, and torch 2.3.1.

Datasets

We use fine-grained benchmarks in this paper, including:

The Semantic Shift Benchmark (SSB) and Oxford-IIIT Pet Dataset

In addition, each image needs a mask in which pixel value 255 marks the object and 0 marks the scene. Please follow IS-Net to extract the masks (the model is isnet-general-use). Alternatively, you can use the pre-processed masks that we have already prepared. The Google Drive link is link.
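To make the mask convention concrete, here is a minimal sketch of how such a mask could separate an image into object and scene parts. It is not the repository's code: the function name `split_object_scene` and the threshold of 127 are assumptions for illustration, and the arrays are synthetic stand-ins for a loaded image and mask.

```python
import numpy as np


def split_object_scene(image, mask, threshold=127):
    """Split an H x W x C image into object and scene parts using an H x W mask.

    Assumes the convention described above: 255 marks the object and
    0 marks the scene. The threshold is a hypothetical choice that also
    tolerates soft (anti-aliased) mask edges.
    """
    if image.shape[:2] != mask.shape:
        raise ValueError("image and mask must share height and width")
    is_object = mask > threshold                  # boolean H x W
    object_part = image * is_object[..., None]    # scene pixels zeroed
    scene_part = image * ~is_object[..., None]    # object pixels zeroed
    return object_part, scene_part


# Synthetic 2 x 2 RGB image and mask, standing in for real data.
image = np.full((2, 2, 3), 10, dtype=np.uint8)
mask = np.array([[255, 0], [0, 255]], dtype=np.uint8)
object_part, scene_part = split_object_scene(image, mask)
```

Because every pixel belongs to exactly one of the two parts, adding `object_part` and `scene_part` reconstructs the original image, which is a quick sanity check for a downloaded or freshly extracted mask.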

The mask folders should be placed as follows:

For cub: your_path/cub/masks
For stanford_car: your_path/stanford_car/cars_train_mask and your_path/stanford_car/cars_test_mask
For aircraft: your_path/fgvc-aircraft-2013b/data/masks
For oxford-pet: your_path/Oxford-pet/data/masks
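Before training, it may help to confirm that the mask folders are where the list above says they should be. The following is a small sketch under stated assumptions: the helper `missing_mask_dirs` and the dictionary keys are hypothetical and not part of the repository, and only the relative paths come from the list above.

```python
from pathlib import Path

# Mask locations relative to the dataset root, as listed above.
# The dictionary keys are illustrative labels, not names used by the repository.
MASK_DIRS = {
    "cub": ["cub/masks"],
    "stanford_car": ["stanford_car/cars_train_mask", "stanford_car/cars_test_mask"],
    "aircraft": ["fgvc-aircraft-2013b/data/masks"],
    "oxford-pet": ["Oxford-pet/data/masks"],
}


def missing_mask_dirs(root):
    """Return the expected mask directories that do not exist under root."""
    root = Path(root)
    return [
        str(root / rel)
        for rels in MASK_DIRS.values()
        for rel in rels
        if not (root / rel).is_dir()
    ]
```

Calling `missing_mask_dirs("your_path")` returns an empty list when every folder is in place; otherwise it lists the paths that still need to be created or populated.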

Scripts

Train the model:

bash scripts/run_${DATASET_NAME}.sh

Please note that you need to specify the dataset root directory and the path to the DINO weights in the .sh file.

Checkpoints

You can contact pengzhengyuan@sjtu.edu.cn to obtain logs and checkpoints from multiple experiments for any dataset. Feel free to reach out.

Note

Please note that we have commented out the last norm layer in the DINO backbone.

Citing this work

If you find this repo useful for your research, please consider citing our paper:

@inproceedings{peng2025mos,
  title={MOS: Modeling Object-Scene Associations in Generalized Category Discovery},
  author={Peng, Zhengyuan and Ma, Jinpeng and Sun, Zhimin and Yi, Ran and Song, Haichuan and Tan, Xin and Ma, Lizhuang},
  booktitle={IEEE Conference on Computer Vision and Pattern Recognition (CVPR)},
  year={2025}
}

Acknowledgements

The codebase is largely built on this repo: SimGCD.

Contact

For inquiries or further information, contact: pengzhengyuan@sjtu.edu.cn

Happy coding!