Single Domain Generalization for Crowd Counting

March 31, 2025

This is the official repository for our CVPR 2024 paper, "Single Domain Generalization for Crowd Counting". You can read our paper here.

Requirements

Python 3.10.12
PyTorch 2.0.1
Torchvision 0.15.2
Others specified in requirements.txt

Data Preparation

Download the ShanghaiTech and UCF-QNRF datasets from their official sites and unzip them.

Run the following commands to preprocess the datasets:

python utils/preprocess_data.py --dataset sta --origin-dir [path_to_ShanghaiTech]/part_A --data-dir data/sta
python utils/preprocess_data.py --dataset stb --origin-dir [path_to_ShanghaiTech]/part_B --data-dir data/stb
python utils/preprocess_data.py --dataset qnrf --origin-dir [path_to_UCF-QNRF] --data-dir data/qnrf

Run the following commands to generate the ground-truth (GT) density maps:

python utils/dmap_gen.py --path data/sta
python utils/dmap_gen.py --path data/stb
python utils/dmap_gen.py --path data/qnrf
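
For readers unfamiliar with GT density maps, the sketch below illustrates the common approach of placing a normalized Gaussian at each annotated head position so that the map sums to the person count. It is not the code in utils/dmap_gen.py, which may differ; the function names, the fixed sigma, and the kernel size are assumptions chosen for illustration.

```python
import numpy as np


def gaussian_kernel(size: int, sigma: float) -> np.ndarray:
    """Return a (size x size) Gaussian kernel normalized to sum to 1."""
    ax = np.arange(size) - (size - 1) / 2.0
    g = np.exp(-(ax ** 2) / (2.0 * sigma ** 2))
    k = np.outer(g, g)
    return k / k.sum()


def points_to_density(points, height, width, sigma=4.0, ksize=15):
    """Build a density map from (x, y) head annotations.

    Assumption: a fixed-sigma Gaussian with an odd kernel size; the
    repository may use different settings.
    """
    dmap = np.zeros((height, width), dtype=np.float64)
    kernel = gaussian_kernel(ksize, sigma)
    r = ksize // 2
    for x, y in points:
        cx, cy = int(round(x)), int(round(y))
        if not (0 <= cx < width and 0 <= cy < height):
            continue  # skip annotations that fall outside the image
        # Window of the image covered by the kernel, clipped at the borders.
        y0, y1 = max(cy - r, 0), min(cy + r + 1, height)
        x0, x1 = max(cx - r, 0), min(cx + r + 1, width)
        # Matching window inside the kernel.
        ky0, kx0 = y0 - (cy - r), x0 - (cx - r)
        patch = kernel[ky0:ky0 + (y1 - y0), kx0:kx0 + (x1 - x0)]
        # Renormalize the clipped patch so each person still contributes 1.
        dmap[y0:y1, x0:x1] += patch / patch.sum()
    return dmap


if __name__ == "__main__":
    demo = points_to_density([(10.2, 5.7), (0, 0), (63, 47)], height=48, width=64)
    print(demo.shape, float(demo.sum()))
```

Summing the resulting map recovers the number of annotated people inside the image, which is the property a counting model is trained to reproduce.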

Training

Run the following command:

python main.py --task train --config configs/sta_train.yml

You may edit the .yml config file as you like.

Testing

Run the following commands after specifying the path to the model weights in the config file:

python main.py --task test --config configs/sta_test_stb.yml
python main.py --task test --config configs/sta_test_qnrf.yml

Inference

Run the following command:

python inference.py --img_path [path_to_img_file_or_directory] --model_path [path_to_model_weight] --save_path output.txt --vis_dir vis

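The table below lists the pretrained weights that produce the results reported in our original paper. Source is the training dataset (A: ShanghaiTech Part A, B: ShanghaiTech Part B, Q: UCF-QNRF), and Performance gives the results on the other two datasets. Note that because torch.nn.functional.interpolate is non-deterministic, these results cannot be reproduced exactly by training from scratch.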

Source	Performance	Weights
A	B: 11.4 MAE, 19.7 MSE; Q: 115.7 MAE, 199.8 MSE	OneDrive, Google Drive
B	A: 99.6 MAE, 182.9 MSE; Q: 165.6 MAE, 290.4 MSE	OneDrive, Google Drive
Q	A: 65.5 MAE, 110.1 MSE; B: 12.3 MAE, 24.1 MSE	~~OneDrive~~, Google Drive
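
As a reading aid for the MAE and MSE columns, here is a minimal sketch of how these two metrics are usually computed from per-image counts. It assumes the common crowd-counting convention in which "MSE" denotes the root of the mean squared count error; this repository's evaluation code is not shown here and may differ. The function name counting_errors is hypothetical.

```python
import math
from typing import Sequence, Tuple


def counting_errors(pred_counts: Sequence[float],
                    gt_counts: Sequence[float]) -> Tuple[float, float]:
    """Return (MAE, MSE) over a set of images.

    Assumption: "MSE" is the square root of the mean squared error,
    as is conventional in crowd-counting papers.
    """
    if len(pred_counts) != len(gt_counts) or len(pred_counts) == 0:
        raise ValueError("need two non-empty sequences of equal length")
    errors = [p - g for p, g in zip(pred_counts, gt_counts)]
    mae = sum(abs(e) for e in errors) / len(errors)
    mse = math.sqrt(sum(e * e for e in errors) / len(errors))
    return mae, mse


if __name__ == "__main__":
    # Two images with predicted counts 10 and 20, true counts 13 and 16.
    print(counting_errors([10.0, 20.0], [13.0, 16.0]))
```

Here a predicted count would typically be the sum of the predicted density map for one image.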

New Deterministic Version

To make the results fully reproducible, we implemented a new version of MPCount that does not use torch.nn.functional.interpolate. We will release more results and weights soon.

Source	Performance	Weights
A	B: 11.2 MAE, 20.0 MSE; Q: 112.8 MAE, 193.8 MSE	OneDrive, Google Drive
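
For readers who train from scratch, the following is a minimal sketch of the usual PyTorch determinism settings. It is not taken from this repository, and seed_everything is a hypothetical helper. With torch.use_deterministic_algorithms(True), PyTorch raises a RuntimeError when an operation has no deterministic implementation, which can help locate non-deterministic calls; whether a particular operation is affected depends on the PyTorch version and the device.

```python
import random

import numpy as np
import torch


def seed_everything(seed: int = 0) -> None:
    """Seed the common RNGs and request deterministic algorithms.

    Hypothetical helper for illustration; not part of this repository.
    """
    random.seed(seed)
    np.random.seed(seed)
    torch.manual_seed(seed)  # seeds the CPU and CUDA generators
    # Disable cuDNN autotuning, which may select different kernels per run.
    torch.backends.cudnn.benchmark = False
    # Ops lacking a deterministic implementation will raise a RuntimeError.
    # On CUDA, some cuBLAS operations may additionally require the
    # CUBLAS_WORKSPACE_CONFIG environment variable to be set.
    torch.use_deterministic_algorithms(True)


if __name__ == "__main__":
    seed_everything(0)
    first = torch.rand(3)
    seed_everything(0)
    second = torch.rand(3)
    print(torch.equal(first, second))
```

Seeding alone does not make a non-deterministic kernel deterministic, which is why removing the affected operation, as the new version does, is the more robust fix.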

Citation

If you find this work helpful in your research, please cite the following:

@inproceedings{pengMPCount2024,
  title = {Single Domain Generalization for Crowd Counting},
  booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR 2024)},
  author = {Peng, Zhuoxuan and Chan, S.-H. Gary},
  year = {2024}
}