CLIP-Driven Fine-grained Text-Image Person Re-identification

November 22, 2023 · View on GitHub

Official PyTorch implementation of the paper "CLIP-Driven Fine-grained Text-Image Person Re-identification".

Usage

Requirements

we use single RTX3090 24G GPU for training and evaluation.

Python 3.6.9
pytorch 1.7.0
torchvision 0.8.1
scipy 1.2.1

Dataset Preparation

Download the CUHK-PEDES dataset from here, ICFG-PEDES dataset from here and RSTPReid dataset form here

Organize them in your dataset root dir folder as follows:

|-- your dataset root dir/
|   |-- <CUHK-PEDES>/
|       |-- imgs
|            |-- cam_a
|            |-- cam_b
|            |-- ...
|       |-- reid_raw.json
|
|   |-- <ICFG-PEDES>/
|       |-- imgs
|            |-- test
|            |-- train 
|       |-- ICFG_PEDES.json
|
|   |-- <RSTPReid>/
|       |-- imgs
|       |-- data_captions.json

Data Preparation

Run data.sh (or Download from here)
Copy files test_reid.json, train_reid.json and val_reid.json to project_directory/cuhkpedes/processed_data/

Training

python train.py

Testing

python test.py

Acknowledgments

Our code is extended from the following repositories. We sincerely appreciate for their contributions.

CLIP
SAF

Citation

If you find this code useful for your research, please cite our paper.

@article{CFine,
   title={CLIP-Driven Fine-grained Text-Image Person Re-identification}, 
   author={Shuanglin Yan and Neng Dong and Liyan Zhang and Jinhui Tang},
   journal={IEEE Transactions on Image Processing}, 
   year={2023},
   volume={},
   number={},
   pages={1-14},
   doi={10.1109/TIP.2023.3327924}
}

Contact

If you have any question, please feel free to contact us. E-mail: shuanglinyan@njust.edu.cn.