Contextual Non-Local Alignment over Full-Scale Representation for Text-Based Person Search

April 16, 2021

We have reimplemented this method in our YouReID framework. The refactored code is simple and easy to read. See YouReID/NAFS for more information.

This is the implementation of our paper Contextual Non-Local Alignment over Full-Scale Representation for Text-Based Person Search. The code is adapted from the GitHub repository "pytorch implementation for ECCV2018 paper Deep Cross-Modal Projection Learning for Image-Text Matching".

Requirements

  • Python 3.7
  • PyTorch 1.0.0 and torchvision 0.2.1
  • numpy
  • matplotlib (optional; only needed to plot the result figures)
  • scipy 1.2.1
  • pytorch_transformers
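
As a quick sanity check of the environment, the list above can be compared against what is actually installed. The following is a minimal sketch, not part of this repository: the helper name `check_requirements` and the assumption that each package exposes a `__version__` attribute are ours, and packages listed without a version are only checked for importability.

```python
import importlib

# Versions copied from the requirement list; None means no version is specified there.
REQUIRED = {
    "torch": "1.0.0",
    "torchvision": "0.2.1",
    "numpy": None,
    "scipy": "1.2.1",
    "pytorch_transformers": None,
}


def check_requirements(required=REQUIRED, importer=importlib.import_module):
    """Return a list of problems found; an empty list means everything matched.

    `importer` is injectable (hypothetical convenience) so the check can be
    exercised without the real packages installed.
    """
    problems = []
    for name, wanted in required.items():
        try:
            module = importer(name)
        except ImportError:
            problems.append(f"{name}: not installed")
            continue
        found = getattr(module, "__version__", None)
        # Drop a local build tag such as "+cu90" before comparing.
        base = None if found is None else str(found).split("+")[0]
        if wanted is not None and base != wanted:
            problems.append(f"{name}: found {found}, expected {wanted}")
    return problems
```

Calling `check_requirements()` in the target environment returns a list of mismatches, which can simply be printed before training starts.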

Usage

Data Preparation

  1. Download the CUHK-PEDES dataset.
  2. Put reid_raw.json under project_directory/data/
  3. Run data.sh
  4. Copy test_reid.json, train_reid.json and val_reid.json from CUHK-PEDES/data/ to project_directory/data/processed_data/
  5. Download the pretrained ResNet50 model, the bert-base-uncased model and its vocabulary to project_directory/pretrained/
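
Before training, it can help to confirm that the files from the steps above ended up where expected. The sketch below is an illustration only: the helper `missing_inputs` is hypothetical and not part of this repository. The README does not state the file names of the pretrained weights, so the check only verifies that the pretrained/ directory exists and is not empty.

```python
from pathlib import Path

# Relative paths taken from the data preparation steps.
EXPECTED_FILES = [
    "data/reid_raw.json",
    "data/processed_data/train_reid.json",
    "data/processed_data/val_reid.json",
    "data/processed_data/test_reid.json",
]


def missing_inputs(project_dir):
    """Return the expected inputs that are absent under project_dir."""
    root = Path(project_dir)
    missing = [rel for rel in EXPECTED_FILES if not (root / rel).is_file()]
    pretrained = root / "pretrained"
    # Weight file names are not specified, so only check for a non-empty directory.
    if not pretrained.is_dir() or not any(pretrained.iterdir()):
        missing.append("pretrained/ (empty or absent)")
    return missing
```

For example, `missing_inputs("project_directory")` returns an empty list once all five steps are complete.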

Training & Testing

First, set the parameter BASE_ROOT to your current directory and IMAGE_DIR to the directory of the CUHK-PEDES dataset. Then run sh scripts/train.sh to train the model and sh scripts/test.sh to evaluate it.

Model Framework

[Figure: model framework]

Pretrained Model

Model (Google Drive)

Training log (Google Drive)

Model Performance

[Figures: model performance results]