Contextual Non-Local Alignment over Full-Scale Representation for Text-Based Person Search

April 16, 2021

We have reimplemented the method in our YouReID framework. The refactored code is simpler and easier to read. More information can be found in YouReID/NAFS.

This is an implementation of our paper Contextual Non-Local Alignment over Full-Scale Representation for Text-Based Person Search. The code is modified from the GitHub repository "pytorch implementation for ECCV2018 paper Deep Cross-Modal Projection Learning for Image-Text Matching".

Requirements

Python 3.7
PyTorch 1.0.0 & torchvision 0.2.1
numpy
matplotlib (optional; only needed to plot the result figure)
scipy 1.2.1
pytorch_transformers
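
Before training, it can help to confirm that the installed packages match the versions listed above. The following is a minimal sketch, not part of the repository: the helper names `installed_versions` and `find_mismatches` are hypothetical, and the check assumes each package exposes a `__version__` attribute.

```python
import importlib
import sys

# Versions pinned in the requirement list above; None means "any version".
EXPECTED = {
    "torch": "1.0.0",
    "torchvision": "0.2.1",
    "scipy": "1.2.1",
    "numpy": None,
    "pytorch_transformers": None,
}


def installed_versions(module_names):
    """Map each module name to its version string, or None if it cannot be imported."""
    found = {}
    for name in module_names:
        try:
            module = importlib.import_module(name)
        except Exception:
            # A missing or broken install both count as "not usable".
            found[name] = None
            continue
        found[name] = str(getattr(module, "__version__", "unknown"))
    return found


def find_mismatches(expected, installed):
    """Return human-readable problems; an empty list means everything matches."""
    problems = []
    for name, wanted in expected.items():
        have = installed.get(name)
        if have is None:
            problems.append(f"{name}: not installed")
        elif wanted is not None and not have.startswith(wanted):
            # Prefix match, so build suffixes after the pinned version still pass.
            problems.append(f"{name}: found {have}, expected {wanted}")
    return problems


if __name__ == "__main__":
    print("Python", sys.version.split()[0])
    report = find_mismatches(EXPECTED, installed_versions(EXPECTED))
    for line in report or ["all listed requirements look satisfied"]:
        print(line)
```

matplotlib is left out of the check because it is only needed for the result figure.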

Usage

Data Preparation

Download the CUHK-PEDES dataset
Put reid_raw.json under project_directory/data/
Run data.sh
Copy files test_reid.json, train_reid.json and val_reid.json under CUHK-PEDES/data/ to project_directory/data/processed_data/
Download the pretrained ResNet50 model, the bert-base-uncased model and its vocabulary to project_directory/pretrained/
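
The layout produced by the steps above can be checked with a short script. This is a sketch under assumptions: the function name `missing_paths` is hypothetical, and because the file names of the pretrained models are not given here, only the existence of the pretrained/ directory is checked.

```python
from pathlib import Path

# Files named in the data preparation steps, relative to project_directory.
EXPECTED_FILES = [
    "data/reid_raw.json",
    "data/processed_data/train_reid.json",
    "data/processed_data/val_reid.json",
    "data/processed_data/test_reid.json",
]
# Only the directory is checked; the pretrained file names are not specified.
EXPECTED_DIRS = ["pretrained"]


def missing_paths(project_dir):
    """Return the expected files and directories that do not exist yet."""
    root = Path(project_dir)
    missing = [rel for rel in EXPECTED_FILES if not (root / rel).is_file()]
    missing += [rel + "/" for rel in EXPECTED_DIRS if not (root / rel).is_dir()]
    return missing


if __name__ == "__main__":
    # Assumes the script is run from project_directory.
    for rel in missing_paths("."):
        print("missing:", rel)
```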

Training & Testing

First, set the parameter BASE_ROOT to your current directory and IMAGE_DIR to the directory of the CUHK-PEDES dataset. Run sh scripts/train.sh to train the model, and sh scripts/test.sh to evaluate it.
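
A small launcher can catch a wrong BASE_ROOT or IMAGE_DIR before a long run starts. The sketch below is not part of the repository. The names `stage_command` and `run_stage` are hypothetical, and it assumes the scripts are invoked from BASE_ROOT. It does not edit the scripts, so the two parameters still have to be set there by hand.

```python
import subprocess
from pathlib import Path

# Script paths taken from the commands above.
SCRIPTS = {"train": "scripts/train.sh", "test": "scripts/test.sh"}


def stage_command(stage):
    """Return the shell command for a stage ("train" or "test")."""
    return ["sh", SCRIPTS[stage]]


def run_stage(stage, base_root, image_dir):
    """Check both directories exist, then run the stage script."""
    base_root, image_dir = Path(base_root), Path(image_dir)
    for label, path in (("BASE_ROOT", base_root), ("IMAGE_DIR", image_dir)):
        if not path.is_dir():
            raise FileNotFoundError(f"{label} is not a directory: {path}")
    # Assumption: the scripts expect to be launched from the project root.
    return subprocess.run(stage_command(stage), cwd=base_root, check=True)
```

For example, `run_stage("train", "/path/to/project", "/path/to/CUHK-PEDES")` raises FileNotFoundError if either placeholder path does not exist, and otherwise starts training.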

Model Framework

Framework

Pretrained Model

Model (Google Drive)

Training log (Google Drive)

Model Performance

[Figure: model performance]