Enhancing Recipe Retrieval with Foundation Models

October 20, 2024

Official implementation of our ECCV 2024 paper:

Enhancing Recipe Retrieval with Foundation Models: A Data Augmentation Perspective

This paper proposes a data augmentation approach that uses foundation models (Llama 2 and SAM) to learn better multimodal representations in a common embedding space for cross-modal recipe retrieval.


Installation

To install the required packages, please follow these steps:

# Clone the repository and enter it
git clone https://github.com/Noah888/DAR.git
cd DAR

# Create a virtual environment (Python 3.8 or above)
conda create --name your_env_name python=3.9

# Activate the conda environment
conda activate your_env_name

# Install dependencies
pip install -r requirements.txt
cd src

Dataset

To reproduce the results, download the Recipe1M dataset and generate the enhanced data: traindata (which adds the visual imagination data) and the segment data. Place the data in the DATASET_PATH directory with the following structure:

DATASET_PATH/
├── traindata/
│   ├── train/
│   │   └── ...
│   ├── val/
│   │   └── ...
│   └── test/
│       └── ...
├── segment/
│   ├── train/...
│   ├── val/...
│   └── test/...
├── layer1.json
└── layer2.json
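
Before launching training, it can help to confirm that the layout matches the tree above. The following is a minimal sketch, not part of the repository: the helper name `missing_entries` is hypothetical, and it assumes that train/, val/, and test/ sit under both traindata/ and segment/, with layer1.json and layer2.json at the top level of DATASET_PATH.

```python
from pathlib import Path

# Expected entries, read off the directory tree in this README.
# Assumption: train/val/test are nested under traindata/ and segment/.
EXPECTED_ENTRIES = [
    "traindata/train",
    "traindata/val",
    "traindata/test",
    "segment/train",
    "segment/val",
    "segment/test",
    "layer1.json",
    "layer2.json",
]


def missing_entries(dataset_path):
    """Return the expected relative paths that do not exist under dataset_path."""
    root = Path(dataset_path)
    return [rel for rel in EXPECTED_ENTRIES if not (root / rel).exists()]


if __name__ == "__main__":
    import sys

    # Usage: python check_layout.py DATASET_PATH (script name is hypothetical)
    missing = missing_entries(sys.argv[1] if len(sys.argv) > 1 else ".")
    if missing:
        print("Missing entries:")
        for rel in missing:
            print("  " + rel)
    else:
        print("Dataset layout looks complete.")
```

An empty result only means the expected files and folders exist; it does not validate their contents.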

Training

  • Launch training with:
python train.py --model_name model --root DATASET_PATH --save_dir /path/to/saved/model/checkpoints

Run python train.py --help for the full list of available arguments.

Evaluation

  • Extract features from the trained model for the test set samples of Recipe1M:
python test.py --model_name model --eval_split test --root DATASET_PATH --save_dir /path/to/saved/model/checkpoints
  • Compute MedR and recall metrics for the extracted feature set. To evaluate with only the image and recipe features (DAR):
python eval.py --embeddings_file /path/to/saved/model/checkpoints/model/feats_test.pkl --medr_N 10000

To evaluate with the raw image-recipe features together with the augmented segment description features (DAR++):

python eval_add_augment.py --embeddings_file /path/to/saved/model/checkpoints/model/feats_test.pkl --medr_N 10000
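
For readers unfamiliar with the metrics named above, here is a minimal sketch of how median rank (MedR) and Recall@K are commonly computed for paired retrieval. It is not the code in eval.py: the function name `retrieval_metrics` is hypothetical, it assumes that `queries[i]` and `gallery[i]` form the matching pair, and it ranks by cosine similarity. It is written in plain Python for readability, so it only suits small inputs.

```python
import statistics


def _normalize(vec):
    """Scale a vector to unit length (zero vectors are returned unchanged)."""
    norm = sum(x * x for x in vec) ** 0.5
    return [x / norm for x in vec] if norm > 0 else list(vec)


def retrieval_metrics(queries, gallery, ks=(1, 5, 10)):
    """Return (MedR, {K: Recall@K}), assuming queries[i] matches gallery[i]."""
    q = [_normalize(v) for v in queries]
    g = [_normalize(v) for v in gallery]
    ranks = []
    for i, q_vec in enumerate(q):
        # Cosine similarity between this query and every gallery item.
        sims = [sum(a * b for a, b in zip(q_vec, g_vec)) for g_vec in g]
        # Rank of the true match: 1 + number of items scoring strictly higher.
        ranks.append(1 + sum(s > sims[i] for s in sims))
    medr = statistics.median(ranks)
    recalls = {k: sum(r <= k for r in ranks) / len(ranks) for k in ks}
    return medr, recalls


if __name__ == "__main__":
    # Toy image-to-recipe example with three pairs.
    image_feats = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
    recipe_feats = [[1.0, 0.1], [0.1, 1.0], [1.0, -1.0]]
    print(retrieval_metrics(image_feats, recipe_feats))
```

Lower MedR and higher Recall@K indicate better retrieval. Swapping the two arguments gives the recipe-to-image direction.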

Pretrained models

  • We provide pretrained model weights (DAR_model). To extract test-set features with them:
python test.py --model_name DAR_model --eval_split test --root DATASET_PATH --save_dir ../checkpoints
  • A file with extracted features will be saved under ../checkpoints/DAR_model.
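
To see what the saved feature file contains before passing it to the evaluation scripts, a small inspection helper can be useful. This is a sketch under stated assumptions: the helper names `describe` and `summarize_pickle` are hypothetical, and nothing is assumed about the file's internal keys; the helper simply reports whatever it finds. Only load pickle files you trust, since unpickling can execute arbitrary code.

```python
import pickle


def describe(value):
    """Return a short description: type plus shape or length when available."""
    shape = getattr(value, "shape", None)
    if shape is not None:
        return f"{type(value).__name__} shape={tuple(shape)}"
    if hasattr(value, "__len__"):
        return f"{type(value).__name__} len={len(value)}"
    return type(value).__name__


def summarize_pickle(path):
    """Load a pickle file and summarize its top-level contents."""
    with open(path, "rb") as f:
        obj = pickle.load(f)
    if isinstance(obj, dict):
        return {str(key): describe(val) for key, val in obj.items()}
    return {"<root>": describe(obj)}


if __name__ == "__main__":
    import sys

    # Usage: python inspect_feats.py /path/to/feats_test.pkl (script name is hypothetical)
    for key, summary in summarize_pickle(sys.argv[1]).items():
        print(f"{key}: {summary}")
```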

This code is based on image-to-recipe-transformers. We thank its authors for releasing their work.