README.md
March 6, 2024
This repository is the official implementation of Side4Video, which significantly reduces the training memory cost for action recognition and text-video retrieval tasks.
📰 News
- Feb 28, 2024: We released our code for Action Recognition and Text-Video Retrieval.
- Nov 28, 2023: We released our paper on arXiv.
🗺️ Overview
Training and Testing
To train and test our model, please refer to the Recognition and Retrieval folders.
Results
Citation
If you find this repository useful, please star this repo and cite our paper.
@article{yao2023side4video,
title={Side4Video: Spatial-Temporal Side Network for Memory-Efficient Image-to-Video Transfer Learning},
author={Yao, Huanjin and Wu, Wenhao and Li, Zhiheng},
journal={arXiv preprint arXiv:2311.15769},
year={2023}
}
Acknowledgment
Our implementation is mainly based on the following codebases. We are sincerely grateful for their work.
- Text4Vis: Revisiting Classifier: Transferring Vision-Language Models for Video Recognition.
- CLIP4Clip: An Empirical Study of CLIP for End to End Video Clip Retrieval.
📧 Contact
If you have any questions about this repository, please file an issue or contact Huanjin Yao or Wenhao Wu.