Temporal Action Detection
October 8, 2024 ยท View on GitHub
We use the ActionFormer detection pipeline as our baseline method and replace its I3D feature with the feature extracted by VideoMAE V2-g.
| Dataset | Backbone | Head | mAP | Features |
|---|---|---|---|---|
| THUMOS14 | VideoMAE V2-g | ActionFormer | 69.6 | th14_mae_g_16_4.tar.gz |
| FineAction | VideoMAE V2-g | ActionFormer | 18.2 | fineaction_mae_g.tar.gz |
Extract feature
Use extract_tad_feature.py to extract the feature of datasets. For example, to extract the feature of THUMOS14, running the following command:
python extract_tad_feature.py \
--data_set THUMOS14 \
--data_path YOUR_PATH/thumos14_videos \
--save_path YOUR_PATH/th14_vit_g_16_4 \
--model vit_giant_patch14_224 \
--ckpt_path YOUR_PATH/vit_g_hyrbid_pt_1200e_k710_ft.pth