ASRF : Video Action Segmentation Model
February 22, 2022 · View on GitHub
简体中文 | English
ASRF : Video Action Segmentation Model
Contents
Introduction
ASRF model is an improvement on the video motion segmentation model ms-tcn, which was published on WACV in 2021. We reproduce the officially implemented pytorch code and obtain approximate results in paddlevideo.
MS-TCN Overview
Data
ASRF can choose 50salads, breakfast, gtea as trianing set. Please refer to Video Action Segmentation dataset download and preparation doc Video Action Segmentation dataset
Unlike MS-TCN, ASRF model requires additional data construction. The script process is as follows
python data/50salads/prepare_asrf_data.py --dataset_dir data/
Train
After prepare dataset, we can run sprits.
# gtea dataset
export CUDA_VISIBLE_DEVICES=3
python3.7 main.py --validate -c configs/segmentation/asrf/asrf_gtea.yaml
- Start the training by using the above command line or script program. There is no need to use the pre training model. The video action segmentation model is usually a full convolution network. Due to the different lengths of videos, the
DATASET.batch_sizeof the video action segmentation model is usually set to1, that is, batch training is not required. At present, only single sample training is supported.
Test
Test MS-TCN on dataset scripts:
python main.py --test -c configs/segmentation/asrf/asrf_gtea.yaml --weights=./output/ASRF/ASRF_split_1.pdparams
- The specific implementation of the index is to calculate ACC, edit and F1 scores by referring to the test scriptevel.py provided by the author of ms-tcn.
The reproduction of pytorch comes from the official code base
- The evaluation method of data set adopts the folding verification method in ms-tcn paper, and the division method of folding is the same as that in ms-tcn paper.
Accuracy on Breakfast dataset(4 folding verification):
| Model | Acc | Edit | F1@0.1 | F1@0.25 | F1@0.5 |
|---|---|---|---|---|---|
| paper | 67.6% | 72.4% | 74.3% | 68.9% | 56.1% |
| pytorch | 65.8% | 71.0% | 72.3% | 66.5% | 54.9% |
| paddle | 66.1% | 71.9% | 73.3% | 67.9% | 55.7% |
Accuracy on 50salads dataset(5 folding verification):
| Model | Acc | Edit | F1@0.1 | F1@0.25 | F1@0.5 |
|---|---|---|---|---|---|
| paper | 84.5% | 79.3% | 82.9% | 83.5% | 77.3% |
| pytorch | 81.4% | 75.6% | 82.7% | 81.2% | 77.2% |
| paddle | 81.6% | 75.8% | 83.0% | 81.5% | 74.8% |
Accuracy on gtea dataset(4 folding verification):
| Model | Acc | Edit | F1@0.1 | F1@0.25 | F1@0.5 |
|---|---|---|---|---|---|
| paper | 77.3% | 83.7% | 89.4% | 87.8% | 79.8% |
| pytorch | 76.3% | 79.6% | 87.3% | 85.8% | 74.9% |
| paddle | 77.1% | 83.3% | 88.9% | 87.5% | 79.1% |
Model weight for gtea
| Test_Data | F1@0.5 | checkpoints |
|---|---|---|
| gtea_split1 | 72.4409 | ASRF_gtea_split_1.pdparams |
| gtea_split2 | 76.6666 | ASRF_gtea_split_2.pdparams |
| gtea_split3 | 84.5528 | ASRF_gtea_split_3.pdparams |
| gtea_split4 | 82.6771 | ASRF_gtea_split_4.pdparams |
Infer
export inference model
python3.7 tools/export_model.py -c configs/segmentation/asrf/asrf_gtea.yaml \
-p data/ASRF_gtea_split_1.pdparams \
-o inference/ASRF
To get model architecture file ASRF.pdmodel and parameters file ASRF.pdiparams, use:
- Args usage please refer to Model Inference.
infer
Input file are the file list for infering, for example:
S1_Cheese_C1.npy
S1_CofHoney_C1.npy
S1_Coffee_C1.npy
S1_Hotdog_C1.npy
...
python3.7 tools/predict.py --input_file data/gtea/splits/test.split1.bundle \
--config configs/segmentation/asrf/asrf_gtea.yaml \
--model_file inference/ASRF/ASRF.pdmodel \
--params_file inference/ASRF/ASRF.pdiparams \
--use_gpu=True \
--use_tensorrt=False
example of logs:
result write in : ./inference/infer_results/S1_Cheese_C1.txt
result write in : ./inference/infer_results/S1_CofHoney_C1.txt
result write in : ./inference/infer_results/S1_Coffee_C1.txt
result write in : ./inference/infer_results/S1_Hotdog_C1.txt
result write in : ./inference/infer_results/S1_Pealate_C1.txt
result write in : ./inference/infer_results/S1_Peanut_C1.txt
result write in : ./inference/infer_results/S1_Tea_C1.txt
Reference
- Alleviating Over-segmentation Errors by Detecting Action Boundaries, Yuchi Ishikawa, Seito Kasai, Yoshimitsu Aoki, Hirokatsu Kataoka