July 16, 2025

Prototypical Calibrating Ambiguous Samples for Micro-Action Recognition

Kun Li, Dan Guo, Guoliang Chen, Chunxiao Fan, Jingyuan Xu, Zhiliang Wu, Hehe Fan, Meng Wang

Hefei University of Technology, Zhejiang University

🛠️ Installation

conda create --name openmmlab python=3.8 -y
conda activate openmmlab
conda install pytorch torchvision -c pytorch  # Installs the latest PyTorch and cudatoolkit; check that they match your environment.
pip install -U openmim
mim install mmengine
mim install mmcv
mim install mmdet  # optional
mim install mmpose  # optional
git clone https://github.com/kunli-cs/PCAN.git
cd ./PCAN/mmaction2
pip install -v -e .
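
After installation, a quick import check can confirm that the environment is complete. The snippet below is a minimal sketch, not part of the repository: the module names are assumptions (in particular, `mmaction` is assumed to be the import name provided by the editable install), and the helper `missing_modules` is hypothetical.

```python
import importlib.util

# Import names are assumptions; `mmaction` is assumed to be the module
# provided by `pip install -v -e .` in ./PCAN/mmaction2.
REQUIRED = ("torch", "torchvision", "mmengine", "mmcv", "mmaction")
OPTIONAL = ("mmdet", "mmpose")


def missing_modules(names):
    """Return the top-level module names that cannot be found."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    # Report missing packages without importing them (importing can be slow).
    print("missing required:", missing_modules(REQUIRED))
    print("missing optional:", missing_modules(OPTIONAL))
```

An empty list for the required modules suggests the steps above completed; it does not verify that the PyTorch build matches your CUDA version.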

Data Preparation

Download the MA-52 RGB data and Pose data.

pip install -U huggingface_hub
## use hf-mirror to accelerate
export HF_ENDPOINT=https://hf-mirror.com

## download RGB data
huggingface-cli download --repo-type dataset --resume-download kunli-cs/MA-52 --local-dir ./data/ma52
mkdir -p ./data/ma52/raw_videos && unzip ./data/ma52/train.zip -d ./data/ma52/raw_videos && rm ./data/ma52/train.zip
mkdir -p ./data/ma52/raw_videos && unzip ./data/ma52/val.zip -d ./data/ma52/raw_videos && rm ./data/ma52/val.zip
mkdir -p ./data/ma52/raw_videos && unzip ./data/ma52/test.zip -d ./data/ma52/raw_videos && rm ./data/ma52/test.zip

## download Pose data
huggingface-cli download --repo-type dataset --resume-download kunli-cs/MA-52_openpose_28kp --local-dir ./data/ma52/MA-52_openpose_28kp
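
Once the downloads finish, it can help to confirm that the expected directories exist before training. The following is a small sketch that only checks the two directories named in the commands above; the helper names `missing_dirs` and `count_files` are hypothetical, and the layout inside each directory is not assumed.

```python
from pathlib import Path

# Directories created by the download and unzip commands above.
EXPECTED_DIRS = ("raw_videos", "MA-52_openpose_28kp")


def missing_dirs(root="./data/ma52"):
    """Return the expected sub-directories that are absent under root."""
    root = Path(root)
    return [name for name in EXPECTED_DIRS if not (root / name).is_dir()]


def count_files(root="./data/ma52", subdir="raw_videos"):
    """Count regular files under root/subdir recursively (0 if it is absent)."""
    target = Path(root) / subdir
    if not target.is_dir():
        return 0
    return sum(1 for path in target.rglob("*") if path.is_file())


if __name__ == "__main__":
    print("missing directories:", missing_dirs())
    print("files under raw_videos:", count_files())
```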

Download the pre-trained weights and checkpoint.

huggingface-cli download --repo-type dataset --resume-download kunli-cs/PCAN_weights --local-dir ./checkpoints

Training

Step 1. Pretraining

We follow the RGBPoseConv3D recipe to pretrain PCAN.

First, train the RGB-only and Pose-only models on the MA-52 dataset. The pretrained checkpoints are then used to initialize the RGBPoseConv3D model.

You can use the provided IPython notebook to merge two pretrained models into a single rgbpose_conv3d_init.pth.

You can do this yourself or directly download and use the provided rgbpose_conv3d_init.pth.
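
To illustrate what merging two single-stream checkpoints involves, here is a minimal sketch that renames parameter keys so both streams can live in one state dict. It is not the notebook's code: the target key names (`backbone.rgb_path`, `backbone.pose_path`, `fc_rgb`, `fc_pose`) and the function `merge_state_dicts` are assumptions for illustration, and any weight-shape adjustments the real two-stream model may require are left out. The provided notebook remains the authoritative procedure.

```python
from collections import OrderedDict


def merge_state_dicts(rgb_state, pose_state):
    """Merge two single-stream state dicts into one two-stream dict.

    Key names are assumptions: backbone weights move under
    `backbone.<stream>_path.` and the classifier `fc_cls` becomes
    `fc_<stream>`. Values are copied by reference, unchanged.
    """
    merged = OrderedDict()
    for stream, state in (("rgb", rgb_state), ("pose", pose_state)):
        for key, value in state.items():
            if key.startswith("backbone."):
                new_key = key.replace("backbone.", f"backbone.{stream}_path.", 1)
            elif key.startswith("cls_head.fc_cls."):
                new_key = key.replace("fc_cls", f"fc_{stream}", 1)
            else:
                new_key = key  # left as-is; collisions are reported below
            if new_key in merged:
                raise ValueError(f"duplicate key after renaming: {new_key}")
            merged[new_key] = value
    return merged
```

With PyTorch, the inputs would typically come from `torch.load(path, map_location="cpu")["state_dict"]` and the result could be written with `torch.save({"state_dict": merged}, "rgbpose_conv3d_init.pth")`, assuming the checkpoints store their weights under a `state_dict` entry.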

Step 2. Training

python tools/train.py configs/skeleton/posec3d/rgbpose_conv3d/rgbpose_conv3d.py

Evaluation

## export the result.pkl on the test set.
python tools/test.py ./configs/skeleton/posec3d/rgbpose_conv3d/rgbpose_conv3d.py \
    pretrained/PCAN_checkpoint_7c4fba7c.pth --dump eval_ma52/result.pkl

## build the test set results `prediction.csv` in csv format.
python eval_ma52/eval_test.py

Please submit the test predictions ./eval_ma52/submission.zip to the Codabench evaluation server.
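
If you need to package the predictions by hand, the sketch below zips a CSV file into an archive. It rests on assumptions not confirmed here: that `prediction.csv` is written to `./eval_ma52/`, and that the server expects the CSV at the top level of the archive. `eval_ma52/eval_test.py` may already create `submission.zip`, in which case this step is unnecessary. The function `make_submission` is hypothetical.

```python
import zipfile
from pathlib import Path


def make_submission(csv_path="eval_ma52/prediction.csv",
                    zip_path="eval_ma52/submission.zip"):
    """Write csv_path into a new zip archive at zip_path and return its path.

    Both default paths are assumptions; adjust them to your setup.
    """
    csv_path = Path(csv_path)
    if not csv_path.is_file():
        raise FileNotFoundError(f"prediction file not found: {csv_path}")
    with zipfile.ZipFile(zip_path, "w", compression=zipfile.ZIP_DEFLATED) as archive:
        # Store the file at the archive root, without its parent directories.
        archive.write(csv_path, arcname=csv_path.name)
    return Path(zip_path)
```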

📞 Contact Authors

If you have any questions or suggestions, please do not hesitate to contact Kun Li.

🖊️ Citation

If you find this code useful, please consider citing:

@inproceedings{li2025prototypical,
  title={Prototypical calibrating ambiguous samples for micro-action recognition},
  author={Li, Kun and Guo, Dan and Chen, Guoliang and Fan, Chunxiao and Xu, Jingyuan and Wu, Zhiliang and Fan, Hehe and Wang, Meng},
  booktitle={Proceedings of the AAAI Conference on Artificial Intelligence},
  volume={39},
  number={5},
  pages={4815--4823},
  year={2025}
}

@article{guo2024benchmarking,
  title={Benchmarking Micro-action Recognition: Dataset, Methods, and Applications},
  author={Guo, Dan and Li, Kun and Hu, Bin and Zhang, Yan and Wang, Meng},
  journal={IEEE Transactions on Circuits and Systems for Video Technology},
  year={2024},
  volume={34},
  number={7},
  pages={6238--6252}
}

@misc{2020mmaction2,
    title={OpenMMLab's Next Generation Video Understanding Toolbox and Benchmark},
    author={MMAction2 Contributors},
    howpublished = {\url{https://github.com/open-mmlab/mmaction2}},
    year={2020}
}

@misc{duan2021revisiting,
      title={Revisiting Skeleton-based Action Recognition},
      author={Haodong Duan and Yue Zhao and Kai Chen and Dian Shao and Dahua Lin and Bo Dai},
      year={2021},
      eprint={2104.13586},
      archivePrefix={arXiv},
      primaryClass={cs.CV}
}

🤝 Acknowledgement

This code began with mmaction2. We thank the developers for doing most of the heavy lifting.