Language-guided Human Motion Synthesis with Atomic Actions
June 28, 2024

Yuanhao Zhai, Mingzhen Huang, Tianyu Luan, Lu Dong, Ifeoma Nwogu, Siwei Lyu, David Doermann, and Junsong Yuan
University at Buffalo
ACM MM 2023
This repo contains our PyTorch implementation for the text-to-motion synthesis task.
1. Environment setup
Install ffmpeg
sudo apt update
sudo apt install ffmpeg
Setup conda environment
pip install -r requirements.txt
python -m spacy download en_core_web_sm
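The repo does not pin a specific conda environment, so the setup below is a sketch: the environment name atom and the Python version are our assumptions, not values specified by the repo. Create and activate the environment first, then run the pip and spaCy commands above inside it.

```shell
# Hypothetical environment name and Python version -- adjust as needed.
conda create -n atom python=3.8 -y
conda activate atom
# Then install the dependencies listed above:
pip install -r requirements.txt
python -m spacy download en_core_web_sm
```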
Download dependency for text-to-motion synthesis
bash prepare/download_smpl_files.sh
bash prepare/download_glove.sh
bash prepare/download_t2m_evaluators.sh
2. Dataset preparation
Please follow MDM to set up the dataset.
3. Training
For training on the HumanML3D dataset, run the following command.
python -m train.train_cvae --save_dir save/humanml --overwrite --dataset humanml --eval_during_training --kld_w 1e-2 --att_spa_w 1e-2 --codebook_norm_w 1e-2 --mask_ratio 0.5 --mask_sched linear
For the KIT dataset, add the --dataset kit flag and change --save_dir accordingly.
We also provide an option to use a stronger CLIP encoder by adding the --use_transformers_clip flag, which may improve performance.
4. Evaluation
We provide our pretrained checkpoints here. To reproduce our results, run the following commands. For HumanML3D:
python -m eval.eval_humanml_cvae --model_path {path-to-humanml-pretrained-model} --dataset humanml --eval_mode mm_short
For KIT:
python -m eval.eval_humanml_cvae --model_path {path-to-kit-pretrained-model} --dataset kit --num_code 512 --eval_mode mm_short
5. Text-to-motion synthesis
Run the following command to generate a motion from a text prompt. The output contains a stick-figure animation of the generated motion and a .npy file with the xyz coordinates of the joints.
python -m sample.generate_cvae --model_path {checkpoint-path} --text_prompt {text prompt}
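The saved .npy file can be inspected directly with NumPy. A minimal sketch, assuming the array holds per-frame xyz joint coordinates in a (num_frames, num_joints, 3) layout (the exact layout may differ; here we fabricate a dummy array so the example is self-contained):

```python
import numpy as np

# Hypothetical example: fabricate a dummy motion in the assumed
# (num_frames, num_joints, 3) layout, save it, and load it back.
# 120 frames, 22 joints (the HumanML3D skeleton has 22 joints).
motion = np.zeros((120, 22, 3), dtype=np.float32)
np.save("results.npy", motion)

joints = np.load("results.npy")
print(joints.shape)  # (120, 22, 3)

# e.g. extract the trajectory of the root joint over time:
root_traj = joints[:, 0, :]
print(root_traj.shape)  # (120, 3)
```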
6. Rendering
Use the following command to create the .obj SMPL mesh file.
python -m visualize.render_mesh --input_path {path-to-mp4-stick-animation-file}
We also provide a Blender script in render/render.blend to render the generated SMPL mesh.
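If you want to post-process the exported meshes yourself, the vertex positions can be read from the Wavefront .obj files with a few lines of Python. This is a generic sketch, not part of the repo; it fabricates a tiny .obj file so it is self-contained, and only parses "v" (vertex position) lines:

```python
# Write a tiny stand-in .obj file (one triangle) to illustrate parsing.
obj_text = """\
v 0.0 0.0 0.0
v 1.0 0.0 0.0
v 0.0 1.0 0.0
f 1 2 3
"""
with open("frame000.obj", "w") as f:
    f.write(obj_text)

# Collect the xyz position of every vertex.
vertices = []
with open("frame000.obj") as f:
    for line in f:
        if line.startswith("v "):  # vertex position lines start with "v "
            vertices.append([float(x) for x in line.split()[1:4]])

print(len(vertices))  # 3
```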
Citation
If you find our project helpful, please cite our work:
@inproceedings{zhai2023language,
title={Language-guided Human Motion Synthesis with Atomic Actions},
author={Zhai, Yuanhao and Huang, Mingzhen and Luan, Tianyu and Dong, Lu and Nwogu, Ifeoma and Lyu, Siwei and Doermann, David and Yuan, Junsong},
booktitle={Proceedings of the 31st ACM International Conference on Multimedia},
pages={5262--5271},
year={2023}
}
Acknowledgement
This project is built upon MDM: Human Motion Diffusion Models. Thanks for their great work!