README.md

November 7, 2023

SemanticBoost: Elevating Motion Generation with Augmented Textual Cues

Xin He Shaoli Huang^* Xiaohang Zhan

Chao Weng Ying Shan

💡 Highlights

The SemanticBoost framework consists of an optimized diffusion model, CAMD, and a Semantic Enhancement Module that describes specific body parts explicitly. With these two modules, SemanticBoost can:

Synthesize smoother and more stable motion sequences.

Understand longer and more complex sentences.

Control specific body parts precisely.

⚙ Applications

This repo implements the following functions:

Export 3D joints

Export SMPL representation

Render with TADA 3D roles

📰 Introduction of SemanticBoost

Optimized Diffusion Model

Semantic Enhancement Module

Comparison with SOTA

📢 News

[2023/10/20] Release pretrained weights and inference process

[2023/10/27] Release new pretrained weights and TensorRT speedup

[2023/11/01] Release paper on arXiv

⚡️ Quick Start

Environment and Weights

1. Dependencies

# create a new conda environment
conda create -n boost python==3.9.15
# install dependencies
conda activate boost
pip install -r requirements.txt

2. Linux Package - Debian (EGL package for rendering)

sudo apt-get install freeglut3-dev

3. Pretrained Weights

bash scripts/prepare.sh

4. (Optional) TADA Support

Download Choice 1

Download characters from

https://drive.google.com/file/d/1rbkIpRmvPaVD9AJeCxWqBBYHkRIwrNmC/view

Download Init Pose from

https://tada.is.tue.mpg.de/download.php

Save the two zip files in the repo root directory, then run:

bash scripts/tada_process.sh

Download Choice 2

bash scripts/tada_goole.sh

5. (Optional) TensorRT Inference

Download the TensorRT SDK. We tested with TensorRT 8.6.0 and PyTorch 2.0.1.

https://developer.nvidia.com/nvidia-tensorrt-8x-download

Set environment

export LD_LIBRARY_PATH=/data/TensorRT-8.6.0.12/lib:$LD_LIBRARY_PATH
export PATH=/data/TensorRT-8.6.0.12/bin:$PATH

Install the Python API

pip install /data/TensorRT-8.6.0.12/python/tensorrt-8.6.0-cp39-none-linux_x86_64.whl

Export TensorRT engine

bash scripts/quanti.sh
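If the export step fails, one quick thing to check is whether the `boost` environment can actually see the TensorRT Python package and whether the interpreter matches the `cp39` wheel installed above. The snippet below is a minimal sketch using only the standard library; the helper names `module_available` and `wheel_tag` are hypothetical and are not part of this repo.

```python
import importlib.util
import sys


def module_available(name: str) -> bool:
    """Return True if the top-level module `name` can be found for import."""
    return importlib.util.find_spec(name) is not None


def wheel_tag() -> str:
    """CPython wheel tag of the running interpreter, e.g. 'cp39' on Python 3.9."""
    return f"cp{sys.version_info.major}{sys.version_info.minor}"


# The wheel in the install step above is built for cp39.
print("interpreter wheel tag:", wheel_tag())

# Report whether the packages mentioned in this step are visible.
for name in ("torch", "tensorrt"):
    status = "found" if module_available(name) else "missing"
    print(f"{name}: {status}")
```

If `tensorrt` is reported as missing, the wheel was probably installed into a different environment or for a different Python version.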

6. (Optional) Download Blender 2.93 to export FBX files

bash scripts/blender_prepare.sh

👀 Demo

Webui or HuggingFace

Run the following script to launch the web UI, then visit 0.0.0.0:7860 in a browser.

python app.py

Inference and Visualization

Parameters for inference

'''
--prompt      Input text description for generation
--mode        Which model to use for generating motion sequences
--render      Render mode [3dslow, 3dfast, joints]
--size        Resolution of the output video
--role        TADA role name; default is None
--length      Total number of frames in the output video (fps=20)
-f --follow   Whether the camera follows the motion during rendering
-e --export   Whether to export an FBX file, which takes more time
'''
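To make the flag semantics concrete, the parameter list above can be sketched as an `argparse` parser. This is a hypothetical reconstruction, not the parser in `inference.py`: the types and the defaults for `--mode`, `--render`, `--size`, and `--length` are assumptions, and only the `None` default for `--role` comes from the list itself.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser mirroring the documented flags; inference.py may differ."""
    p = argparse.ArgumentParser(description="Text-to-motion inference (sketch)")
    p.add_argument("--prompt", type=str, required=True,
                   help="input text description for generation")
    p.add_argument("--mode", type=str, default="ncamd",  # assumed default
                   help="which model to use for generating motion sequences")
    p.add_argument("--render", choices=["3dslow", "3dfast", "joints"],
                   default="3dslow",  # assumed default
                   help="render mode")
    p.add_argument("--size", type=int, default=1024,  # assumed default
                   help="resolution of the output video")
    p.add_argument("--role", type=str, default=None,
                   help="TADA role name")
    p.add_argument("--length", type=str, default="120",  # assumed default
                   help="total frames of the output video (fps=20)")
    p.add_argument("-f", "--follow", action="store_true",
                   help="camera follows the motion during rendering")
    p.add_argument("-e", "--export", action="store_true",
                   help="also export an FBX file (slower)")
    return p


# Parse an explicit argument list instead of reading sys.argv.
args = build_parser().parse_args([
    "--prompt", "A person walks forward and sits down on the chair.",
    "--length", "120", "--size", "1024", "--render", "3dslow", "-f", "-e",
])
print(args.render, args.size, args.follow, args.export, args.role)
```

`--length` is kept as a string in this sketch because the examples in this README pass it quoted, and the long-motion example passes several values separated by `|`.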

General Visualization

python inference.py --prompt "A person walks forward and sits down on the chair." --length "120" --mode ncamd --size 1024 --render "3dslow" -f -e

TADA Visualization

# For more TADA roles, please refer to TADA-100
python inference.py --prompt "A person walks forward and sits down on the chair." --mode ncamd --size 1024 --render "3dslow" --role "Iron Man" --length "120"

Long Motion Synthesis by DoubleTake

python inference.py --prompt "A person walks forward.| A person dances in place.| A person walks backwards." --mode ncamd --size 1024 --render "3dslow" --length "120|100|120"
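In the command above, `--prompt` and `--length` each carry one entry per segment, separated by `|`, so the two lists need the same number of entries. The sketch below shows one way such arguments could be paired up and converted to durations at the documented 20 fps. The function names are hypothetical and this is not the repo's own parsing code; how DoubleTake handles the transitions between segments is not modeled here.

```python
from typing import List, Tuple

FPS = 20  # from the parameter list: fps=20


def split_segments(prompt: str, length: str) -> List[Tuple[str, int]]:
    """Pair each '|'-separated prompt with its '|'-separated frame count."""
    prompts = [p.strip() for p in prompt.split("|")]
    frames = [int(n) for n in length.split("|")]
    if len(prompts) != len(frames):
        raise ValueError(
            f"{len(prompts)} prompts but {len(frames)} lengths; counts must match"
        )
    return list(zip(prompts, frames))


def duration_seconds(frames: int, fps: int = FPS) -> float:
    """Convert a frame count to seconds at the given frame rate."""
    return frames / fps


segments = split_segments(
    "A person walks forward.| A person dances in place.| A person walks backwards.",
    "120|100|120",
)
for text, n in segments:
    print(f"{n:4d} frames = {duration_seconds(n):.1f} s  {text}")

# Sum of the requested frames only; transitions are ignored in this sketch.
total = sum(n for _, n in segments)
print("requested total:", duration_seconds(total), "s")  # 17.0 s
```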

📖 Citation

If you find our code or paper helpful, please consider citing:

@misc{he2023semanticboost,
  title={SemanticBoost: Elevating Motion Generation with Augmented Textual Cues},
  author={Xin He and Shaoli Huang and Xiaohang Zhan and Chao Wen and Ying Shan},
  year={2023},
  eprint={2310.20323},
  archivePrefix={arXiv},
  primaryClass={cs.CV}
}

Acknowledgments

Thanks to MDM, T2M-GPT, MLD, HumanML3D, joints2smpl, and TADA; our code partially borrows from them.