# InfiniteDance: Scalable 3D Dance Generation Towards in-the-wild Generalization
May 11, 2026 · View on GitHub
Status: Repository under active development; more data and features are coming soon.
## Overview
InfiniteDance is a comprehensive framework for scalable 3D music-to-dance generation, designed for high-quality generalization in the wild.
## Repository Structure
```
InfiniteDance
├── All_LargeDanceAR/       # Main generation module
├── DanceVQVAE/             # VQ-VAE for motion quantization (follows MoMask)
└── InfiniteDanceData/      # Dataset directory (should be placed at the repo root)
    ├── dance/              # Motion tokens (.npy)
    ├── music/              # Music features (.npy)
    ├── partition/          # Data splits (train/val/test)
    └── styles/             # Style metadata
```
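If you want a quick look at the data before wiring anything up, a minimal sketch like the following (not part of the repo; paths assume the root placement described above) prints the shape of a few `.npy` files:

```python
# Peek at a few dataset files to confirm the layout (illustrative only).
from pathlib import Path

import numpy as np

DATA_ROOT = Path("InfiniteDanceData")  # adjust if you keep the data elsewhere

for sub in ("dance", "music"):
    for f in sorted((DATA_ROOT / sub).rglob("*.npy"))[:3]:  # first few files
        arr = np.load(f)
        print(f"{f}: shape={arr.shape}, dtype={arr.dtype}")
```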
## Installation
```bash
# Clone the repository
git clone git@github.com:MotrixLab/InfiniteDance.git
cd InfiniteDance

# Install dependencies
pip install -r requirements.txt
```
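Before downloading multi-gigabyte checkpoints, it can be worth confirming that PyTorch sees your GPU. A quick optional check (assuming `requirements.txt` installs PyTorch, which training and inference rely on):

```python
# Optional environment check: verify PyTorch and CUDA before fetching weights.
import torch

print("torch:", torch.__version__)
print("CUDA available:", torch.cuda.is_available())
if torch.cuda.is_available():
    print("device:", torch.cuda.get_device_name(0))
```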
## Downloads (Data & Weights)
All datasets and pre-trained checkpoints are hosted on Hugging Face. After download, place them in the following locations (relative to the repo root unless you use absolute paths):
Hugging Face Checkpoints: InfiniteDance
1. Data Setup
Download the InfiniteDanceData folder and place it in the repo root:
```
# Path: <your path>/InfiniteDance_opensource/InfiniteDanceData
```
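You can also fetch the folder programmatically with `huggingface_hub` (a sketch; the `repo_id` below is an assumption, substitute the id from the link above):

```python
# Fetch only the InfiniteDanceData folder into the repo root.
from huggingface_hub import snapshot_download

snapshot_download(
    repo_id="MotrixLab/InfiniteDance",       # assumption: use the repo id linked above
    # repo_type="dataset",                   # uncomment if the data is a dataset repo
    local_dir=".",
    allow_patterns=["InfiniteDanceData/*"],
)
```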
2. Model Weights Setup
Please place the downloaded weights in their respective directories:

- VQ-VAE Weights: `All_LargeDanceAR/models/checkpoints/dance_vqvae.pth`
- InfiniteDance Fine-tuned Weights: `All_LargeDanceAR/output/exp_m2d_infinitedance/best_model_stage2.pt`
- Base LLM: The released checkpoint already contains the full LLaMA-3.2-1B backbone weights, so you do not need to download anything from Meta. We ship the architecture `config.json` in `All_LargeDanceAR/models/Llama3.2-1B/`.
After placement, the expected structure looks like this:
```
InfiniteDance
├── InfiniteDanceData/
│   ├── dance/
│   ├── music/
│   ├── partition/
│   └── styles/
└── All_LargeDanceAR/
    ├── models/
    │   ├── checkpoints/
    │   └── Llama3.2-1B/
    ├── RetrievalNet/
    │   └── checkpoints/
    └── output/
        └── exp_m2d_infinitedance/
```
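A small sketch (not repo code) to confirm everything is in place before running inference:

```python
# Verify the expected layout described above; run from the repo root.
from pathlib import Path

EXPECTED = [
    "InfiniteDanceData/dance",
    "InfiniteDanceData/music",
    "InfiniteDanceData/partition",
    "InfiniteDanceData/styles",
    "All_LargeDanceAR/models/checkpoints/dance_vqvae.pth",
    "All_LargeDanceAR/models/Llama3.2-1B/config.json",
    "All_LargeDanceAR/output/exp_m2d_infinitedance/best_model_stage2.pt",
]

missing = [p for p in EXPECTED if not Path(p).exists()]
print("All files in place." if not missing else f"Missing: {missing}")
```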
## Usage
1. Inference & Reproduction
The model takes per-frame MuQ embeddings as input ((T, 1024) float32 `.npy` arrays, ~30 frames per second; a quick shape check is sketched below). Two ways to provide them:

- Use the released test set: download `muq_features_test_infinitedance.tar.gz` from Hugging Face and extract it; this is what `infer.sh` defaults to.
- Use your own audio: convert wav/mp3 to MuQ embeddings first:

```bash
cd All_LargeDanceAR
python utils/extract_muq.py \
    --in_dir /path/to/your_audio_dir \
    --out_dir ../InfiniteDanceData/music/muq_features/my_songs
```

Then point `infer.sh` at the new directory:

```bash
MUSIC_PATH=../InfiniteDanceData/music/muq_features/my_songs bash infer.sh
```
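A minimal sketch of that shape check, assuming the (T, 1024) float32 format described above (the file path is illustrative):

```python
# Sanity-check one MuQ feature file before inference.
import numpy as np

feat = np.load("../InfiniteDanceData/music/muq_features/my_songs/song.npy")  # illustrative path
assert feat.ndim == 2 and feat.shape[1] == 1024, f"unexpected shape {feat.shape}"
assert feat.dtype == np.float32, f"unexpected dtype {feat.dtype}"
print(f"{feat.shape[0]} frames ~= {feat.shape[0] / 30:.1f} s of audio at ~30 fps")
```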
You can run the full inference pipeline (Generation → Post-processing → Visualization) using the provided shell script or by running the Python scripts manually.
Option A: Quick Start (Recommended)
`infer.sh` runs Inference → tokens-to-SMPL → optional rendering, with anti-collapse decoding enabled by default.
```bash
cd All_LargeDanceAR
DATA_ROOT=../InfiniteDanceData \
CHECKPOINT_PATH=./output/exp_m2d_infinitedance/best_model_stage2.pt \
bash infer.sh
```
Common overrides: `GPU_ID`, `PROCESSES_PER_GPU`, `STYLE`, `MUSIC_LENGTH`, `DANCE_LENGTH`, `TEMPERATURE`, `TOP_K`, `TOP_P`, `SEED`. See the comments at the top of `infer.sh` to tune the anti-collapse decoding.
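For intuition on what `TEMPERATURE`, `TOP_K`, and `TOP_P` control, here is a generic sketch of temperature/top-k/top-p sampling (not the repo's sampler, which additionally applies anti-collapse decoding):

```python
# Illustrative token sampler: temperature rescales logits, top-k keeps the
# k most likely tokens, top-p keeps the smallest prefix of that set whose
# cumulative probability stays within p.
import numpy as np

def sample_token(logits, temperature=0.8, top_k=15, top_p=0.95, rng=None):
    rng = rng or np.random.default_rng()
    logits = np.asarray(logits, dtype=np.float64) / temperature
    order = np.argsort(logits)[::-1][:top_k]            # top-k filter
    probs = np.exp(logits[order] - logits[order].max())
    probs /= probs.sum()
    keep = np.cumsum(probs) <= top_p                    # top-p (nucleus) filter
    keep[0] = True                                      # always keep the best token
    probs = probs[keep] / probs[keep].sum()
    return int(rng.choice(order[keep], p=probs))
```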
Option B: Manual Execution
```bash
cd All_LargeDanceAR
python infer_llama_infinitedance.py \
    --music_path ../InfiniteDanceData/music/muq_features/test_infinitedance \
    --checkpoint_path ./output/exp_m2d_infinitedance/best_model_stage2.pt \
    --vqvae_checkpoint_path ./models/checkpoints/dance_vqvae.pth \
    --output_dir ./infer_results \
    --style Popular --music_length 320 --dance_length 288 \
    --temperature 0.8 --top_k 15 --top_p 0.95 --seed 42
```
Visualization Pipeline: If you ran the manual inference above, proceed to visualize the results:
```bash
# 1. Convert tokens to SMPL joints (.npy)
python ./utils/tokens2smpl.py --npy_dir ./infer_results/dance

# 2. Render joints to video (.mp4)
python ./visualization/render_plot_npy.py --joints_dir ./infer_results/dance/npy/joints
```
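If you just want to eyeball one frame without the full renderer, a minimal matplotlib sketch works (assuming joints are saved as (T, J, 3) arrays; the file name is illustrative):

```python
# Scatter-plot the first frame of a generated joint sequence.
import matplotlib.pyplot as plt
import numpy as np

joints = np.load("./infer_results/dance/npy/joints/sample.npy")  # (T, J, 3), name illustrative
frame = joints[0]

fig = plt.figure()
ax = fig.add_subplot(projection="3d")
ax.scatter(frame[:, 0], frame[:, 1], frame[:, 2])
ax.set_title("Generated dance, frame 0")
plt.show()
```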
1.1 Metrics
metrics.sh runs FID-k / FID-m / Div-k / Div-m and the official Beat-Align score.
```bash
cd All_LargeDanceAR
bash metrics.sh <pred_root> [device_id]
# pred_root e.g. ./infer/dance_<TS>/dance/npy/joints
```
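For reference, the FID-style metrics follow the standard Fréchet distance between Gaussian fits of real and generated feature sets; a generic sketch (not the repo's implementation):

```python
# Frechet distance: ||mu1 - mu2||^2 + Tr(C1 + C2 - 2 (C1 C2)^(1/2))
import numpy as np
from scipy import linalg

def frechet_distance(feats_a, feats_b):
    mu_a, mu_b = feats_a.mean(axis=0), feats_b.mean(axis=0)
    cov_a = np.cov(feats_a, rowvar=False)
    cov_b = np.cov(feats_b, rowvar=False)
    covmean = linalg.sqrtm(cov_a @ cov_b)
    if np.iscomplexobj(covmean):
        covmean = covmean.real        # drop tiny imaginary parts from sqrtm
    diff = mu_a - mu_b
    return float(diff @ diff + np.trace(cov_a + cov_b - 2.0 * covmean))
```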
2. Training
Two-stage training (stage 1: bridges + adapters, LLM frozen; stage 2: full fine-tune) is run via DDP. Edit `train.sh` (or pass env vars) and launch:
```bash
cd All_LargeDanceAR

# Default: 4 GPUs, bf16, with regularization (weight_decay=0.10,
# llama_dropout=0.15, cond_drop_prob=0.15)
DATA_ROOT=../InfiniteDanceData bash train.sh

# Other GPU counts
GPUS=0,1 WS=2 DATA_ROOT=../InfiniteDanceData bash train.sh

# Warm-start from a previous stage-2 checkpoint
PREV_CKPT=./output/m2d_llama/<run>/epoch_X_stage2.pt bash train.sh
```
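The gist of the stage switch, as a PyTorch sketch with hypothetical module names (`bridge`, `adapter`), not the repo's trainer:

```python
# Stage 1: freeze the LLM, train only bridges + adapters.
# Stage 2: unfreeze everything for a full fine-tune.
def set_stage(model, stage: int) -> None:
    for p in model.parameters():
        p.requires_grad = (stage == 2)
    if stage == 1:
        for name, p in model.named_parameters():
            if "bridge" in name or "adapter" in name:  # hypothetical names
                p.requires_grad = True
```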
## Citation
If you use this code or dataset in your research, please cite our work:
```bibtex
@misc{li2026infinitedancescalable3ddance,
    title={InfiniteDance: Scalable 3D Dance Generation Towards in-the-wild Generalization},
    author={Ronghui Li and Zhongyuan Hu and Li Siyao and Youliang Zhang and Haozhe Xie and Mingyuan Zhang and Jie Guo and Xiu Li and Ziwei Liu},
    year={2026},
    eprint={2603.13375},
    archivePrefix={arXiv},
    primaryClass={cs.CV},
    url={https://arxiv.org/abs/2603.13375},
}
```