May 21, 2026
🍬 SUGAR: A Scalable Human-Video-Driven Generalizable Humanoid Loco-Manipulation Learning Framework
Tianshu Wu1* · Xiangqi Kong2* · Yue Chen1* · Qize Yu1 · Hang Ye1 · Jia Li1 · Yizhou Wang1 · Hao Dong1,✉
1 Center on Frontiers of Computing Studies, School of Computer Science, Peking University
2 School of Computer Science and Engineering, Beihang University
* Equal Contribution ✉ Corresponding Author
📌 Overview
SUGAR is a scalable humanoid loco-manipulation project built on the IsaacLab manager-based framework. Given third-person videos of human-object interactions, it learns generalizable, deployable autonomous policies that enable humanoid robots to solve challenging loco-manipulation tasks in the real world.
📌 Installation
- clone repository and create a conda environment
git clone https://github.com/tianshuwu/SUGAR.git
cd SUGAR
conda create -n sugar python=3.11
conda activate sugar
- install isaacsim
pip install "isaacsim[all,extscache]==5.1.0" --extra-index-url https://pypi.nvidia.com
- install isaaclab
cd ..
git clone git@github.com:isaac-sim/IsaacLab.git
cd IsaacLab
git checkout v2.3.0
pip install flatdict==4.0.1 --no-build-isolation
./isaaclab.sh --install rsl_rl
- install sugar
cd ../SUGAR
pip install -e source/sugar_rl
pip install -e source/sugar_il
# optional, for RTX 5090 GPUs: pip install torch==2.8.0 torchvision==0.23.0 torchaudio==2.8.0 --index-url https://download.pytorch.org/whl/cu128
- download data
pip install -U gdown
python -m gdown 1AIJWqS5rFGl5u2Qq6jCCTHKdh51SX2Sc # 400MB
unzip data.zip
rm data.zip
python -m gdown 1wXNAjNMrfV0e-d2pQ6m9dm4xrG5lSoyD # 50MB
unzip descriptions.zip
rm descriptions.zip
python -m gdown 1Uc2SPPVvTboEgw4Scyuz3TmzNKDg-dx- # 250MB
unzip demo_ckpts.zip
rm demo_ckpts.zip
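Once the steps above have finished, a quick check that the main dependencies are visible to the active conda environment can save a failed run later. The snippet below is a minimal sketch, not part of the repository: `check_module`, `check_modules`, and the `PYTHON_BIN` override are hypothetical names, and the module names passed in the usage line are assumed to be the import names of the packages installed above.

```shell
# Report whether a top-level Python module is visible to the active interpreter.
# importlib.util.find_spec only locates the module; it does not import or run it.
# PYTHON_BIN is a hypothetical override; it defaults to `python` from the conda env.
check_module() {
    "${PYTHON_BIN:-python}" -c '
import importlib.util, sys
sys.exit(0 if importlib.util.find_spec(sys.argv[1]) else 1)
' "$1"
}

# Print one status line per module name; return non-zero if any is missing.
check_modules() {
    local missing name
    missing=0
    for name in "$@"; do
        if check_module "$name"; then
            echo "ok       $name"
        else
            echo "missing  $name"
            missing=1
        fi
    done
    return "$missing"
}
```

Assuming those import names, it could be invoked as `check_modules isaacsim isaaclab rsl_rl torch`. The import names of `sugar_rl` and `sugar_il` are not stated here, so they are left out.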
📌 Run SUGAR
Inference
# Available tasks: CarryBox, KickBox, PushBox, SitChair, StandBottle, PickBottle
# Usage: bash inference.sh TASK_NAME [TRACKER_CKPT] [GENERATOR_CKPT]
bash inference.sh CarryBox
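Because the task name is a free-form argument, a typo such as `carrybox` is easy to make. The following is a minimal sketch of a guard that checks the name against the six tasks listed above before calling `inference.sh`. The names `SUGAR_TASKS`, `is_valid_task`, and `run_inference` are hypothetical and not part of the repository, and how `inference.sh` itself handles an unknown task is not assumed.

```shell
# Task names listed in this README.
SUGAR_TASKS="CarryBox KickBox PushBox SitChair StandBottle PickBottle"

# Hypothetical guard: succeed only if $1 is exactly one of the listed task names.
is_valid_task() {
    case "$1" in
        ""|*" "*) return 1 ;;   # reject empty names and names containing spaces
    esac
    case " $SUGAR_TASKS " in
        *" $1 "*) return 0 ;;
        *) return 1 ;;
    esac
}

# Hypothetical wrapper: validate the task, then forward all arguments
# (the task plus any optional checkpoint arguments) to inference.sh unchanged.
run_inference() {
    if ! is_valid_task "$1"; then
        echo "unknown task: '$1' (expected one of: $SUGAR_TASKS)" >&2
        return 2
    fi
    bash inference.sh "$@"
}
```

With these definitions sourced, `run_inference CarryBox` behaves like the command above, while `run_inference carrybox` stops with an error message before anything is launched.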
Train
# Available tasks: CarryBox, KickBox, PushBox, SitChair, StandBottle, PickBottle
# Usage: bash train.sh TASK_NAME [EXP_NAME]
bash train.sh CarryBox
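To run all six tasks in sequence, the single-task command can be wrapped in a loop. This is a minimal sketch, assuming the scripts are run one task at a time from the repository root. `for_each_task` is a hypothetical helper, not part of the repository.

```shell
# Hypothetical helper: run the given command once per task, appending the task
# name as the last argument. Stops at the first failure and returns its status.
for_each_task() {
    local task
    for task in CarryBox KickBox PushBox SitChair StandBottle PickBottle; do
        echo ">>> $* $task"
        "$@" "$task" || return $?
    done
}
```

For example, `for_each_task bash train.sh` would train each task in turn, and `for_each_task bash inference.sh` would do the same for inference. Since the task name is appended last, this form cannot pass the optional trailing arguments such as EXP_NAME.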
📌 TODO List
- Release inference demo and checkpoints
- Release the complete training pipeline, including refiner, tracker, and generator
- Release processed data for all six tasks
- Release the data processing pipeline from RGB-D human videos to training data
- Release the sim-to-sim pipeline
📌 Acknowledgements
We thank the authors of the following excellent open-source projects, on which this implementation is based:
- unitree_rl_lab & beyondmimic: serve as the codebase for sugar_rl.
- dexgraspvla: serves as the codebase for sugar_il.
📌 Citation
@article{wu2026sugar,
title={SUGAR: A Scalable Human-Video-Driven Generalizable Humanoid Loco-Manipulation Learning Framework},
author={Wu, Tianshu and Kong, Xiangqi and Chen, Yue and Yu, Qize and Ye, Hang and Li, Jia and Wang, Yizhou and Dong, Hao},
journal={arXiv preprint arXiv:2605.20373},
year={2026}
}