May 21, 2026

🍬 SUGAR: A Scalable Human-Video-Driven Generalizable Humanoid Loco-Manipulation Learning Framework

Tianshu Wu1* · Xiangqi Kong2* · Yue Chen1* · Qize Yu1 · Hang Ye1 · Jia Li1 · Yizhou Wang1 · Hao Dong1,✉

1 Center on Frontiers of Computing Studies, School of Computer Science, Peking University
2 School of Computer Science and Engineering, Beihang University

* Equal Contribution     ✉ Corresponding Author

📌 Overview

SUGAR is a scalable humanoid loco-manipulation project built on the IsaacLab manager-based framework. Given third-person videos of human-object interactions, it learns generalizable, deployable, autonomous humanoid policies that enable humanoid robots to solve challenging loco-manipulation tasks in the real world.

📌 Installation

  1. Clone the repository and create a conda environment
git clone https://github.com/tianshuwu/SUGAR.git
cd SUGAR
conda create -n sugar python=3.11
conda activate sugar
  2. Install Isaac Sim
pip install "isaacsim[all,extscache]==5.1.0" --extra-index-url https://pypi.nvidia.com
  3. Install Isaac Lab
cd ..
git clone https://github.com/isaac-sim/IsaacLab.git
cd IsaacLab
git checkout v2.3.0
pip install flatdict==4.0.1 --no-build-isolation
./isaaclab.sh --install rsl_rl
  4. Install SUGAR
cd ../SUGAR
pip install -e source/sugar_rl
pip install -e source/sugar_il
# For RTX 5090 GPUs, also run: pip install torch==2.8.0 torchvision==0.23.0 torchaudio==2.8.0 --index-url https://download.pytorch.org/whl/cu128
  5. Download data
pip install -U gdown
python -m gdown 1AIJWqS5rFGl5u2Qq6jCCTHKdh51SX2Sc     # 400MB
unzip data.zip
rm data.zip
python -m gdown 1wXNAjNMrfV0e-d2pQ6m9dm4xrG5lSoyD     # 50MB
unzip descriptions.zip
rm descriptions.zip
python -m gdown 1Uc2SPPVvTboEgw4Scyuz3TmzNKDg-dx-     # 250MB
unzip demo_ckpts.zip
rm demo_ckpts.zip
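
The three download, unzip, and delete sequences in the data step follow the same pattern, so they can be folded into one helper that stops at the first failure. This is a minimal sketch and not part of the SUGAR repository: `fetch_archive` is a hypothetical name, it assumes `gdown` and `unzip` are installed, and it passes gdown's `-O` option so each archive is saved under the name used in the steps above.

```shell
# Hypothetical helper (not part of SUGAR): download one archive, unpack it, delete it.
#   $1 = Google Drive file ID, $2 = local archive name
fetch_archive() {
  python -m gdown "$1" -O "$2" || return 1   # download under a known file name
  unzip "$2" || return 1                      # unpack into the current directory
  rm "$2"                                     # remove the archive once unpacked
}

# Usage, with the file IDs and archive names from the steps above:
#   fetch_archive 1AIJWqS5rFGl5u2Qq6jCCTHKdh51SX2Sc data.zip
#   fetch_archive 1wXNAjNMrfV0e-d2pQ6m9dm4xrG5lSoyD descriptions.zip
#   fetch_archive 1Uc2SPPVvTboEgw4Scyuz3TmzNKDg-dx- demo_ckpts.zip
```

Because the archive is deleted only after `unzip` succeeds, a failed download or extraction leaves the file in place for inspection.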

📌 Run SUGAR

Inference

# Available tasks: CarryBox, KickBox, PushBox, SitChair, StandBottle, PickBottle
# Usage: bash inference.sh TASK_NAME [TRACKER_CKPT] [GENERATOR_CKPT]   (checkpoints are optional)
bash inference.sh CarryBox
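
Since the task name is passed as a free-form argument, a typo such as `carrybox` would only surface once the script is already running. The sketch below checks the name against the six tasks listed above before launching. It assumes it is run from the repository root; `is_valid_task` and `run_inference` are hypothetical helpers, not part of SUGAR, and how `inference.sh` itself handles unknown names is not specified here.

```shell
# Task names as listed in this README.
SUGAR_TASKS="CarryBox KickBox PushBox SitChair StandBottle PickBottle"

# Hypothetical helper: succeeds only if $1 exactly matches a listed task name.
# (The loop variable t is global; POSIX sh has no portable local variables.)
is_valid_task() {
  for t in $SUGAR_TASKS; do
    [ "$t" = "${1:-}" ] && return 0
  done
  return 1
}

# Hypothetical wrapper: validate the task name, then forward every argument
# (the task plus any optional checkpoint paths) to inference.sh unchanged.
run_inference() {
  if ! is_valid_task "${1:-}"; then
    echo "Unknown task: ${1:-} (expected one of: $SUGAR_TASKS)" >&2
    return 2
  fi
  bash inference.sh "$@"
}

# Usage:
#   run_inference CarryBox
```

The same check could wrap `train.sh`, since both scripts take the task name as their first argument.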

Train

# Available tasks: CarryBox, KickBox, PushBox, SitChair, StandBottle, PickBottle
# Usage: bash train.sh TASK_NAME [EXP_NAME]   (experiment name is optional)
bash train.sh CarryBox

📌 TODO List

  • Release inference demo and checkpoints
  • Release the complete training pipeline, including refiner, tracker, and generator
  • Release processed data for all six tasks
  • Release the data processing pipeline from RGB-D human videos to training data
  • Release the sim-to-sim pipeline

📌 Acknowledgements

This code implementation is based on these excellent open-source projects, thanks to:

📌 Citation

@article{wu2026sugar,
  title={SUGAR: A Scalable Human-Video-Driven Generalizable Humanoid Loco-Manipulation Learning Framework},
  author={Wu, Tianshu and Kong, Xiangqi and Chen, Yue and Yu, Qize and Ye, Hang and Li, Jia and Wang, Yizhou and Dong, Hao},
  journal={arXiv preprint arXiv:2605.20373},
  year={2026}
}