LeRobot Tutorial with MuJoCo
July 6, 2025 ยท View on GitHub
This repository contains minimal examples for collecting demonstration data and training (or fine-tuning) vision language action models on custom datasets.
Table of Contents
- :pencil: Installation
- :mega: Updates and Plans
- :video_game: 1. Collect Demonstration Data
- :movie_camera: 2. Playback Your Data
- :fire: 3. Train Action-Chunking-Transformer (ACT)
- :pushpin: 4. Deploy ACT
- :floppy_disk: 5-6. Language conditioned Environment.
- ๐ค Models and Dataset
- :zap:7.Train and deploy pi_0
- :bulb:8.Train and deploy smolvla
- :pencil: Acknowledgements
Installation
We have tested our environment on python 3.10.
I do not recommend installing lerobot package with pip install lerobot. This causes errors.
Install mujoco package dependencies and lerobot
pip install -r requirements.txt
Make sure your mujoco version is 3.1.6.
Unzip the asset
cd asset/objaverse
unzip plate_11.zip
Updates & Plans
:white_check_mark: Viewer Update.
:white_check_mark: Add different mugs, plates for different language instructions.
:white_check_mark: Add pi_0 training and inference.
:white_check_mark: Add SmolVLA
1. Collect Demonstration Data
Collect demonstration data for the given environment. The task is to pick a mug and place it on the plate. The environment recognizes the success if the mug is on the plate, gthe ripper opened, and the end-effector positioned above the mug.
Use WASD for the xy plane, RF for the z-axis, QE for tilt, and ARROWs for the rest of rthe otations.
SPACEBAR will change your gripper's state, and Z key will reset your environment with discarding the current episode data.
For overlayed images,
- Top Right: Agent View
- Bottom Right: Egocentric View
- Top Left: Left Side View
- Bottom Left: Top View
The dataset is contained as follows:
fps = 20,
features={
"observation.image": {
"dtype": "image",
"shape": (256, 256, 3),
"names": ["height", "width", "channels"],
},
"observation.wrist_image": {
"dtype": "image",
"shape": (256, 256, 3),
"names": ["height", "width", "channel"],
},
"observation.state": {
"dtype": "float32",
"shape": (6,),
"names": ["state"], # x, y, z, roll, pitch, yaw
},
"action": {
"dtype": "float32",
"shape": (7,),
"names": ["action"], # 6 joint angles and 1 gripper
},
"obj_init": {
"dtype": "float32",
"shape": (6,),
"names": ["obj_init"], # just the initial position of the object. Not used in training.
},
},
This will make the dataset on './demo_data' folder, which will look like this,
.
โโโ data
โ โโโ chunk-000
โ โ โโโ episode_000000.parquet
โ โ โโโ ...
โโโ meta
โ โโโ episodes.jsonl
โ โโโ info.json
โ โโโ stats.json
โ โโโ tasks.jsonl
โโโ
For convenience, we have added Example Data to the repository.
2. Playback Your Data

Visualize your action based on the reconstructed simulation scene.
The main simulation is replaying the action.
The overlayed images on the top right and bottom right are from the dataset.
3. Train Action-Chunking-Transformer (ACT)
Run 3.train.ipynb
This takes around 30~60 mins.
Train the ACT model on your custom dataset. In this example, we set chunk_size as 10.
The trained checkpoint will be saved in './ckpt/act_y' folder.
To evaluate the policy on the dataset, you can calculate the error between ground-truth actions from the dataset.
PicklingError: Can't pickle at 0x131d1bd00>: attribute lookup on __main__ failed
If you have a pickling error,
PicklingError: Can't pickle <function <lambda> at 0x131d1bd00>: attribute lookup <lambda> on __main__ failed
Please set your num_workers as 0, like,
dataloader = torch.utils.data.DataLoader(
dataset,
num_workers=0, # 4
batch_size=64,
shuffle=True,
pin_memory=device.type != "cpu",
drop_last=True,
)
4. Deploy your Policy
Run 4.deploy.ipynb
You can download checkpoint from google drive if you don't have gpu to train your model.

Deploy trained policy in simulation.
5-6. Collect data and visualize in lanugage conditioned environment
- 5.language_env.ipynb: Collect Dataset with keyboard teleoperation. The command is same as first environment.
- 6.visualize_data.ipynb: Visualize Collected Data
Environment
Data

Models and Dataset ๐ค
| Model ๐ค | Dataset ๐ค |
|---|---|
| pi_0 finetuned | dataset |
| smolvla finetuned | same dataset |
7. Train and Deploy pi_0
- train_model.py: Training script
- pi0_omy.yaml: Training configuration file
- 7.pi0.ipynb: Policy deployment
Training Scripts
python train_model.py --config_path pi0_omy.yaml
Rollout of trained policy

Train logs
Configuration File
dataset:
repo_id: omy_pnp_language # Repository ID
root: ./demo_data_language # Your root for data file!
policy:
type : pi0
chunk_size: 5
n_action_steps: 5
save_checkpoint: true
output_dir: ./ckpt/pi0_omy <- Save directory
batch_size: 16
job_name : pi0_omy
resume: false
seed : 42
num_workers: 8
steps: 20_000
eval_freq: -1 # No evaluation
log_freq: 50
save_checkpoint: true
save_freq: 10_000
use_policy_training_preset: true
wandb:
enable: true
project: pi0_omy
entity: <your_wandb_entity>
disable_artifact: true
8. Train and Deploy Smolvla
- train_model.py: Training script
- smolvla_omy.yaml: Training configuration file
- 8.smolvla.ipynb: Policy deployment
Training Scripts
python train_model.py --config_path smolvla_omy.yaml
Rollout of trained policy

Train logs
Configuration File
dataset:
repo_id: omy_pnp_language # Repository ID
root: ./demo_data_language # Your root for data file!
policy:
type : smolvla
chunk_size: 5
n_action_steps: 5
device: cuda
save_checkpoint: true
output_dir: ./ckpt/smolvla_omy # Save directory
batch_size: 16
job_name : smolvla_omy
resume: false
seed : 42
num_workers: 8
steps: 20_000
eval_freq: -1 # No evaluation
log_freq: 50
save_checkpoint: true
save_freq: 10_000
use_policy_training_preset: true
wandb:
enable: true
project: smolvla_omy
entity: <your_wandb_entity>
disable_artifact: true
Acknowledgements
- The asset for the robotis-omy manipulator is from robotis_mujoco_menagerie.
- The MuJoco Parser Class is modified from yet-another-mujoco-tutorial.
- We refer to original tutorials from lerobot examples.
- The assets for plate and mug is from Objaverse.