LeRobot Tutorial with MuJoCo

July 6, 2025

This repository contains minimal examples for collecting demonstration data and training (or fine-tuning) vision-language-action (VLA) models on custom datasets.

Installation

We have tested our environment on Python 3.10.

We do not recommend installing the lerobot package with pip install lerobot, as this causes errors.

Install the MuJoCo package dependencies and lerobot:

pip install -r requirements.txt

Make sure your mujoco version is 3.1.6.
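
The version requirement above can be checked from Python. The snippet below is a minimal sketch that reads the installed version through the standard library; the helper names `installed_version` and `check_mujoco` are illustrative and not part of this repository.

```python
from importlib.metadata import PackageNotFoundError, version

EXPECTED_MUJOCO = "3.1.6"  # version named in this README


def installed_version(package):
    """Return the installed version string, or None if the package is absent."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


def check_mujoco(expected=EXPECTED_MUJOCO):
    """Return a one-line status message about the installed mujoco version."""
    found = installed_version("mujoco")
    if found is None:
        return "mujoco is not installed"
    if found != expected:
        return f"mujoco {found} found, but {expected} is expected"
    return f"mujoco {found} matches the expected version"


print(check_mujoco())
```

Running this right after `pip install -r requirements.txt` shows whether another package pulled in a different mujoco release.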

Unzip the asset

cd asset/objaverse
unzip plate_11.zip

Updates & Plans

✅ Viewer update.

✅ Add different mugs and plates for different language instructions.

✅ Add pi_0 training and inference.

✅ Add SmolVLA.

1. Collect Demonstration Data

Run 1.collect_data.ipynb

Collect demonstration data for the given environment. The task is to pick up a mug and place it on the plate. The environment registers success when the mug is on the plate, the gripper is open, and the end-effector is positioned above the mug.
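
The three success conditions can be sketched as a single predicate. This is only an illustration: the function name `is_success`, the position arguments, and the tolerances `XY_TOL` and `Z_TOL` are assumptions, and the notebook's actual check may use different quantities and thresholds.

```python
import math

# Hypothetical tolerances in metres; the notebook's real thresholds may differ.
XY_TOL = 0.05
Z_TOL = 0.03


def is_success(mug_pos, plate_pos, ee_pos, gripper_open):
    """Return True when all three conditions from the text hold."""
    # 1. Mug on the plate: close in the xy plane and near the plate height.
    mug_on_plate = (
        math.dist(mug_pos[:2], plate_pos[:2]) < XY_TOL
        and abs(mug_pos[2] - plate_pos[2]) < Z_TOL
    )
    # 2. Gripper open: passed in directly as a boolean.
    # 3. End-effector above the mug: compare heights only.
    ee_above_mug = ee_pos[2] > mug_pos[2]
    return bool(mug_on_plate and gripper_open and ee_above_mug)


# Placeholder positions (x, y, z) chosen only to exercise the function.
print(is_success((0.30, 0.10, 0.82), (0.31, 0.10, 0.81), (0.30, 0.10, 1.00), True))
```

Requiring the gripper to be open prevents an episode from counting as successful while the mug is still being held over the plate.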

Use WASD to move in the xy plane, R/F for the z-axis, Q/E for tilt, and the arrow keys for the remaining rotations.

SPACEBAR toggles the gripper state, and the Z key resets the environment, discarding the current episode data.

The overlaid images show the following views:

  • Top Right: Agent View
  • Bottom Right: Egocentric View
  • Top Left: Left Side View
  • Bottom Left: Top View

The dataset is structured as follows:

fps = 20,
features={
    "observation.image": {
        "dtype": "image",
        "shape": (256, 256, 3),
        "names": ["height", "width", "channels"],
    },
    "observation.wrist_image": {
        "dtype": "image",
        "shape": (256, 256, 3),
        "names": ["height", "width", "channel"],
    },
    "observation.state": {
        "dtype": "float32",
        "shape": (6,),
        "names": ["state"], # x, y, z, roll, pitch, yaw
    },
    "action": {
        "dtype": "float32",
        "shape": (7,),
        "names": ["action"], # 6 joint angles and 1 gripper
    },
    "obj_init": {
        "dtype": "float32",
        "shape": (6,),
        "names": ["obj_init"], # just the initial position of the object. Not used in training.
    },
},
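
Before recording, it can help to confirm that each frame matches the shapes declared above. The following is a minimal sketch using NumPy; `FEATURE_SHAPES` copies the shapes from the feature definition, while the helper `shape_errors` and the dummy frame (including the uint8 image dtype) are assumptions made for illustration.

```python
import numpy as np

# Shapes copied from the feature definition above.
FEATURE_SHAPES = {
    "observation.image": (256, 256, 3),
    "observation.wrist_image": (256, 256, 3),
    "observation.state": (6,),
    "action": (7,),
    "obj_init": (6,),
}


def shape_errors(frame):
    """Return a list of mismatches between a frame dict and the schema."""
    errors = []
    for key, shape in FEATURE_SHAPES.items():
        if key not in frame:
            errors.append(f"missing key: {key}")
        elif np.shape(frame[key]) != shape:
            errors.append(f"{key}: expected {shape}, got {np.shape(frame[key])}")
    return errors


# A dummy frame with placeholder values (uint8 images are an assumption).
frame = {
    "observation.image": np.zeros((256, 256, 3), dtype=np.uint8),
    "observation.wrist_image": np.zeros((256, 256, 3), dtype=np.uint8),
    "observation.state": np.zeros(6, dtype=np.float32),
    "action": np.zeros(7, dtype=np.float32),
    "obj_init": np.zeros(6, dtype=np.float32),
}
print(shape_errors(frame))
```

An empty list means the frame is consistent with the declared shapes; a 6-dimensional action, for example, would be reported as a mismatch.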

This creates the dataset in the './demo_data' folder, with the following layout:

.
├── data
│   ├── chunk-000
│   │   ├── episode_000000.parquet
│   │   └── ...
├── meta
│   ├── episodes.jsonl
│   ├── info.json
│   ├── stats.json
│   └── tasks.jsonl
└── 

For convenience, we have added Example Data to the repository.
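
A quick way to confirm that a recording session produced the layout above is to count the episode files and read the metadata. This is a minimal sketch using only the standard library; the helper `summarize_dataset` is hypothetical, and the presence of an `fps` entry in `info.json` is an assumption.

```python
import json
from pathlib import Path


def summarize_dataset(root):
    """Count episode files and read meta/info.json under a dataset root."""
    root = Path(root)
    episodes = sorted(root.glob("data/chunk-*/episode_*.parquet"))
    info_path = root / "meta" / "info.json"
    # Fall back to an empty dict when the metadata file is missing.
    info = json.loads(info_path.read_text()) if info_path.exists() else {}
    return {"num_episode_files": len(episodes), "fps": info.get("fps")}


print(summarize_dataset("./demo_data"))
```

If the folder does not exist yet, the function reports zero episode files rather than raising an error.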

2. Playback Your Data

Run 2.visualize_data.ipynb

Visualize your recorded actions in the reconstructed simulation scene.

The main simulation replays the actions.

The overlaid images at the top right and bottom right come from the dataset.

3. Train Action-Chunking-Transformer (ACT)

Run 3.train.ipynb

Training takes around 30 to 60 minutes.

Train the ACT model on your custom dataset. In this example, we set chunk_size to 10.

The trained checkpoint will be saved in the './ckpt/act_y' folder.

To evaluate the policy on the dataset, you can compute the error between the policy's predicted actions and the ground-truth actions from the dataset.
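
One simple form of this offline evaluation is the mean absolute error per action dimension. The sketch below uses NumPy with placeholder arrays standing in for the policy outputs and the dataset actions; the function name `action_errors` and the choice of mean absolute error are assumptions, and the notebook may use a different metric.

```python
import numpy as np


def action_errors(pred, target):
    """Return the overall and per-dimension mean absolute error."""
    pred = np.asarray(pred, dtype=np.float32)
    target = np.asarray(target, dtype=np.float32)
    abs_err = np.abs(pred - target)
    return float(abs_err.mean()), abs_err.mean(axis=0)


# Placeholder data of shape (num_frames, 7), matching the "action" feature.
rng = np.random.default_rng(0)
target = rng.normal(size=(100, 7)).astype(np.float32)
pred = target + 0.01  # pretend the policy is off by a constant 0.01
mean_err, per_dim = action_errors(pred, target)
print(mean_err, per_dim.shape)
```

The per-dimension values make it easier to see whether the error is concentrated in the arm joints or in the gripper command.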

If you encounter a pickling error such as

PicklingError: Can't pickle <function <lambda> at 0x131d1bd00>: attribute lookup <lambda> on __main__ failed

set num_workers to 0, for example:

dataloader = torch.utils.data.DataLoader(
    dataset,
    num_workers=0,  # was 4; 0 loads data in the main process
    batch_size=64,
    shuffle=True,
    pin_memory=device.type != "cpu",
    drop_last=True,
)
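
A likely cause of this error is that DataLoader worker processes, when started with the "spawn" method (the default on macOS and Windows), receive the dataset through pickle, and a lambda cannot be pickled. With num_workers=0 nothing is sent to another process, so the problem does not arise. The snippet below is a small standalone illustration; `is_picklable` and `scale` are hypothetical names, not part of this repository.

```python
import pickle


def is_picklable(obj):
    """Return True if pickle can serialize obj, False otherwise."""
    try:
        pickle.dumps(obj)
        return True
    except (pickle.PicklingError, AttributeError, TypeError):
        return False


def scale(x):
    """A named, module-level function that can stand in for a lambda."""
    return x * 2


print("lambda picklable:", is_picklable(lambda x: x * 2))
print("named function picklable:", is_picklable(scale))
```

If you want to keep several workers, replacing lambdas in the dataset or its transforms with named functions is the usual alternative.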

4. Deploy your Policy

Run 4.deploy.ipynb

If you do not have a GPU to train the model, you can download a checkpoint from Google Drive.

Deploy the trained policy in simulation.

5-6. Collect Data and Visualize in a Language-Conditioned Environment

Environment

Data

Models and Dataset 🤗

Model 🤗             Dataset 🤗
pi_0 finetuned       dataset
smolvla finetuned    same dataset

7. Train and Deploy pi_0

Training Scripts

python train_model.py --config_path pi0_omy.yaml

Rollout of trained policy

Train logs

Configuration File

dataset:
  repo_id: omy_pnp_language # Repository ID
  root: ./demo_data_language # Your root for data file!
policy:
  type : pi0
  chunk_size: 5
  n_action_steps: 5
  
save_checkpoint: true
output_dir: ./ckpt/pi0_omy # Save directory
batch_size: 16
job_name : pi0_omy
resume: false 
seed : 42
num_workers: 8
steps: 20_000
eval_freq: -1 # No evaluation
log_freq: 50
save_freq: 10_000
use_policy_training_preset: true
  
wandb:
  enable: true
  project: pi0_omy
  entity: <your_wandb_entity>
  disable_artifact: true
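
The training settings above imply a few quantities that are worth checking before launching a long run. The following sketch derives them in plain Python; the helper `training_summary` is hypothetical, and the assumption that a checkpoint is written every save_freq steps should be verified against train_model.py.

```python
# Values copied from the pi0 configuration above.
config = {
    "steps": 20_000,
    "save_freq": 10_000,
    "batch_size": 16,
    "chunk_size": 5,
    "n_action_steps": 5,
}


def training_summary(cfg):
    """Derive simple quantities implied by the training settings."""
    # Assumption: a checkpoint is written every save_freq steps.
    checkpoint_steps = list(
        range(cfg["save_freq"], cfg["steps"] + 1, cfg["save_freq"])
    )
    return {
        "samples_seen": cfg["steps"] * cfg["batch_size"],
        "checkpoint_steps": checkpoint_steps,
        # The policy cannot execute more actions than it predicts per chunk.
        "action_steps_valid": cfg["n_action_steps"] <= cfg["chunk_size"],
    }


print(training_summary(config))
```

With these values the run processes 320,000 samples and, under the stated assumption, writes checkpoints at steps 10,000 and 20,000.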

8. Train and Deploy SmolVLA

Training Scripts

python train_model.py --config_path smolvla_omy.yaml

Rollout of trained policy

Train logs

Configuration File

dataset:
  repo_id: omy_pnp_language # Repository ID
  root: ./demo_data_language # Your root for data file!
policy:
  type : smolvla
  chunk_size: 5
  n_action_steps: 5
  device: cuda
  
save_checkpoint: true
output_dir: ./ckpt/smolvla_omy # Save directory
batch_size: 16
job_name : smolvla_omy
resume: false 
seed : 42
num_workers: 8
steps: 20_000
eval_freq: -1 # No evaluation
log_freq: 50
save_freq: 10_000
use_policy_training_preset: true
  
wandb:
  enable: true
  project: smolvla_omy
  entity: <your_wandb_entity>
  disable_artifact: true

Acknowledgements