ShareRobot Dataset

April 1, 2025

ShareRobot is a high-quality heterogeneous dataset annotated with multi-dimensional information, including task planning, object affordance, and end-effector trajectory. These labels are intended to enhance a range of robotic capabilities.

Overview of ShareRobot


For planning, we have 51,403 episodes, each with 30 frames. During data generation, we design 5 different templates for each of the 10 question types in RoboVQA [1], then randomly select 2 templates per question type to generate question-answer pairs for every instance. This process transforms 51,403 instances into 1,027,990 question-answer pairs, with annotators monitoring data generation to maintain the dataset's integrity.
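The template sampling described above can be sketched roughly as follows. This is an illustration only, not the authors' pipeline: the question-type names, the template strings, and the `build_qa_pairs` helper are hypothetical placeholders. Only the counts (10 question types, 5 templates each, 2 sampled per type) come from the text.

```python
import random

NUM_QUESTION_TYPES = 10  # from the text
TEMPLATES_PER_TYPE = 5   # from the text
SAMPLED_PER_TYPE = 2     # from the text

# Hypothetical placeholder templates; the real wording is not given in this document.
TEMPLATES = {
    f"type_{t}": [
        f"[type {t}, template {k}] {{context}}" for k in range(TEMPLATES_PER_TYPE)
    ]
    for t in range(NUM_QUESTION_TYPES)
}


def build_qa_pairs(context, answers, rng):
    """Sample 2 of the 5 templates per question type and fill them for one instance."""
    pairs = []
    for qtype, templates in TEMPLATES.items():
        # random.sample draws without replacement, so the 2 templates differ.
        for template in rng.sample(templates, SAMPLED_PER_TYPE):
            pairs.append({
                "task": qtype,
                "question": template.format(context=context),
                "answer": answers[qtype],
            })
    return pairs


rng = random.Random(0)
answers = {qtype: f"<answer for {qtype}>" for qtype in TEMPLATES}  # hypothetical answers
pairs = build_qa_pairs("pick up the banana", answers, rng)
print(len(pairs))  # 20
```

Under this scheme each instance yields at most 10 × 2 = 20 pairs, so 51,403 instances give an upper bound of 1,028,060 pairs. The reported total of 1,027,990 is slightly below that bound.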

For affordance, we have 6,522 images, each annotated with affordance areas aligned with an instruction.

For trajectory, we have 6,870 images, each annotated with at least 3 {x, y} coordinates aligned with an instruction.

Data Sources🌍


The ShareRobot dataset draws on 23 original datasets from the Open X-Embodiment dataset [2], covering 12 embodiments and 107 types of atomic tasks.

Raw Dataset for Planning

| Raw Dataset | Number of Rows |
| --- | --- |
| nyu_door_opening_surprising_effectiveness | 421 |
| bridge | 15738 |
| dlr_edan_shared_control_converted_externally_to_rlds | 63 |
| utokyo_xarm_pick_and_place_converted_externally_to_rlds | 92 |
| cmu_stretch | 10 |
| asu_table_top_converted_externally_to_rlds | 109 |
| dlr_sara_pour_converted_externally_to_rlds | 51 |
| utokyo_xarm_bimanual_converted_externally_to_rlds | 27 |
| robo_set | 18164 |
| dobbe | 5200 |
| berkeley_autolab_ur5 | 882 |
| qut_dexterous_manpulation | 192 |
| aloha_mobile | 264 |
| dlr_sara_grid_clamp_converted_externally_to_rlds | 40 |
| ucsd_pick_and_place_dataset_converted_externally_to_rlds | 569 |
| ucsd_kitchen_dataset_converted_externally_to_rlds | 39 |
| jaco_play | 956 |
| utokyo_pr2_opening_fridge_converted_externally_to_rlds | 64 |
| conq_hose_manipulation | 56 |
| fmb | 7836 |
| plex_robosuite | 398 |
| utokyo_pr2_tabletop_manipulation_converted_externally_to_rlds | 189 |
| viola | 44 |

Raw Dataset for Affordance

| Raw Dataset | Number of Rows |
| --- | --- |
| utokyo_pr2_tabletop_manipulation_converted_externally_to_rlds | 24 |
| utokyo_xarm_pick_and_place_converted_externally_to_rlds | 23 |
| ucsd_kitchen_dataset_converted_externally_to_rlds | 10 |
| ucsd_pick_and_place_dataset_converted_externally_to_rlds | 112 |
| nyu_door_opening_surprising_effectiveness | 85 |
| jaco_play | 171 |
| bridge | 2610 |
| utokyo_pr2_opening_fridge_converted_externally_to_rlds | 12 |
| asu_table_top_converted_externally_to_rlds | 24 |
| viola | 1 |
| berkeley_autolab_ur5 | 122 |
| aloha_mobile | 23 |
| conq_hose_manipulation | 1 |
| dobbe | 717 |
| fmb | 561 |
| plex_robosuite | 13 |
| qut_dexterous_manpulation | 16 |
| robo_set | 1979 |
| dlr_edan_shared_control_converted_externally_to_rlds | 18 |
| Total | 6522 |

Raw Dataset for Trajectory

| Raw Dataset | Number of Rows |
| --- | --- |
| utokyo_pr2_tabletop_manipulation_converted_externally_to_rlds | 35 |
| utokyo_xarm_pick_and_place_converted_externally_to_rlds | 36 |
| ucsd_kitchen_dataset_converted_externally_to_rlds | 19 |
| dlr_sara_grid_clamp_converted_externally_to_rlds | 1 |
| ucsd_pick_and_place_dataset_converted_externally_to_rlds | 109 |
| nyu_door_opening_surprising_effectiveness | 74 |
| jaco_play | 175 |
| utokyo_xarm_bimanual_converted_externally_to_rlds | 7 |
| bridge | 2986 |
| utokyo_pr2_opening_fridge_converted_externally_to_rlds | 12 |
| asu_table_top_converted_externally_to_rlds | 22 |
| berkeley_autolab_ur5 | 164 |
| dobbe | 759 |
| fmb | 48 |
| qut_dexterous_manpulation | 29 |
| robo_set | 2374 |
| dlr_sara_pour_converted_externally_to_rlds | 3 |
| dlr_edan_shared_control_converted_externally_to_rlds | 17 |
| Total | 6870 |

Data Format

Planning


{
    "id": {
        "id": "/mnt/hpfs/baaiei/jyShi/rt_frames_success/rtx_frames_success_42/62_robo_set#episode_1570",
        "task": "Future_Prediction_Task",
        "selected_step": 3,
        "conversations": [
            {
                "from": "human",
                "value": "<image 0-25> After <move the grasped banana towards the mug>, what's the most probable next event?"
            },
            {
                "from": "gpt",
                "value": "<place the banana into the mug>"
            }
        ],
        "image": [
            "/path/to/image_0-25"
        ]
    }
}
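A planning record like the one above can be turned into question-answer pairs by matching each human turn with the gpt turn that follows it. The snippet below is a minimal sketch, assuming each record is a dictionary with the `conversations` field shown above. The `to_qa_pairs` helper and the inline record id are hypothetical.

```python
record = {
    "id": "example#episode_0",  # hypothetical id, not a real dataset path
    "task": "Future_Prediction_Task",
    "selected_step": 3,
    "conversations": [
        {
            "from": "human",
            "value": "<image 0-25> After <move the grasped banana towards the mug>, "
                     "what's the most probable next event?",
        },
        {"from": "gpt", "value": "<place the banana into the mug>"},
    ],
    "image": ["/path/to/image_0-25"],
}


def to_qa_pairs(rec):
    """Pair every human turn with the gpt turn immediately after it."""
    turns = rec["conversations"]
    pairs = []
    for prev, nxt in zip(turns, turns[1:]):
        if prev["from"] == "human" and nxt["from"] == "gpt":
            pairs.append((prev["value"], nxt["value"]))
    return pairs


for question, answer in to_qa_pairs(record):
    print(record["task"], "|", question, "->", answer)
```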

     

Affordance

{
    "id": 2486,
    "meta_data": {
        "original_dataset": "bridge",
        "original_width": 640,
        "original_height": 480
    },
    "instruction": "place the red fork to the left of the left burner",
    "affordance": {
        "x": 352.87425387858815,
        "y": 186.47871614766484,
        "width": 19.296008229513156,
        "height": 14.472006172134865
    }
}
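Many detection tools expect corner coordinates rather than an origin plus size, so a small conversion can be handy. The sketch below assumes that `x` and `y` mark the top-left corner of the box in pixels of the original image, which matches how the visualization script in this README draws it but is not stated explicitly. The `affordance_to_corners` helper is hypothetical.

```python
item = {
    "id": 2486,
    "meta_data": {
        "original_dataset": "bridge",
        "original_width": 640,
        "original_height": 480,
    },
    "instruction": "place the red fork to the left of the left burner",
    "affordance": {
        "x": 352.87425387858815,
        "y": 186.47871614766484,
        "width": 19.296008229513156,
        "height": 14.472006172134865,
    },
}


def affordance_to_corners(entry, normalize=False):
    """Return (x_min, y_min, x_max, y_max) for one affordance entry.

    Assumes (x, y) is the top-left corner in original-image pixels.
    With normalize=True, coordinates are divided by the original image size.
    """
    box = entry["affordance"]
    x_min, y_min = box["x"], box["y"]
    x_max, y_max = x_min + box["width"], y_min + box["height"]
    if normalize:
        w = entry["meta_data"]["original_width"]
        h = entry["meta_data"]["original_height"]
        return (x_min / w, y_min / h, x_max / w, y_max / h)
    return (x_min, y_min, x_max, y_max)


print(affordance_to_corners(item))
print(affordance_to_corners(item, normalize=True))
```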

Visualize Code

import json
import os

import cv2
import numpy as np

img_dir = '/path/to/your/original/images/dir'
affordance_json = '/path/to/your/affordances/json'
output_img_dir = '/path/to/your/visualized/images/dir'

with open(affordance_json, 'r') as f:
    data = json.load(f)

color = (255, 0, 0)  # OpenCV uses BGR order, so this draws in blue
thickness = 2

for item in data:
    # Assumption: item['id'] identifies the image file relative to img_dir.
    # str() guards against integer ids; adapt this if your files are named differently.
    filepath = os.path.join(img_dir, str(item['id']))

    image = cv2.imread(filepath)
    if image is None:  # cv2.imread returns None instead of raising on failure
        print(f"Could not read image: {filepath}")
        continue

    box = item['affordance']
    x_min, y_min = box['x'], box['y']
    x_max, y_max = box['x'] + box['width'], box['y'] + box['height']

    # Four corners of the rectangle
    pts = np.array([
        [x_min, y_min],  # top-left
        [x_max, y_min],  # top-right
        [x_max, y_max],  # bottom-right
        [x_min, y_max]   # bottom-left
    ], dtype=np.float32)

    # Draw the bounding box
    cv2.polylines(image, [pts.astype(np.int32)], isClosed=True, color=color, thickness=thickness)

    # Mirror the path relative to img_dir under output_img_dir
    relative_path = os.path.relpath(filepath, img_dir)
    output_img_path = os.path.join(output_img_dir, relative_path)

    # Create the output folder if needed
    output_directory = os.path.dirname(output_img_path)
    os.makedirs(output_directory, exist_ok=True)

    # Print debug information
    print(f"Input filepath: {filepath}")
    print(f"Output image path: {output_img_path}")
    print(f"Output directory: {output_directory}")

    # Save the image
    cv2.imwrite(output_img_path, image)

Trajectory

{
    "id": 456,
    "meta_data": {
        "original_dataset": "bridge",
        "original_width": 640,
        "original_height": 480
    },
    "instruction": "reach for the carrot",
    "points": [
        [
            265.45454545454544,
            120.0
        ],
        [
            275.1515151515152,
            162.42424242424244
        ],
        [
            280.0,
            213.33333333333331
        ],
        [
            280.0,
            259.3939393939394
        ]
    ]
}
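When working with trajectory entries it is often useful to normalize the waypoints by the image size or to measure the length of the drawn path. The following is a minimal sketch under the assumption that `points` are [x, y] pixel coordinates in the original image. The `normalize_points` and `path_length` helpers are hypothetical and not part of the dataset tooling.

```python
import math

item = {
    "id": 456,
    "meta_data": {
        "original_dataset": "bridge",
        "original_width": 640,
        "original_height": 480,
    },
    "instruction": "reach for the carrot",
    "points": [
        [265.45454545454544, 120.0],
        [275.1515151515152, 162.42424242424244],
        [280.0, 213.33333333333331],
        [280.0, 259.3939393939394],
    ],
}


def normalize_points(entry):
    """Scale waypoints to [0, 1] using the original image width and height."""
    w = entry["meta_data"]["original_width"]
    h = entry["meta_data"]["original_height"]
    return [(x / w, y / h) for x, y in entry["points"]]


def path_length(points):
    """Sum of Euclidean distances between consecutive waypoints."""
    return sum(math.dist(p, q) for p, q in zip(points, points[1:]))


print(normalize_points(item))
print(path_length(item["points"]))
```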

Visualize Code

import json
import os

from PIL import Image, ImageDraw

trajectory_final = '/path/to/your/trajectory_json'
img_dir = '/path/to/your/original/images/dir'
output_img_dir = '/path/to/your/visualized/images/dir'

with open(trajectory_final, 'r') as f:
    data = json.load(f)

# Line color and width
color = (255, 0, 0)  # red (Pillow uses RGB order)
thickness = 2

for item in data:
    # Assumption: item['id'] identifies the image file relative to img_dir.
    # str() guards against integer ids; adapt this if your files are named differently.
    filepath = os.path.join(img_dir, str(item['id']))

    image = Image.open(filepath).convert("RGB")  # make sure the image is in RGB mode
    draw = ImageDraw.Draw(image)  # create the drawing context

    # Waypoints as (x, y) tuples, used as-is without rescaling
    waypoints = [(point[0], point[1]) for point in item['points']]

    # Connect consecutive waypoints in order
    for start, end in zip(waypoints, waypoints[1:]):
        draw.line([start, end], fill=color, width=thickness)

    # Mirror the path relative to img_dir under output_img_dir
    relative_path = os.path.relpath(filepath, img_dir)
    output_img_path = os.path.join(output_img_dir, relative_path)

    # Create the output folder if needed
    output_directory = os.path.dirname(output_img_path)
    os.makedirs(output_directory, exist_ok=True)

    # Print debug information
    print(f"Input filepath: {filepath}")
    print(f"Output image path: {output_img_path}")
    print(f"Output directory: {output_directory}")

    # Save the image
    image.save(output_img_path)

Evaluation🚀

Powered by the ShareRobot dataset, the RoboBrain model achieves strong results across the following capabilities. 🌟

Task planning capability: The RoboBrain model trained on ShareRobot achieves a 30.2% improvement in task decomposition accuracy (BLEU-4 reaches 55.05%), significantly outperforming existing methods.

Affordance perception capability: The average precision (AP) of object affordance area recognition is 27.1%, which is 14.6% higher than the baseline model.

Trajectory prediction capability: End-effector trajectory prediction error is reduced by 42.9% (the DFD metric decreases from 0.191 to 0.109).

General capability: On the OpenEQA benchmark, the scene understanding score surpasses general multimodal models such as GPT-4V, indicating that training on ShareRobot does not sacrifice RoboBrain's general ability.
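For readers unfamiliar with the trajectory metric, the sketch below assumes that DFD stands for the discrete Fréchet distance between a predicted and a ground-truth waypoint sequence. The evaluation code is not shown in this document, so this is a textbook dynamic-programming version rather than the authors' implementation, and the two sample trajectories are made up. The last lines only recompute the relative reduction from the two DFD values quoted above.

```python
import math


def discrete_frechet(p, q):
    """Discrete Fréchet distance between two point sequences (dynamic programming)."""
    n, m = len(p), len(q)
    ca = [[0.0] * m for _ in range(n)]
    for i in range(n):
        for j in range(m):
            d = math.dist(p[i], q[j])
            if i == 0 and j == 0:
                ca[i][j] = d
            elif i == 0:
                ca[i][j] = max(ca[0][j - 1], d)
            elif j == 0:
                ca[i][j] = max(ca[i - 1][0], d)
            else:
                # Cheapest way to reach (i, j), then account for the current pair.
                ca[i][j] = max(min(ca[i - 1][j], ca[i - 1][j - 1], ca[i][j - 1]), d)
    return ca[-1][-1]


# Made-up trajectories: two parallel horizontal segments one unit apart.
predicted = [(0.0, 0.0), (1.0, 0.0)]
ground_truth = [(0.0, 1.0), (1.0, 1.0)]
print(discrete_frechet(predicted, ground_truth))  # 1.0

# Relative reduction implied by the reported DFD values.
relative_reduction = (0.191 - 0.109) / 0.191
print(f"{relative_reduction:.1%}")  # 42.9%
```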


Reference

[1] Pierre Sermanet, Tianli Ding, Jeffrey Zhao, Fei Xia, Debidatta Dwibedi, Keerthana Gopalakrishnan, Christine Chan, Gabriel Dulac-Arnold, Sharath Maddineni, Nikhil J Joshi, et al. RoboVQA: Multimodal long-horizon reasoning for robotics. In ICRA, pages 645–652, 2024.

[2] Abby O'Neill, Abdul Rehman, Abhinav Gupta, Abhiram Maddukuri, Abhishek Gupta, Abhishek Padalkar, Abraham Lee, Acorn Pooley, Agrim Gupta, Ajay Mandlekar, et al. Open X-Embodiment: Robotic learning datasets and RT-X models. arXiv preprint arXiv:2310.08864, 2023.

Citation

@article{ji2025robobrain,
  title={RoboBrain: A Unified Brain Model for Robotic Manipulation from Abstract to Concrete},
  author={Ji, Yuheng and Tan, Huajie and Shi, Jiayu and Hao, Xiaoshuai and Zhang, Yuan and Zhang, Hengyuan and Wang, Pengwei and Zhao, Mengdi and Mu, Yao and An, Pengju and others},
  journal={arXiv preprint arXiv:2502.21257},
  year={2025}
}