ModelZoo.md
February 9, 2026 ยท View on GitHub
General Pretrained Models
| Model | Description | Input Images | Action Dim | Model Size | Link |
|---|---|---|---|---|---|
| Dexbotic-Base | Discrete vision-language action model (similar to OpenVLA) | Single View | NA | 7B | ๐ค Hugging Face |
| Dexbotic-CogACT-SArm | Single-arm CogACT model | Single View | 7D | 7B | ๐ค Hugging Face |
| Dexbotic-CogACT-HArm | Dual-arm CogACT model with multiple views input | Main View + Left Hand-View + Right Hand-View | 16D | 7B | ๐ค Hugging Face |
| DM0-base | DM0 base model with Flow Matching action generation | Up to 3 Views | 32D | 2.4B | ๐ค Hugging Face |
It is recommended to download the pretrained models into the following folders.
mkdir checkpoints
cd checkpoints
git clone https://huggingface.co/Dexmal/Dexbotic-Base Dexbotic-Base
git clone https://huggingface.co/Dexmal/Dexbotic-CogACT-SArm Dexbotic-CogACT-SArm
git clone https://huggingface.co/Dexmal/Dexbotic-CogACT-HArm Dexbotic-CogACT-HArm
git clone https://huggingface.co/Dexmal/DM0-base DM0-base
Action Dimension Description
Users need to map their data to the action dimensions of the pretrained models. If the data dimension is smaller than the pretrained model dimension, padding will be conducted automatically.
We recommend using the following data formats to fully utilize the pretrained models:
- Single-arm end-effector pose: Organize 7D action data as
[xyz + rpy + gripper] - Single-arm joint angles: Organize 8D action data as
[joints + gripper] - Dual-arm end-effector pose: Organize 14D action data as
[left_arm_xyz + left_arm_rpy + left_arm_gripper + right_arm_xyz + right_arm_rpy + right_arm_gripper] - Dual-arm joint angles: Organize 16D action data as
[left_arm_joints + left_arm_gripper + right_arm_joints + right_arm_gripper]
Other Dexbotic Models
| Model | Link |
|---|---|
| Dexbotic-ฯ0 | ๐ค HF |
| Dexbotic-ฯ05 | ๐ค HF |
| Dexbotic-NaVILA | ๐ค HF |
| Dexbotic-RL-Base | ๐ค HF |
Benchmark Results
Libero
| Model | Libero-Spatial | Libero-Object | Libero-Goal | Libero-10 | Average | Config | Checkpoint Link |
|---|---|---|---|---|---|---|---|
| CogACT | 97.2 | 98.0 | 90.2 | 88.8 | 93.6 | - | - |
| DB-CogACT | 93.8 | 97.8 | 96.2 | 91.8 | 94.9 | libero_cogact.py | ๐ค HF |
| ฯ0 | 96.8 | 98.8 | 95.8 | 85.2 | 94.2 | - | - |
| DB-ฯ0 | 97 | 98.2 | 94 | 86.4 | 93.9 | libero_pi0.py | ๐ค HF |
| MemVLA | 98.4 | 98.4 | 96.4 | 93.4 | 96.7 | - | |
| DB-MemVLA | 97.2 | 99.2 | 98.4 | 93.2 | 97.0 | libero_memvla.py | ๐ค HF |
CALVIN
Our training and evaluation are conducted under the ABC->D setting.
| Model | 1 | 2 | 3 | 4 | 5 | Average Length | Config | Checkpoint Link |
|---|---|---|---|---|---|---|---|---|
| CogACT | 83.8 | 72.9 | 64 | 55.9 | 48 | 3.246 | - | - |
| DB-CogACT | 93.5 | 86.7 | 80.3 | 76 | 69.8 | 4.063 | calvin_cogact.py | ๐ค HF |
| OFT | 89.1 | 79.4 | 67.4 | 59.8 | 51.5 | 3.472 | - | - |
| DB-OFT | 92.8 | 80.7 | 69.2 | 60.2 | 51.1 | 3.540 | calvin_oft.py | ๐ค HF |
SimplerEnv
Our training uses the Bridge dataset and is tested on the WidowX environment.
| Model | Put Spoon on Towel | Put Carrot on Plate | Stack Green Block on Yellow Block | Put Eggplant in Yellow Basket | Average | Config | Checkpoint Link |
|---|---|---|---|---|---|---|---|
| CogACT | 71.7 | 50.8 | 15 | 67.5 | 51.25 | - | - |
| DB-CogACT | 87.5 | 65.28 | 29.17 | 95.83 | 69.45 | simpler_cogact.py | ๐ค HF |
| OFT | 12.5 | 4.2 | 4.2 | 100 | 30.23 | - | - |
| DB-OFT | 91.67 | 76.39 | 43.06 | 94.44 | 76.39 | simpler_oft.py | ๐ค HF |
| MemVLA | 75.0 | 75.0 | 37.5 | 100.0 | 71.9 | - | - |
| DB-MemVLA | 100.0 | 66.7 | 70.8 | 100.0 | 84.4 | simpler_memvla.py | ๐ค HF |
ManiSkill2
| Model | PickCube | StackCube | PickSingleYCB | PickSingleEGAD | PickClutterYCB | Average | Config | Checkpoint Link |
|---|---|---|---|---|---|---|---|---|
| CogACT | 55 | 70 | 30 | 25 | 20 | 40 | - | - |
| DB-CogACT | 90 | 65 | 65 | 40 | 30 | 58 | maniskill2_cogact.py | ๐ค HF |
| OFT | 40 | 45 | 5 | 5 | 0 | 21 | - | - |
| DB-OFT | 90 | 75 | 55 | 65 | 30 | 63 | maniskill2_oft.py | ๐ค HF |
| ฯ0 | 95 | 85 | 55 | 85 | 10 | 66 | - | - |
| DB-ฯ0 | 95 | 85 | 65 | 50 | 30 | 65 | maniskill2_pi0.py | ๐ค HF |
RoboTwin2.0
Our training uses the RoboTwin2.0 demo_clean dataset and is tested on the Aloha-AgileX demo_clean environment.
| Model | Adjust Bottle | Grab Roller | Place Empty Cup | Place Phone Stand | Average | Config | Checkpoint Link |
|---|---|---|---|---|---|---|---|
| CogACT | 87 | 72 | 11 | 5 | 43.8 | - | - |
| DB-CogACT | 99 | 89 | 28 | 18 | 58.5 | robotwin2_cogact.py | ๐ค HF |