Tutorial.md
June 22, 2026
Installation
🐳 Docker (Recommended)
We strongly recommend using Docker as a unified, consistent, and reproducible environment for training and deployment. This ensures reliability across workflows and minimizes issues arising from CUDA version differences and Python dependency conflicts.
Please see the Dockerfile for details about the image contents.
- Prerequisites
  - Ubuntu 20.04 or 22.04
  - NVIDIA GPU: RTX 4090 / RTX 5090 / A100 / H100 (8 GPUs recommended for training; 1 GPU for deployment)
  - NVIDIA Docker installed
- Step 1: Clone the Repository
git clone https://github.com/Dexmal/dexbotic.git
- Step 2: Start Docker
docker run -it --rm --gpus all --network host \
-v /path/to/dexbotic:/dexbotic \
dexmal/dexbotic \
bash
- Step 3: Activate Dexbotic Environment
cd /dexbotic
conda activate dexbotic
pip install -e .
The image built from this repo's Dockerfile ships with torch==2.6.0 and transformers==4.57.6.
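To confirm which versions are actually present inside the container (or in a Conda environment), a small check like the one below may help. It is a minimal sketch using only the Python standard library; the helper name `installed_versions` is hypothetical and not part of Dexbotic.

```python
from importlib import metadata


def installed_versions(packages):
    """Return {package: version string, or None if the package is not installed}."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


if __name__ == "__main__":
    # Report the two packages whose versions are pinned in this tutorial.
    for name, version in installed_versions(["torch", "transformers"]).items():
        print(f"{name}: {version or 'not installed'}")
```

Run it after activating the dexbotic environment and compare the printed versions with the ones listed in this tutorial.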
Using on Blackwell GPUs
For users with Blackwell GPUs (e.g., B100, RTX 5090), please use the specialized Docker image dexmal/dexbotic:c130t28.
Step 1: Start Docker with Blackwell Image
docker run -it --rm --gpus all --network host \
-v /path/to/dexbotic:/dexbotic \
dexmal/dexbotic:c130t28 \
bash
Step 2: Activate Environment
cd /dexbotic
pip install -e .
Conda Installation
- Prerequisites
  - Ubuntu 20.04 or 22.04
  - NVIDIA GPU: RTX 4090 / A100 / H100 (8 GPUs recommended for training; 1 GPU for deployment)
  - CUDA 11.8 (tested; other versions may also work)
  - Anaconda
- Step 1: Clone the Repository
git clone https://github.com/Dexmal/dexbotic.git
- Step 2: Install Dependencies
conda create -n dexbotic python=3.10 -y
conda activate dexbotic
pip install torch==2.6.0 torchvision==0.21.0 xformers --index-url https://download.pytorch.org/whl/cu118
cd dexbotic
pip install -e .
pip install transformers==4.57.6
# FlashAttention kernels (e.g. cross-entropy used in RL training) are fetched
# on demand from the Hugging Face Hub via the `kernels` library, which is
# installed as a core dependency above. No local flash-attn build is required.
#
# Optionally, to use a locally compiled flash-attn (e.g. for the
# `flash_attention_2` HF attention implementation), install it explicitly:
# pip install ninja packaging
# pip install flash-attn --no-build-isolation
Evaluation
We provide pre-trained models for both simulation benchmarks and real-robot settings. Here we use the Libero pre-trained model as an example.
First, download the pre-trained model and place it in the checkpoints folder.
mkdir -p checkpoints/libero
cd checkpoints/libero
git clone https://huggingface.co/Dexmal/libero-db-cogact libero_cogact
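Model weights on the Hugging Face Hub are stored with Git LFS, so a clone made without Git LFS installed leaves small pointer files in place of the real weights. The sketch below scans a checkpoint directory for such pointers by looking for the first line defined in the Git LFS pointer format. The function name `find_lfs_pointers` is hypothetical, and the path in the usage note is the one used in the commands above.

```python
from pathlib import Path

# First line of every Git LFS pointer file, per the Git LFS pointer format.
LFS_POINTER_PREFIX = b"version https://git-lfs.github.com/spec/v1"


def find_lfs_pointers(checkpoint_dir):
    """Return files under checkpoint_dir that are still Git LFS pointer stubs."""
    root = Path(checkpoint_dir)
    pointers = []
    for path in sorted(root.rglob("*")):
        # Skip directories and anything inside the repository's .git folder.
        if not path.is_file() or ".git" in path.relative_to(root).parts:
            continue
        with path.open("rb") as handle:
            if handle.read(len(LFS_POINTER_PREFIX)) == LFS_POINTER_PREFIX:
                pointers.append(path)
    return pointers
```

For example, `find_lfs_pointers("checkpoints/libero/libero_cogact")` returning an empty list suggests the weights were downloaded in full. A non-empty list usually means Git LFS needs to be installed and the files pulled again.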
We demonstrate two ways to evaluate the model. The first runs inference on a single sample, which is the quickest way to try the model. The second deploys a model server and queries it from a client, which is closer to real-world deployment.
Inference One Sample
CUDA_VISIBLE_DEVICES=0 python playground/benchmarks/libero/libero_cogact.py --task inference_single --image_path test_data/libero_test.png --prompt 'What action should the robot take to put both moka pots on the stove?'
The model should output a set of actions.
Deploy Mode
- Start Inference Server
CUDA_VISIBLE_DEVICES=0 python playground/benchmarks/libero/libero_cogact.py --task inference
- Test Model Inference Results
curl -X POST \
-F "text=What action should the robot take to put both moka pots on the stove?" \
-F "image=@test_data/libero_test.png" \
http://localhost:7891/process_frame
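The same request can be sent from Python, which is convenient when the client is a robot control loop rather than a shell. The sketch below mirrors the curl command using only the standard library. The field names `text` and `image` and the URL come from that command; the helper names `encode_multipart` and `process_frame` are hypothetical. The response format is not specified here, so the body is returned as raw text.

```python
import mimetypes
import urllib.request
import uuid
from pathlib import Path


def encode_multipart(text, image_path):
    """Build a multipart/form-data body with a `text` field and an `image` file."""
    boundary = uuid.uuid4().hex
    image_path = Path(image_path)
    mime = mimetypes.guess_type(image_path.name)[0] or "application/octet-stream"
    text_part = (
        f"--{boundary}\r\n"
        'Content-Disposition: form-data; name="text"\r\n\r\n'
        f"{text}\r\n"
    ).encode("utf-8")
    image_header = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="image"; filename="{image_path.name}"\r\n'
        f"Content-Type: {mime}\r\n\r\n"
    ).encode("utf-8")
    closing = f"\r\n--{boundary}--\r\n".encode("utf-8")
    body = text_part + image_header + image_path.read_bytes() + closing
    return body, f"multipart/form-data; boundary={boundary}"


def process_frame(text, image_path,
                  url="http://localhost:7891/process_frame", timeout=60):
    """POST one prompt and image to the inference server; return the raw reply."""
    body, content_type = encode_multipart(text, image_path)
    request = urllib.request.Request(
        url, data=body, headers={"Content-Type": content_type}, method="POST"
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read().decode("utf-8")
```

With the server running, `process_frame("What action should the robot take to put both moka pots on the stove?", "test_data/libero_test.png")` should return the same payload as the curl call.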
- Test Libero Benchmark with Dexbotic-Benchmark
Set up the dexbotic-benchmark following its instructions and test the deployed model in the LIBERO-GOAL environment.
cd dexbotic-benchmark
docker run --gpus all --network host -v $(pwd):/workspace \
dexmal/dexbotic_benchmark \
bash /workspace/scripts/env_sh/libero.sh /workspace/evaluation/configs/libero/example_libero.yaml
dexbotic-benchmark also works without Docker; see its documentation for details.
Training
Before starting training, please follow the instructions in ModelZoo.md to set up the pre-trained models, and download the Libero dataset as described in docs/Data.md.
Training a Model with Provided Data
We use Libero as an example to demonstrate how to train a model with Dexbotic.
The experiment configuration file for this example is located at: playground/benchmarks/libero/libero_cogact.py
- Experiment Configuration
# LiberoCogActTrainerConfig
output_dir = [Path to save checkpoints]
- Launch Training
torchrun --nproc_per_node=8 playground/benchmarks/libero/libero_cogact.py
We recommend using 8 × NVIDIA A100/H100 GPUs for training. If you are using 8 × RTX 4090, please use the configuration file scripts/deepspeed/zero3_offload.json to reduce GPU memory utilization. For FSDP2 support, see FSDP2.md.
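On machines with fewer than 8 GPUs, the `--nproc_per_node` value has to match the hardware that is actually present. The sketch below counts GPUs with `nvidia-smi -L` and assembles the matching torchrun command. Both helper names are hypothetical, and whether a given experiment config trains well at a smaller world size is an assumption this tutorial does not confirm.

```python
import shutil
import subprocess


def count_gpus():
    """Count GPUs listed by `nvidia-smi -L`; return 0 if the tool is unavailable."""
    if shutil.which("nvidia-smi") is None:
        return 0
    result = subprocess.run(
        ["nvidia-smi", "-L"], capture_output=True, text=True, check=False
    )
    if result.returncode != 0:
        return 0
    # Each physical GPU is printed on its own line starting with "GPU <index>:".
    return sum(1 for line in result.stdout.splitlines() if line.startswith("GPU "))


def torchrun_command(script, num_gpus):
    """Build the torchrun argument list for a single-node run."""
    if num_gpus < 1:
        raise ValueError("at least one GPU is required")
    return ["torchrun", f"--nproc_per_node={num_gpus}", script]


if __name__ == "__main__":
    gpus = count_gpus()
    if gpus:
        print(" ".join(torchrun_command(
            "playground/benchmarks/libero/libero_cogact.py", gpus)))
    else:
        print("No NVIDIA GPU detected.")
```

Note that `nvidia-smi -L` lists every GPU on the machine and does not take `CUDA_VISIBLE_DEVICES` into account.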
Training a Model with Your Own Data
- Prepare Your Own Data
Refer to docs/Data.md for detailed instructions on data preparation.
Once created, register your dataset under dexbotic/data/data_source.
- Experiment Configuration
Create a new experiment configuration file (based on playground/example_exp.py) and set the required keys:
# CogActTrainerConfig
output_dir = [Path to save checkpoints]
# CogActDataConfig
dataset_name = [Name of your registered dataset]
- Launch Training
torchrun --nproc_per_node=8 playground/example_exp.py
After training, please refer to the Evaluation section above to evaluate your model. Update the model_name_or_path in the inference config to your trained checkpoint, and run inference or start the inference server as described.
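When pointing model_name_or_path at a trained checkpoint, it can help to pick the most recent one programmatically. The sketch below assumes checkpoints are saved as `checkpoint-<step>` subfolders of output_dir, a common convention that this tutorial does not confirm for Dexbotic; adjust `pattern` if your run uses a different layout. The function name `latest_checkpoint` is hypothetical.

```python
import re
from pathlib import Path


def latest_checkpoint(output_dir, pattern=r"checkpoint-(\d+)"):
    """Return the subfolder with the highest step number, or None if none match.

    Assumes (hypothetically) that checkpoints are named `checkpoint-<step>`.
    """
    best_step, best_path = -1, None
    for child in Path(output_dir).iterdir():
        match = re.fullmatch(pattern, child.name)
        if child.is_dir() and match and int(match.group(1)) > best_step:
            best_step, best_path = int(match.group(1)), child
    return best_path
```

The returned path can then be used as the value of model_name_or_path in the inference config.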