LIBERO Evaluation Guide

September 18, 2025

This README provides instructions for evaluating a GO-1 model fine-tuned on the LIBERO benchmark. For fine-tuning details, please refer to our main README.

LIBERO requires a different environment from GO-1, so evaluation uses server-client inference: the server handles policy inference while the client runs the simulation environment, and the two communicate through HTTP requests.
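The request/response pattern described above can be sketched with the Python standard library alone. This is only an illustration under stated assumptions, not the actual GO-1 protocol: the `/act` path, the `instruction`, `state`, and `action` field names, and the 7-dimensional zero action are all hypothetical, and `StubPolicyHandler` is a stand-in for the real server started by `evaluate/deploy.py`. The real client may also encode arrays differently (`json_numpy` is among the client dependencies).

```python
import json
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer


class StubPolicyHandler(BaseHTTPRequestHandler):
    """Hypothetical stand-in for the policy server: replies with a fixed action."""

    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        request = json.loads(self.rfile.read(length))
        # A real server would run model inference on `request` here.
        body = json.dumps(
            {"action": [0.0] * 7, "instruction": request["instruction"]}
        ).encode()
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        pass  # silence per-request logging


def query_policy(url, observation):
    """POST an observation as JSON and return the decoded JSON reply."""
    request = urllib.request.Request(
        url,
        data=json.dumps(observation).encode(),
        headers={"Content-Type": "application/json"},
    )
    # An empty ProxyHandler bypasses any system proxy for this local call.
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    with opener.open(request, timeout=10) as response:
        return json.loads(response.read())


# Port 0 asks the OS for any free port, so the sketch never collides with a real server.
server = HTTPServer(("127.0.0.1", 0), StubPolicyHandler)
threading.Thread(target=server.serve_forever, daemon=True).start()

url = f"http://127.0.0.1:{server.server_port}/act"  # hypothetical endpoint path
reply = query_policy(url, {"instruction": "pick up the bowl", "state": [0.0] * 8})
print(reply["action"])

server.shutdown()
server.server_close()
```

In this split, the simulator only needs an HTTP client, so its Python 3.8 environment never has to import the model code.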

Start GO-1 Server

Start the server first:

conda activate go1

python evaluate/deploy.py --model_path /path/to/your/checkpoint --data_stats_path /path/to/your/dataset_stats.json --port <SERVER_PORT>
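Because the server has to come up before the client can connect, it can help to wait until the port accepts connections. The helper below is a minimal sketch and not part of the repository: `wait_for_port` is a hypothetical name, and a successful TCP connection only shows that the port is open, not that the checkpoint has finished loading.

```python
import socket
import time


def wait_for_port(host, port, timeout=60.0, interval=1.0):
    """Return True once host:port accepts a TCP connection, False after `timeout` seconds."""
    deadline = time.monotonic() + timeout
    while True:
        try:
            # create_connection raises OSError while nothing is listening yet.
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            if time.monotonic() >= deadline:
                return False
            time.sleep(interval)


# Example call (assumes the server was started with --port 9000):
# ready = wait_for_port("127.0.0.1", 9000, timeout=300.0)
```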

Start LIBERO Client

The client requires a separate terminal session. We strongly recommend using tmux or screen for this process, as evaluation can take several hours to complete.

  1. Clone the LIBERO repo:
cd evaluate/libero
git clone https://github.com/Lifelong-Robot-Learning/LIBERO.git
  2. Create a new conda environment:
conda create -n libero python=3.8.13 -y
conda activate libero
  3. Install the required dependencies as specified:
cd LIBERO
pip install -r requirements.txt
pip install torch==1.11.0+cu113 torchvision==0.12.0+cu113 torchaudio==0.11.0 --extra-index-url https://download.pytorch.org/whl/cu113
pip install -e .

# Additional required packages
pip install json_numpy draccus
  4. Run the evaluation script on the client side:
cd ..
bash eval_libero.sh <TASK_SUITE> <SAVE_NAME> <SERVER_IP> <SERVER_PORT>

Arguments:

  • TASK_SUITE - Name of the task suite to evaluate, e.g., libero_spatial, libero_object, libero_goal, or libero_10 (default: libero_10)
  • SAVE_NAME - Identifier for saving evaluation results
  • SERVER_IP - IP address of the server (default: 127.0.0.1)
  • SERVER_PORT - Port number of the server (default: 9000)

See main.py for more options and details.
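To evaluate the four suites named above one after another, a small driver can build one `eval_libero.sh` command per suite. This is a sketch, not part of the repository: `build_eval_commands` and `run_all` are hypothetical helpers, the `<prefix>_<suite>` save name is just one possible convention, and it should be run from the directory containing `eval_libero.sh`. By default it only prints the commands.

```python
import subprocess

# Suite names taken from the argument list above.
TASK_SUITES = ["libero_spatial", "libero_object", "libero_goal", "libero_10"]


def build_eval_commands(save_prefix, server_ip="127.0.0.1", server_port=9000,
                        suites=TASK_SUITES):
    """Return one eval_libero.sh command (as an argument list) per task suite."""
    return [
        ["bash", "eval_libero.sh", suite, f"{save_prefix}_{suite}",
         server_ip, str(server_port)]
        for suite in suites
    ]


def run_all(commands, dry_run=True):
    """Print each command; execute it only when dry_run is False."""
    for cmd in commands:
        print(" ".join(cmd))
        if not dry_run:
            subprocess.run(cmd, check=True)  # stop at the first failing suite


commands = build_eval_commands("go1_eval")
run_all(commands, dry_run=True)
```

Setting `dry_run=False` runs the suites sequentially against the same server, which fits a long-lived tmux or screen session.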

Results

We report the performance of the GO-1 model and other baselines in the table below. Our model is fine-tuned jointly on all four task suites for 50k steps.

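| Model    | Libero Spatial | Libero Object | Libero Goal | Libero 10 | Average |
|----------|----------------|---------------|-------------|-----------|---------|
| GR00T N1 | 94.4           | 97.6          | 93.0        | 90.6      | 93.9    |
| π0       | 96.8           | 98.8          | 95.8        | 85.2      | 94.2    |
| GO-1 Air | 94.0           | 96.8          | 96.2        | 91.2      | 94.6    |
| GO-1     | 96.2           | 97.8          | 96.0        | 89.2      | 94.8    |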
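The README does not say how the Average column is computed. Assuming it is the unweighted mean of the four suite scores, rounded half-up to one decimal, the short check below reproduces the reported values. `Decimal` is used so that means such as 94.15 and 94.55 are rounded exactly rather than through binary floating point.

```python
from decimal import Decimal, ROUND_HALF_UP

# Per-suite scores copied from the table: Spatial, Object, Goal, Libero 10.
scores = {
    "GR00T N1": ["94.4", "97.6", "93.0", "90.6"],
    "π0": ["96.8", "98.8", "95.8", "85.2"],
    "GO-1 Air": ["94.0", "96.8", "96.2", "91.2"],
    "GO-1": ["96.2", "97.8", "96.0", "89.2"],
}


def average(values):
    """Unweighted mean, rounded half-up to one decimal (assumed convention)."""
    total = sum(Decimal(v) for v in values)
    return (total / len(values)).quantize(Decimal("0.1"), rounding=ROUND_HALF_UP)


averages = {model: average(values) for model, values in scores.items()}
for model, value in averages.items():
    print(f"{model}: {value}")
```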