Prepare the environment for evaluation

April 3, 2026 · View on GitHub

In this document, we outline the steps to set up the evaluation environment for our four tasks. Because we use two simulators for different tasks (Habitat‑sim for AR, ImageNav (IGNav), and AEQA; RLBench for Manipulation), we provide separate instructions for each.

Environment for Habitat-sim

For Habitat‑sim installation, you can follow the official Habitat‑sim documentation, or follow the steps below.

Install Habitat‑sim v0.2.5

# Create and activate a clean env (for Habitat 0.2.5)
conda create -n habitat025 python=3.9 cmake=3.14 -y
conda activate habitat025

# Install habitat-sim v0.2.5 (Bullet + headless/EGL if running on a server)
conda install -y -c conda-forge -c aihabitat \
  habitat-sim=0.2.5 \
  headless \
  withbullet

Install Habitat‑Lab v0.2.5 and other Python dependencies

The pins below match the Habitat 0.2.5 stack and CUDA 12.1 wheels for PyTorch 2.5.1.

# (from your project root)
mkdir -p src && cd src

#  pip install:
pip install \
  "numpy>=1.20,<1.24" \
  "torch==2.5.1" "torchvision==0.20.1" "torchaudio==2.5.1" \
  --extra-index-url https://download.pytorch.org/whl/cu121

pip install --upgrade \
  "imageio[ffmpeg]" \
  "exceptiongroup>=1.2" \
  "pydantic>=2.7,<3" \
  pandas==2.2.3 \
  jaxtyping==0.2.36 \
  tiktoken==0.9.0 \
  anthropic==0.49.0 \
  json-repair \
  tabulate==0.9.0 \
  tenacity==9.0.0 \
  pyequilib==0.5.8 \
  open3d==0.18.0 \
  openai \
  gdown

# Pin Habitat-Lab and Habitat-Baselines to the commit we used before
pip install -e "git+https://github.com/facebookresearch/habitat-lab.git@094d6be2f9d057e4781a68ae792132895fd4d3d0#egg=habitat_lab&subdirectory=habitat-lab" \
            -e "git+https://github.com/facebookresearch/habitat-lab.git@094d6be2f9d057e4781a68ae792132895fd4d3d0#egg=habitat_baselines&subdirectory=habitat-baselines"

Install the `open-eqa` subtree

cd ..
cd subtrees/open-eqa
pip install -e .

Compatibility note (Habitat‑Baselines)

There is a bug in the latest habitat_baselines that requires adjusting the default cubemap projection size. Modify:

File: src/habitat-baselines/habitat-baselines/habitat_baselines/common/obs_transformers.py

Change line 807:

def get_cubemap_projections(
    img_h: int = 256, img_w: int = 256
) -> List[CameraProjection]:

def get_cubemap_projections(
    img_h: int = 512, img_w: int = 512
) -> List[CameraProjection]:

where 512 matches the default depth image size in this codebase. If you need a different depth resolution or encounter any error, update these values accordingly.

↩︎ Back to Getting Started Checklist

Environment for VLM deployment

We use vLLM to deploy VLMs as policy models for evaluation.

Install vLLM

Note we use Qwen2.5‑VL‑72B‑Instruct‑AWQ as the default VLM in our experiments.

For Qwen2.5‑VL‑72B‑Instruct‑AWQ, we use:

# Create and activate a clean env (for vLLM)
conda create -n vllm python=3.9 -y
conda activate vllm

# Install vLLM and dependencies
pip install vllm==0.7.3 cloudpickle==3.1.1 dill==0.4.0

You can find our full package list at downstream/api_models/env_config/vllm.txt.

For InternVL3‑78B‑AWQ, we use:

# Create and activate a clean env (for vLLM)
conda create -n vllmnew python=3.11 -y
conda activate vllmnew

# Install vLLM and dependencies
pip install vllm==0.9.2 cloudpickle==3.1.1 dill==0.4.0

You can find our full package list at downstream/api_models/env_config/vllmnew.txt.

↩︎ Back to Getting Started Checklist

Environment for SAM2 / Grounding SAM2 deployment

# Create and activate a clean env (for SAM2)
conda create -n sam2 python=3.10 -y
conda activate sam2

Then follow the official installation instructions of SAM2. For the SAM2 environment we used, see downstream/api_models/env_config/sam2.txt for the full package list.

For Grounding SAM2, also install ultralytics in the same sam2 env:

conda activate sam2
pip3 install ultralytics==8.3.118

↩︎ Back to Getting Started Checklist

Environment for WIW-Manipulation

Create conda environment and install PyTorch

cd downstream/world-in-world-manip

conda create -n wow-manip1 python=3.9 -y
conda activate wow-manip1
pip install setuptools==75.6.0 
pip install torch==2.4.0 torchvision==0.19.0 torchaudio==2.4.0+cu121 --index-url https://download.pytorch.org/whl/cu121

Install CoppeliaSim for physical simulation

Remember to copy the following export commands to your .bashrc or .zshrc and replace $(pwd) with the absolute path to downstream/world-in-world-manip.

wget https://downloads.coppeliarobotics.com/V4_1_0/CoppeliaSim_Pro_V4_1_0_Ubuntu20_04.tar.xz
tar -xf CoppeliaSim_Pro_V4_1_0_Ubuntu20_04.tar.xz
rm CoppeliaSim_Pro_V4_1_0_Ubuntu20_04.tar.xz
export COPPELIASIM_ROOT=$(pwd)/CoppeliaSim_Pro_V4_1_0_Ubuntu20_04
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:$COPPELIASIM_ROOT
export QT_QPA_PLATFORM_PLUGIN_PATH=$COPPELIASIM_ROOT

Install PyRep and AMSolver

You should be able to run evaluations with VLM proposer after this step.

cd wiw_manip/envs
git clone https://github.com/stepjam/PyRep.git
cd PyRep
pip install -r requirements.txt
pip install -e .
cd ..
cp simAddOnScript_PyRep.lua $COPPELIASIM_ROOT
pip install -r requirements.txt
pip install -e .
cd ../..

pip install -r requirements.txt # other required packages

If you meet anything like qt.qpa.plugin: Could not find the Qt platform plugin "xcb", try pip uninstall opencv-python opencv-python-headless and pip install opencv-python-headless==4.11.0.86

Install 3D-Diffuser-Actor (Optional, needed for diff-base and diff-igenex)

mkdir src
cd src
git clone https://github.com/nickgkan/3d_diffuser_actor.git
cd 3d_diffuser_actor
pip install -e .
pip install openai-clip transformers==4.47.1 peft==0.11.1 diffusers==0.11.1 huggingface-hub==0.25.0
conda install dgl=2.4.0 -c dglteam/label/th24_cu121
cd ../..

Configure the required ckpt files for 3D-Diffuser-Actor

Download our pretrained checkpoints for 3D-Diffuser-Actor:

pip install -U gdown

gdown "https://drive.google.com/uc?export=download&id=1QTKzDZvRUh3pVi-0ui1TT-CW7jlwZc6V" -O insert_onto_square_peg.pth
gdown "https://drive.google.com/uc?export=download&id=1VZjtIEVdSVjpCYM824PKWTMXW6AcjQi8" -O push_buttons.pth
gdown "https://drive.google.com/uc?export=download&id=1XHKYOMj2D5txC8LZBzkWU38Xjo1i3Pv7" -O slide_block_to_color_target.pth

Then configure them in downstream/world-in-world-manip/wiw_manip/configs/paths.py.

Install Openpi Client (Optional, needed for openpi-base and openpi-igenex)

Install the package openpi-client following the instructions in OpenPI repo

Configure OpenPI server

You can create a separate environment for the OpenPI server.

First, download our finetuned checkpoint for OpenPI:

gdown --id 1_a4KmB9x16L6lXx3AdJ_9bAqWjinG9UU -O openpi_wow_rlbench.tar

tar -xf openpi_wow_rlbench.tar -C <your_path_to_openpi_checkpoint>

Then, follow the instructions in OpenPI repo with this configuration:

TrainConfig(
    name="pi05_wow_rlbench_infer",
    model=pi0_config.Pi0Config(
        action_horizon=15,
        pi05=True,
        paligemma_variant="gemma_2b_lora",
        action_expert_variant="gemma_300m_lora",
    ),
    data=LeRobotLiberoDataConfig(
        repo_id="openpi_wow_rlbench/libero",
        base_config=DataConfig(
            prompt_from_task=True,
        ),
        extra_delta_transform=True,
    ),
    weight_loader=weight_loaders.CheckpointWeightLoader("gs://openpi-assets/checkpoints/pi05_base/params"),
),

Command for reference:

uv run scripts/serve_policy.py --port 8011 policy:checkpoint --policy.config=pi05_wow_rlbench_infer --policy.dir=checkpoints/openpi_wow_rlbench

Setup LIBERO backend (Optional, needed for libero_object and libero_spatial)

If you want to run the libero_object / libero_spatial tasks, we recommend using a dedicated uv environment to setup the libero server.

From repository root:

git submodule update --init --recursive third_party/libero
cd downstream/world-in-world-manip

uv venv --python 3.8 .venv-libero
source .venv-libero/bin/activate

uv pip sync requirements_libero.txt --extra-index-url https://download.pytorch.org/whl/cu113 --index-strategy unsafe-best-match

cd ../..
uv pip install -e third_party/libero

export PYTHONPATH=$PYTHONPATH:$PWD/third_party/libero:$PWD/downstream/world-in-world-manip
export LIBERO_CONFIG_PATH=$PWD/.cache/libero

Start LIBERO environment server

After the setup above:

cd /path/to/world-in-world
source downstream/world-in-world-manip/.venv-libero/bin/activate
export PYTHONPATH=$PYTHONPATH:$PWD/third_party/libero:$PWD/downstream/world-in-world-manip
export LIBERO_CONFIG_PATH=$PWD/.cache/libero
cd downstream/world-in-world-manip
python scripts/libero_env_server.py --host=127.0.0.1 --port=8765

↩︎ Back to Getting Started Checklist

Environment for different WMs

We create separate conda environments for different WMs to avoid dependency conflicts. Below are the environments used in our experiments for reference. Because many inference scripts use diffusers, it may be possible to reuse one environment across multiple WMs if dependencies are compatible.

Model	Version	Pipeline	Inference Script	Env Config	Setup Reference	Notes
Cosmos-predict2 2B	zero-shot; post-trained	Diffusers	`../downstream/api_models/cosmos_model.py`	`../downstream/api_models/env_config/cosmos.txt`	See setup in `cosmos_model.py` (lines 1–8): L1–L8	Env reused by SVD for zero-shot inference
HunyuanVideo-I2V	zero-shot	Diffusers	`../downstream/api_models/hunyuan_model.py`	`../downstream/api_models/env_config/hunyuan.txt`	See setup in `hunyuan_model.py` (lines 1–18): L1–L18	—
LTX-Video 2B	zero-shot; post-trained	Diffusers	`../downstream/api_models/ltx_model.py`	`../downstream/api_models/env_config/LTXvideo.txt`	See setup in `ltx_model.py` (lines 1–9): L1–L9	—
Wan2.1-I2V-A14B-480P-Diffusers	zero-shot	Diffusers	`../downstream/api_models/wan_model.py`	`../downstream/api_models/env_config/wan.txt`	See setup in `wan_model.py` (lines 1–5): L1–L5	—
Wan2.2-5B-Diffusers	zero-shot	Diffusers	`../downstream/api_models/wan22_ti2v_model.py`	`../downstream/api_models/env_config/wan22.txt`	See setup in `wan22_ti2v_model.py` (lines 7–12): L7–L12	-
SVD-1.5B	zero-shot	Diffusers	`../downstream/api_models/svd_model.py`	(same as Cosmos)	—	Uses Cosmos env.
Wan2.2-5B-diffsynth	post-trained	DiffSynth	`../downstream/api_models/wan_model_diffsynth.py`	`../downstream/api_models/env_config/wan_diffsynth.txt`	See setup in `wan_model_diffsynth.py` (lines 1–5): L1–L5	Base env for Wan2.2-A14B-diffsynth and Wan2.1-14B-diffsynth.
Wan2.2-A14B-diffsynth	post-trained	DiffSynth	`../downstream/api_models/wan_model_diffsynth.py`	(same as Wan2.2-5B-diffsynth)	—	Shares env with 5B.
Wan2.1-14B-diffsynth	post-trained	DiffSynth	`../downstream/api_models/wan_model_diffsynth.py`	(same as Wan2.2-5B-diffsynth)	—	Shares env with 5B.
SE3DS	zero-shot	Custom	`../downstream/api_models/se3ds_model.py`	`../downstream/api_models/env_config/se3ds.txt`	Follow SE3DS README (Setup Instructions): https://github.com/google-research/se3ds?tab=readme-ov-file#setup-instructions	—
Pathdreamer	zero-shot	Custom	`../downstream/api_models/pathdreamer_model.py`	(same as SE3DS)	—	Shares env with SE3DS.
Navigation World Model	zero-shot	Custom	`../downstream/api_models/nwm_model.py`	`../downstream/api_models/env_config/nwm.txt`	Follow NWM README (Requirements): https://github.com/facebookresearch/nwm/?tab=readme-ov-file#requirements	—
SVD	post-trained	Custom	to be released	to be released	to be released	Placeholders in doc.

↩︎ Back to Getting Started Checklist