🚀 ProcTHOR RL 🚀

February 18, 2025 · View on GitHub

RL training scripts for learning an agent using ProcTHOR.

This codebase contains implementation of training and evaluation code used in ProcTHOR, Embodied-Codebook, and the LoCoBot experiments in PoliFormer.

💻 Installation 💻

🐳 Use Docker Image 🐳

Please refer to docker/README.md for more details.

🛠️ Manual Installation 🛠️

export MY_ENV_NAME=procthor-rl
export CONDA_BASE="$(dirname $(dirname "${CONDA_EXE}"))"
export PIP_SRC="${CONDA_BASE}/envs/${MY_ENV_NAME}/pipsrc"

conda create --name $MY_ENV_NAME python=3.9
conda activate $MY_ENV_NAME

pip install -r requirements.txt
pip install --no-cache-dir --extra-index-url https://ai2thor-pypi.allenai.org ai2thor==0+ca10d107fb46cb051dba99af484181fda9947a28
pip install --no-cache-dir torch==2.0.1 torchvision open_clip_torch objaverse objathor

🔍 Test prior package 🔍

Please test the package before running the training script:

python scripts/test_prior.py

You should see the loading of the procthor-10k dataset successfully. Otherwise, please make sure have your .git-credentials file in the root directory.

🏃‍♂️ Few Running Examples 🏃‍♂️

Training

Please note that we should now use CloudRendering for AI2THOR. If you find errors related to vulkan, please make sure install vulkan-tools, libvulkan1, and vulkan-utils correctly.️

Training without wandb logging:

python procthor_objectnav/main.py \
    experiment=procthor_objectnav/experiments/rgb_clipresnet50gru_ddppo \
    agent=locobot \
    target_object_types=robothor_habitat2022 \
    wandb.project=procthor-training \
    machine.num_train_processes=96 \
    machine.num_val_processes=4 \
    ai2thor.platform=CloudRendering \
    model.add_prev_actions_embedding=true \
    procthor.p_randomize_materials=0.8 \
    seed=100

Training with wandb logging:

export WANDB_API_KEY=YOUR_WANDB_API_KEY
python procthor_objectnav/main.py \
    experiment=procthor_objectnav/experiments/rgb_clipresnet50gru_ddppo \
    agent=locobot \
    target_object_types=robothor_habitat2022 \
    wandb.project=procthor-training \
    machine.num_train_processes=96 \
    machine.num_val_processes=4 \
    ai2thor.platform=CloudRendering \
    model.add_prev_actions_embedding=true \
    procthor.p_randomize_materials=0.8 \
    wandb.name=YOUR_WANDB_RUN_NAME \
    wandb.project=YOUR_WANDB_PROJECT_NAME \
    wandb.entity=YOUR_WANDB_ENTITY_NAME \
    callbacks=wandb_logging_callback \
    seed=100

Training with Codebook bottleneck:

python procthor_objectnav/main.py \
    experiment=procthor_objectnav/experiments/rgb_clipresnet50gru_codebook_ddppo \
    agent=locobot \
    target_object_types=robothor_habitat2022 \
    wandb.project=procthor-training \
    machine.num_train_processes=96 \
    machine.num_val_processes=4 \
    ai2thor.platform=CloudRendering \
    model.add_prev_actions_embedding=true \
    procthor.p_randomize_materials=0.8 \
    seed=100

You can change the default values of the codebook hyperparameters such as the codebook size (model.codebook.size and model.codebook.code_dim) and codebook dropout (model.codebook.dropout):

python procthor_objectnav/main.py \
    experiment=procthor_objectnav/experiments/rgb_clipresnet50gru_codebook_ddppo \
    agent=locobot \
    target_object_types=robothor_habitat2022 \
    wandb.project=procthor-training \
    machine.num_train_processes=96 \
    machine.num_val_processes=4 \
    ai2thor.platform=CloudRendering \
    model.add_prev_actions_embedding=true \
    procthor.p_randomize_materials=0.8 \
    model.codebook.size=256 \
    model.codebook.code_dim=10 \
    model.codebook.dropout=0.1 \
    seed=100

Training with DINOv2 visual encoder:

python procthor_objectnav/main.py \
    experiment=procthor_objectnav/experiments/rgb_dinov2gru_ddppo \
    agent=locobot \
    target_object_types=robothor_habitat2022 \
    wandb.project=procthor-training \
    machine.num_train_processes=96 \
    machine.num_val_processes=4 \
    ai2thor.platform=CloudRendering \
    model.add_prev_actions_embedding=true \
    procthor.p_randomize_materials=0.8 \
    seed=100

Training with DINOv2 ViTb + Transformer Encoder and Causal Transformer (PoliFormer):

python procthor_objectnav/main.py \
    experiment=procthor_objectnav/experiments/rgb_dinov2tsfm_ddppo \
    agent=locobot \
    target_object_types=robothor_habitat2022 \
    wandb.project=procthor-training \
    machine.num_train_processes=96 \
    machine.num_val_processes=4 \
    ai2thor.platform=CloudRendering \
    model.add_prev_actions_embedding=true \
    procthor.p_randomize_materials=0.8 \
    model.dino.model_type=dinov2_vitb14 \
    training.use_transformer_encoder=true \
    seed=100

💾 Download Pretrained Checkpoint 💾

Use scripts/download_ckpt.py to download the pretrained checkpoint:

python scripts/download_ckpt.py --save_dir YOUR_CKPT_DIR --ckpt_ids CKPT_ID

Options for CKPT_ID: CLIP-GRU, DINOv2-GRU, CLIP-CodeBook-GRU, DINOv2-CodeBook-GRU, DINOv2-ViTs-TSFM, DINOv2-ViTb-TSFM.

📊 Evaluation 📊

Evaluate in ArchitecTHOR, ProcTHOR-10k, iTHOR, or RoboTHOR:

export WANDB_API_KEY=YOUR_WANDB_API_KEY
python procthor_objectnav/main.py \
    experiment=procthor_objectnav/experiments/rgb_clipresnet50gru_ddppo \
    agent=locobot \
    target_object_types=robothor_habitat2022 \
    machine.num_train_processes=1 \
    machine.num_test_processes=20 \
    ai2thor.platform=CloudRendering \
    model.add_prev_actions_embedding=true \
    callbacks=wandb_logging_callback \
    seed=100 \
    eval=true \
    evaluation.tasks=["architecthor"] \
    evaluation.minival=false \
    checkpoint=YOUR_CHECKPOINT \
    wandb.name=YOUR_WANDB_RUN_NAME \
    wandb.project=YOUR_WANDB_PROJECT_NAME \
    wandb.entity=YOUR_WANDB_ENTITY_NAME \
    visualize=true \ # store qualitative results
    output_dir=YOUR_OUTPUT_DIR # dir to store both qualitative and quantitative results

evaluation.tasks can be architecthor, procthor-10k, ithor, or robothor. procthor_objectnav/experiments/rgb_clipresnet50gru_ddppo can be replaced with other experiments such as rgb_clipresnet50gru_codebook_ddppo, rgb_dinov2gru_ddppo, rgb_dinov2tsfm_ddppo, etc.

📚 Reference 📚

If you find this codebase useful, please consider citing ProcTHOR, Embodied-Codebook, PoliFormer, and AllenAct:

@inproceedings{deitke2022️,
  title={🏘️ ProcTHOR: Large-Scale Embodied AI Using Procedural Generation},
  author={Deitke, Matt and VanderBilt, Eli and Herrasti, Alvaro and Weihs, Luca and Ehsani, Kiana and Salvador, Jordi and Han, Winson and Kolve, Eric and Kembhavi, Aniruddha and Mottaghi, Roozbeh},
  booktitle={NeurIPS},
  year={2022}
}

@inproceedings{eftekhar2023selective,
  title={Selective Visual Representations Improve Convergence and Generalization for Embodied AI},
  author={Eftekhar, Ainaz and Zeng, Kuo-Hao and Duan, Jiafei and Farhadi, Ali and Kembhavi, Ani and Krishna, Ranjay},
  booktitle={ICLR},
  year={2024}
}

@inproceedings{zeng2024poliformer,
  title={Poliformer: Scaling on-policy rl with transformers results in masterful navigators},
  author={Zeng, Kuo-Hao and Zhang, Zichen and Ehsani, Kiana and Hendrix, Rose and Salvador, Jordi and Herrasti, Alvaro and Girshick, Ross and Kembhavi, Aniruddha and Weihs, Luca},
  booktitle={CoRL},
  year={2024}
}

@article{AllenAct,
  author = {Luca Weihs and Jordi Salvador and Klemen Kotar and Unnat Jain and Kuo-Hao Zeng and Roozbeh Mottaghi and Aniruddha Kembhavi},
  title = {AllenAct: A Framework for Embodied AI Research},
  journal = {arXiv preprint arXiv:2008.12760},
  year = {2020},
}