ICLR 2026 OC-STORM

May 29, 2026

Object-Centric World Models from Few-Shot Annotations for Sample-Efficient Reinforcement Learning [Paper Link]

Weipu Zhang, Adam Jelley, Trevor McInroe, Amos Storkey, Gang Wang

Work was initiated at the University of Edinburgh and completed at the Beijing Institute of Technology.

Watch our video demo above to see the amazing fights played by RL agents!

TL;DR: OC-STORM is an object-centric world-model RL framework that uses few-shot segmentation annotations to improve sample efficiency in Atari and Hollow Knight.

OC-STORM main figure

Environment installation

Create conda environment:
```
conda create -n oc-storm python=3.12
```
Activate environment:
```
conda activate oc-storm
```
Install Python dependencies:
```
pip install -r requirements.txt
```
Download CUTIE model weights and segmentation masks:

These assets are needed only for OC-STORM. You can skip this step if you only want to run STORM itself, for example on Hollow Knight.
```
bash scripts/download.sh
```
Afterwards, the folder feature_extractor/cutie/weights should contain coco_lvis_h18_itermask.pth and cutie-small-mega.pth, and the project root should contain a segmentation_masks folder. If that folder is missing, the .tar file was probably not extracted.
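As a quick sanity check of the layout described above, a small script along these lines can confirm that the three assets are in place. This is a minimal sketch: `check_assets` is a hypothetical helper and not part of the repository, and it only tests that the paths exist, not that the downloads are complete.

```bash
#!/usr/bin/env bash
# Sketch: verify that the downloaded assets are where this README expects them.
# check_assets is a hypothetical helper, not a script shipped with the repo.
check_assets() {
    local root="${1:-.}"   # project root; defaults to the current directory
    local missing=0
    local path
    for path in \
        "feature_extractor/cutie/weights/coco_lvis_h18_itermask.pth" \
        "feature_extractor/cutie/weights/cutie-small-mega.pth" \
        "segmentation_masks"
    do
        if [ -e "$root/$path" ]; then
            echo "found:   $path"
        else
            echo "missing: $path"
            missing=$((missing + 1))
        fi
    done
    # Succeed only when every expected path exists.
    [ "$missing" -eq 0 ]
}

# Run from the project root; print a hint instead of aborting if something is absent.
check_assets . || echo "Some assets are missing; re-run scripts/download.sh or download them manually."
```

Pass a different directory as the first argument if you are not in the project root, for example `check_assets /path/to/project`.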

Or download and extract manually if you prefer: coco_lvis_h18_itermask.pth | cutie-small-mega.pth | segmentation_masks.tar

For Atari games, environment setup is complete after this step.
For Hollow Knight installation and configuration: hollow_knight.md

Computational requirements

Most of our runs were conducted on NVIDIA RTX 3090/4090 GPUs, and we recommend using similar devices.

For Atari, a GPU with at least 11 GB of memory is preferred.
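To see whether a machine meets the Atari memory preference, the total memory reported by `nvidia-smi` can be compared against the 11 GB figure. The sketch below is illustrative only: `has_enough_memory` is a hypothetical helper, and interpreting "11 GB" as 11 × 10^9 bytes (about 10490 MiB) is an assumption chosen so that cards marketed as 11 GB are not rejected.

```bash
#!/usr/bin/env bash
# Sketch: compare a GPU's total memory (in MiB, as nvidia-smi reports it)
# against a minimum given in GB. has_enough_memory is a hypothetical helper.
has_enough_memory() {
    local total_mib="$1"
    local min_gb="${2:-11}"   # 11 GB is the preference stated for Atari
    # Reject empty or non-numeric input (e.g. an error message or "N/A").
    case "$total_mib" in
        ''|*[!0-9]*) return 2 ;;
    esac
    # Assumption: 1 GB = 10^9 bytes, converted to MiB with integer division.
    local min_mib=$((min_gb * 1000000000 / 1048576))
    [ "$total_mib" -ge "$min_mib" ]
}

if command -v nvidia-smi >/dev/null 2>&1; then
    # One CSV line per GPU, e.g. "<name>, <total MiB>"
    nvidia-smi --query-gpu=name,memory.total --format=csv,noheader,nounits |
    while IFS=, read -r name total_mib; do
        total_mib="${total_mib// /}"   # strip the space after the comma
        if has_enough_memory "$total_mib"; then
            echo "OK:        $name ($total_mib MiB)"
        else
            echo "Below 11 GB or unreadable: $name ($total_mib MiB)"
        fi
    done
else
    echo "nvidia-smi not found; cannot query GPU memory."
fi
```

A smaller GPU may still work, since the README states a preference rather than a hard limit; the check is only meant as a quick indicator.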

Mask generation

Following Cutie, the prompt masks are labeled with RITM, then converted to class-indexed PNG format.

Train, Evaluate, and Monitor

Train:

```
./scripts/train.sh
```

Evaluate:

```
./scripts/eval.sh
```

Monitor with TensorBoard:

```
./scripts/tensorboard.sh
```

Stop background training processes (WARNING: read this first and use at your own risk):

```
./scripts/kill.sh
```

Citation

@inproceedings{
    zhang2026objectcentric,
    title={Object-Centric World Models from Few-Shot Annotations for Sample-Efficient Reinforcement Learning},
    author={Weipu Zhang and Adam Jelley and Trevor McInroe and Amos Storkey and Gang Wang},
    booktitle={The Fourteenth International Conference on Learning Representations},
    year={2026},
    url={https://openreview.net/forum?id=qmEyJadwHA}
}