Orbis: Overcoming Challenges of Long-Horizon Prediction in Driving World Models

February 17, 2026

Official Implementation

Paper | Project Page

Arian Mousakhan, Sudhanshu Mittal, Silvio Galesso, Karim Farid, Thomas Brox
University of Freiburg

Teaser

Installation

git clone https://github.com/lmb-freiburg/orbis.git
cd orbis
conda env create -f environment.yml
conda activate orbis_env

Checkpoints

Checkpoints are available on Hugging Face.

Move the world model checkpoint into the relevant experiment directory, e.g.:

mv last.ckpt logs_wm/orbis_288x512/checkpoints

If you only need the tokenizer, move the tokenizer checkpoint into the relevant experiment directory, e.g.:

mv last.ckpt logs_tk/tokenizer_288x512/checkpoints

Define the paths to the tokenizer and world model directories:

export TK_WORK_DIR=PATH_TO_logs_tk
export WM_WORK_DIR=PATH_TO_logs_wm
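
Before launching anything, it can help to confirm that the variables are set and that the checkpoint sits where the scripts are likely to look for it. The following is a minimal sketch, not part of the Orbis codebase: `resolve_checkpoint` is a hypothetical helper, and the layout `<work dir>/<experiment>/checkpoints/last.ckpt` is an assumption inferred from the `mv` commands above.

```python
import os
from pathlib import Path


def resolve_checkpoint(env_var: str, experiment: str, environ=os.environ) -> Path:
    """Return the expected checkpoint path for an experiment.

    Hypothetical helper. Assumes the layout
    <work dir>/<experiment>/checkpoints/last.ckpt, inferred from the
    mv commands in this README rather than from the Orbis source.
    """
    work_dir = environ.get(env_var)
    if not work_dir:
        raise RuntimeError(f"{env_var} is not set; export it first")

    ckpt = Path(work_dir) / experiment / "checkpoints" / "last.ckpt"
    if not ckpt.is_file():
        raise FileNotFoundError(f"expected a checkpoint at {ckpt}")
    return ckpt
```

For example, `resolve_checkpoint("WM_WORK_DIR", "orbis_288x512")` would return the world model checkpoint path, or raise an error that names the missing variable or file.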

Autoregressive Video Generation (Roll-out)

To roll out using the example input frames, use:

python evaluate/rollout.py --exp_dir logs_wm/orbis_288x512 --num_gen_frames 120 --num_steps 30

Alternatively, you can either specify a configuration file for the inference data:

python evaluate/rollout.py --exp_dir STAGE2_EXPERIMENT_DIR --val_config val_config.yaml --num_gen_frames 120 --num_steps 30

or modify the frame paths in the default configuration file.
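
If you run roll-outs for several experiments or configurations, the two invocations above can be assembled programmatically. This is a minimal sketch: `build_rollout_command` is a hypothetical helper, it uses only the flags shown in this README, and its defaults simply mirror the example commands.

```python
import sys
from typing import List, Optional


def build_rollout_command(
    exp_dir: str,
    num_gen_frames: int = 120,
    num_steps: int = 30,
    val_config: Optional[str] = None,
) -> List[str]:
    """Assemble the argument list for evaluate/rollout.py.

    Hypothetical helper; only flags that appear in this README are used.
    """
    cmd = [sys.executable, "evaluate/rollout.py", "--exp_dir", exp_dir]
    if val_config is not None:
        # Optional configuration file describing the inference data.
        cmd += ["--val_config", val_config]
    cmd += ["--num_gen_frames", str(num_gen_frames), "--num_steps", str(num_steps)]
    return cmd
```

The returned list can be passed to `subprocess.run(cmd, check=True)` from the repository root, for example with `build_rollout_command("logs_wm/orbis_288x512")`.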

To reproduce the paper results, you can find the validation sets here.

Training

Tokenizer

You can train a first stage model (tokenizer) with the following command:

python main.py --base configs/stage1.yaml --name stage1 -t --n_gpus 1 --n_nodes 1

You can train the first stage in Image Folder style using the following command:

python main.py --base configs/stage1_if.yaml --name stage1 -t --n_gpus 1 --n_nodes 1

You may need to adapt a few paths in the config file to your own system, or you can override them from the CLI.

Features

In the config file:

cont_ratio_training: Sets the ratio at which vector quantization is bypassed, allowing the decoder to receive continuous (non-discrete) tokens.
only_decoder: If set to true, only the decoder is trained while other components remain frozen.
scale_equivariance: Enables scale-equivariance (SE) training. se_weight is defined as the SE loss temperature.
You can use callbacks/logging.py to calculate enc_scale for the tokenizer config.
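
To make the idea behind cont_ratio_training concrete, the sketch below shows one way a quantization bypass could work: each latent vector is passed through unchanged with probability `cont_ratio`, and otherwise snapped to its nearest codebook entry. This is an illustration only, not the Orbis implementation. The function names, the pure-Python codebook, and the per-token granularity of the bypass are all assumptions.

```python
import random
from typing import List, Sequence

Vector = Sequence[float]


def nearest_code(z: Vector, codebook: Sequence[Vector]) -> List[float]:
    """Return the codebook entry closest to z in squared Euclidean distance."""
    best = min(codebook, key=lambda c: sum((a - b) ** 2 for a, b in zip(z, c)))
    return list(best)


def maybe_quantize(
    latents: Sequence[Vector],
    codebook: Sequence[Vector],
    cont_ratio: float,
    rng: random.Random,
) -> List[List[float]]:
    """Quantize latents, bypassing quantization with probability cont_ratio.

    Assumption: the bypass decision is made independently per token.
    """
    out = []
    for z in latents:
        if rng.random() < cont_ratio:
            out.append(list(z))                    # continuous token
        else:
            out.append(nearest_code(z, codebook))  # discrete token
    return out
```

With `cont_ratio=0.0` every token is quantized, and with `cont_ratio=1.0` the decoder only ever sees continuous tokens; values in between mix the two.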

🔮 Video Generation Model

You can train a second stage model with the following command:

python main.py --base configs/stage2.yaml --name stage2 -t --n_gpus 1 --n_nodes 1

If you trained the tokenizer in Image Folder style, use the following command instead:

python main.py --base configs/stage2_if.yaml --name stage2 -t --n_gpus 1 --n_nodes 1

You may need to adapt a few paths in the config file to your own system, or you can override them from the CLI.

Additional points to consider

Two environment variables, $TK_WORK_DIR and $WM_WORK_DIR, refer to the tokenizer and world model directories, respectively. When they are set, experiment outputs are saved automatically in the specified directories.

Acknowledgement

This codebase builds upon several excellent open-source projects, including:

We sincerely thank the authors for making their work publicly available.

BibTeX

@article{mousakhan2025orbis,
  title={Orbis: Overcoming Challenges of Long-Horizon Prediction in Driving World Models},
  author={Mousakhan, Arian and Mittal, Sudhanshu and Galesso, Silvio and Farid, Karim and Brox, Thomas},
  journal={arXiv preprint arXiv:2507.13162},
  year={2025}
}