FRAPPE: Infusing World Modeling into Generalist Policies via Multiple Future Representation Alignment
February 24, 2026
:loudspeaker: News!
- [2026/2/20] We released our paper on arXiv.
Overview

Environment Setup
First, clone the RoboTwin repo and install it together with the required packages, following the RoboTwin documentation.
git clone https://github.com/RoboTwin-Platform/RoboTwin.git
cd RoboTwin
conda create -n frappe python=3.10 -y
conda activate frappe
bash script/_install.sh  # Install RoboTwin basic envs and CuRobo
bash script/_download_assets.sh  # Download assets (RoboTwin-OD, Texture Library and Embodiments)
Then set up the environment for FRAPPE.
# Make sure python version == 3.10
conda activate frappe
# Install pytorch
# Look up https://pytorch.org/get-started/previous-versions/ with your cuda version for a correct command
pip install torch==2.1.0 torchvision==0.16.0 --index-url https://download.pytorch.org/whl/cu121
# Install packaging
pip install packaging==24.0
pip install ninja
# Verify Ninja --> should return exit code "0"
ninja --version; echo $?
# Install flash-attn
pip install flash-attn==2.7.2.post1 --no-build-isolation
# Install other prerequisites
pip install -r requirements.txt
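Before moving on, it can help to confirm that the interpreter and the pinned packages match the commands above. The following is a minimal sketch and not part of the FRAPPE repo: `check_environment` and `installed_version` are hypothetical helpers that only read installed package metadata, so they run even when a package is missing. The distribution names are assumed to match the names used in the `pip install` commands.

```python
import sys
from importlib import metadata

# Versions pinned in the install commands above.
EXPECTED = {
    "torch": "2.1.0",
    "torchvision": "0.16.0",
    "packaging": "24.0",
    "flash-attn": "2.7.2.post1",
}


def installed_version(dist_name):
    """Return the installed version of a distribution, or None if absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


def check_environment(expected=EXPECTED):
    """Compare the running interpreter and installed packages with the pins.

    Returns a list of human-readable problems; an empty list means every
    check passed.
    """
    problems = []
    if sys.version_info[:2] != (3, 10):
        problems.append(
            f"Python 3.10 expected, found {sys.version_info[0]}.{sys.version_info[1]}"
        )
    for name, wanted in expected.items():
        found = installed_version(name)
        if found is None:
            problems.append(f"{name} is not installed")
        # Local build tags such as "+cu121" are ignored in the comparison.
        elif found.split("+")[0] != wanted:
            problems.append(f"{name}=={wanted} expected, found {found}")
    return problems


if __name__ == "__main__":
    for line in check_environment() or ["environment matches the pinned versions"]:
        print(line)
```

Run it inside the activated `frappe` environment. If you installed a different PyTorch build for your CUDA version, adjust `EXPECTED` accordingly.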
Then clone our repo into the policy directory of RoboTwin. The resulting directory structure is shown after the commands:
cd policy
git clone https://github.com/Jbo-Wang/frappe.git
RoboTwin
├── policy
│   ├── FRAPPE
│   └── other policies ...
Installation
Download the pretrained checkpoint and the encoders used in the training stage.
# In the RoboTwin ROOT directory
cd policy
mkdir weights
cd weights
mkdir RDT && cd RDT
# Download the models
huggingface-cli download google/t5-v1_1-xxl --local-dir t5-v1_1-xxl
huggingface-cli download google/siglip-so400m-patch14-384 --local-dir siglip-so400m-patch14-384
huggingface-cli download robotics-diffusion-transformer/rdt-1b --local-dir rdt-1b
# Teacher encoders
# Theia
huggingface-cli download theaiinstitute/theia-base-patch16-224-cdiv --local-dir theia-base-patch16-224-cdiv
# CLIP
huggingface-cli download laion/CLIP-ViT-H-14-laion2B-s32B-b79K --local-dir CLIP-ViT-H-14-laion2B-s32B-b79K
# ViT
huggingface-cli download google/vit-huge-patch14-224-in21k --local-dir vit-huge-patch14-224-in21k
# DINOv2
# Clone into dinov2-main so that the following cd command works
git clone https://github.com/facebookresearch/dinov2.git dinov2-main
cd dinov2-main
mkdir checkpoints && cd checkpoints
huggingface-cli download facebook/dinov2-base
Then update the paths of the teacher encoders (Theia, CLIP, ViT, DINOv2) in utils.py to match your local setup.
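A quick check that every download landed where expected can save a failed training launch. The sketch below is an illustration only and is not part of the FRAPPE repo: `missing_weight_dirs` is a hypothetical helper, and the directory names are taken from the download commands above, assuming they were all run inside `policy/weights/RDT`.

```python
from pathlib import Path

# Directory names taken from the download commands above, relative to
# policy/weights/RDT. "dinov2-main" is the directory those commands cd into.
EXPECTED_DIRS = [
    "t5-v1_1-xxl",
    "siglip-so400m-patch14-384",
    "rdt-1b",
    "theia-base-patch16-224-cdiv",
    "CLIP-ViT-H-14-laion2B-s32B-b79K",
    "vit-huge-patch14-224-in21k",
    "dinov2-main",
]


def missing_weight_dirs(weights_root, expected=EXPECTED_DIRS):
    """Return the expected sub-directories that are absent or empty."""
    root = Path(weights_root)
    missing = []
    for name in expected:
        target = root / name
        if not target.is_dir() or not any(target.iterdir()):
            missing.append(name)
    return missing


if __name__ == "__main__":
    # Assumes the script is run from the RoboTwin root directory.
    for name in missing_weight_dirs("policy/weights/RDT"):
        print(f"missing or empty: {name}")
```

The check only looks at directory presence, not file integrity, so a partially downloaded model would still pass.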
Getting Started
You can download the checkpoints from Hugging Face.
The directory structure will be as below:
flappe
├── checkpoints
│   ├── flappe_taskxxx
│   │   └── checkpoint-xxx
└── ...
We provide an inference example for our method (eval.sh).
Set model_name to the checkpoint file name under the ./checkpoints folder.
conda activate frappe
bash eval.sh
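To see which names are available for model_name, you can list what is under ./checkpoints. This is a minimal sketch under the assumption that the layout matches the tree shown above (one folder per run, each containing `checkpoint-*` sub-folders); `list_checkpoints` is a hypothetical helper, not a function from the repo.

```python
from pathlib import Path


def list_checkpoints(checkpoints_root="./checkpoints"):
    """Map each run folder under the root to its checkpoint sub-folder names.

    Names are sorted alphabetically. Returns an empty dict if the root
    directory does not exist.
    """
    root = Path(checkpoints_root)
    if not root.is_dir():
        return {}
    found = {}
    for run_dir in sorted(p for p in root.iterdir() if p.is_dir()):
        steps = sorted(c.name for c in run_dir.glob("checkpoint-*") if c.is_dir())
        found[run_dir.name] = steps
    return found


if __name__ == "__main__":
    for run_name, steps in list_checkpoints().items():
        print(run_name, steps)
```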
Training
We provide a training example for our method. Training has two stages:
- mid-training (finetune_mid.sh & model_config/mid_train.yml)
  For mid_train.yml, update the path of the pretrained checkpoints and the pretrained_model_name_or_path.
- post-training (finetune_post.sh & model_config/post_train.yml)
  For post_train.yml, update the path of the mid-training checkpoints, the pretrained_model_name_or_path, and the teacher encoder paths.
conda activate frappe
bash finetune_mid.sh # or bash finetune_post.sh
The default configurations match the experimental setup in our paper.
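Because both stages depend on hand-edited paths in the YAML files, a small pre-flight check can catch typos before a long job starts. The sketch below is hedged: `unresolved_paths` is a hypothetical helper, and apart from `pretrained_model_name_or_path` the key names in mid_train.yml and post_train.yml are not listed in this README, so you need to supply the keys you edited yourself.

```python
from pathlib import Path


def unresolved_paths(config, keys):
    """Return {key: value} for the given keys whose value is not an existing path.

    `config` is an already-parsed mapping (for example the loaded contents of
    mid_train.yml); nested mappings and lists are searched recursively.
    Keys that do not occur in the config are not reported.
    """
    bad = {}

    def visit(node):
        if isinstance(node, dict):
            for key, value in node.items():
                if key in keys and isinstance(value, str):
                    if not Path(value).expanduser().exists():
                        bad[key] = value
                else:
                    visit(value)
        elif isinstance(node, list):
            for item in node:
                visit(item)

    visit(config)
    return bad


if __name__ == "__main__":
    # Example with an inline mapping; the path below is a placeholder.
    example = {"pretrained_model_name_or_path": "policy/weights/RDT/rdt-1b"}
    print(unresolved_paths(example, {"pretrained_model_name_or_path"}))
```

To use it on the real files, parse the YAML first (for instance with PyYAML's `yaml.safe_load`) and pass the resulting dict. If a value is meant to be a Hub model ID rather than a local path, it will be reported as unresolved, which is expected.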
TODO List
- [x] Training and inference code on RoboTwin2.0
Contact
For further discussion and collaboration, please feel free to contact us via Email and WeChat:
| Author | Email | WeChat |
|---|---|---|
| Han Zhao | zhaohan34@westlake.edu.cn | este_zh |
| Jingbo Wang | guangtouchangkaishen@outlook.com | guangtouchangkaishen |
| Wenxuan Song | songwenxuan0115@gmail.com | swx0757 |
Acknowledgement
We thank the authors of these great works and open-source codebases: RDT and Theia.
Citation
If you find this work useful, please cite:
@article{zhao2026frappe,
  title={FRAPPE: Infusing World Modeling into Generalist Policies via Multiple Future Representation Alignment},
  author={Han Zhao and Jingbo Wang and Wenxuan Song and Shuai Chen and Yang Liu and Yan Wang and Haoang Li and Donglin Wang},
  journal={arXiv preprint arXiv:2602.17259},
  year={2026}
}