README.md

March 13, 2026

NOW Overview Diagram

Training and Inference

1. Data Preparation

1.1 Preparing Datasets

To acquire and process the public datasets, follow the instructions in NoMaD. To collect Habitat data, first install Habitat:

conda install habitat-sim==0.2.4 withbullet headless -c conda-forge -c aihabitat

Then run collect_datasets.py to collect trajectories in the format this project requires. We collect data from the Matterport3D (MP3D) dataset.

1.2 Dataset Structure

Your dataset must follow the directory structure below. If you are collecting a custom dataset, please organize it accordingly:

├── <dataset_name>
│   ├── <name_of_traj1>
│   │   ├── 0.jpg
│   │   ├── 1.jpg
│   │   ├── ...
│   │   ├── T_1.jpg
│   │   └── traj_data.pkl
│   ├── <name_of_traj2>
│   │   ├── 0.jpg
│   │   ├── ...
│   │   └── traj_data.pkl
│   ...
│   └── <name_of_trajN>
│       ├── 0.jpg
│       ├── ...
│       └── traj_data.pkl
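
Before training on a custom dataset, it can help to check that every trajectory folder matches the layout above. The following is a minimal sketch, not part of this repository: the helper names `check_trajectory` and `check_dataset` are hypothetical, and it assumes that `T_1` in the tree stands for the last frame index, so frames are numbered contiguously from 0. It does not inspect the contents of traj_data.pkl.

```python
import re
from pathlib import Path


def check_trajectory(traj_dir: Path) -> list[str]:
    """Return a list of problems found in one trajectory folder (empty if none)."""
    problems = []
    if not (traj_dir / "traj_data.pkl").is_file():
        problems.append(f"{traj_dir.name}: missing traj_data.pkl")
    # Collect the integer stems of files named like 0.jpg, 1.jpg, ...
    indices = sorted(
        int(p.stem) for p in traj_dir.glob("*.jpg") if re.fullmatch(r"\d+", p.stem)
    )
    if not indices:
        problems.append(f"{traj_dir.name}: no numbered .jpg frames")
    elif indices != list(range(len(indices))):
        # Assumption: frames must run 0, 1, ..., T with no gaps.
        problems.append(f"{traj_dir.name}: frame indices are not contiguous from 0")
    return problems


def check_dataset(dataset_dir: Path) -> list[str]:
    """Check every trajectory subfolder of a dataset directory."""
    problems = []
    for traj_dir in sorted(p for p in dataset_dir.iterdir() if p.is_dir()):
        problems.extend(check_trajectory(traj_dir))
    return problems
```

Calling `check_dataset(Path("<dataset_name>"))` returns an empty list when every trajectory folder passes these checks.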

1.3 Data Splitting

Use data_split.py to split your data into training and testing sets.

Split Training Data:

python data_split.py -i <path_to_train_data> -d <train_dataset_name>

Split Test Data:

python data_split.py -i <path_to_test_data> -d <test_dataset_name> -s 0
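
For intuition, a trajectory-level split could look like the sketch below. This is not the code in data_split.py. It assumes that `-s` sets the fraction of trajectories assigned to training, which the commands above suggest but do not state; the function name `split_trajectories`, the `seed` parameter, and the 0.8 default are all hypothetical.

```python
import random


def split_trajectories(traj_names, train_fraction=0.8, seed=0):
    """Shuffle trajectory names and split them into (train, test) lists."""
    if not 0.0 <= train_fraction <= 1.0:
        raise ValueError("train_fraction must be between 0 and 1")
    names = sorted(traj_names)  # sort first so the shuffle is reproducible
    random.Random(seed).shuffle(names)
    cut = int(len(names) * train_fraction)
    return names[:cut], names[cut:]


if __name__ == "__main__":
    train, test = split_trajectories([f"traj_{i}" for i in range(10)])
    print(len(train), "train /", len(test), "test")
```

Under this assumption, a fraction of 0 places every trajectory in the test list, which would match the role `-s 0` appears to play for test data.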

2. Training

Train the model using the provided configuration file:

python train.py --config config/config_shortcut_w_pretrain.yaml

3. Generation Performance Evaluation

Evaluation involves three steps: preparing ground truth frames, generating future frames, and calculating metrics (LPIPS, DreamSim, FID).

Step 1: Prepare Ground Truth Frames

python isolated_infer.py --exp logs/<run_name> --ckp latest --datasets <test_dataset_name> --gt 1

Step 2: Generate Future Frames

python isolated_infer.py --exp logs/<run_name> --ckp latest --datasets <test_dataset_name> --gt 0

Step 3: Calculate Metrics

python isolated_eval.py --gt_dir output/gt --exp_dir output/<run_name>_latest --datasets <test_dataset_name>
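
The three steps share the same run name, checkpoint, and dataset, so they can be assembled from one place. The sketch below only builds the command lines shown above; the function name `evaluation_commands` is hypothetical, and it assumes the generated frames are written to `output/<run_name>_<ckp>`, generalizing from the `_latest` suffix in Step 3.

```python
import shlex


def evaluation_commands(run_name, dataset, ckp="latest"):
    """Build the three evaluation commands as argument lists."""
    infer = ["python", "isolated_infer.py", "--exp", f"logs/{run_name}",
             "--ckp", ckp, "--datasets", dataset]
    return [
        infer + ["--gt", "1"],  # Step 1: prepare ground truth frames
        infer + ["--gt", "0"],  # Step 2: generate future frames
        # Step 3: compute metrics. The exp_dir naming is an assumption.
        ["python", "isolated_eval.py", "--gt_dir", "output/gt",
         "--exp_dir", f"output/{run_name}_{ckp}", "--datasets", dataset],
    ]


if __name__ == "__main__":
    for cmd in evaluation_commands("my_run", "my_test_set"):
        print(shlex.join(cmd))
```

Each list can be passed to `subprocess.run(cmd, check=True)` from the repository root to run the steps in order and stop at the first failure.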

4. Inference

Run Inference:

python inference.py

Citation

If you find this work useful in your research, please consider citing:

@misc{shen2026efficientmultimodalnavigationonestep,
      title={An Efficient and Multi-Modal Navigation System with One-Step World Model}, 
      author={Wangtian Shen and Ziyang Meng and Jinming Ma and Mingliang Zhou and Diyun Xiang},
      year={2026},
      eprint={2601.12277},
      archivePrefix={arXiv},
      primaryClass={cs.RO},
      url={https://arxiv.org/abs/2601.12277}, 
}
