Data and Weights Preparation
July 14, 2025
This guide walks you through preparing datasets and pretrained weights required for HERMES.
1. Prepare nuScenes Dataset
- First, create a data directory:
    mkdir -p data
- Place or symlink your nuScenes dataset to the data directory:
    ln -s /path/to/your/nuscenes data/nuscenes
- Download the .pkl files and mask_cam_img.jpg to data/nuscenes/:
    huggingface-cli download LMD0311/HERMES --include="*pkl" --local-dir ./
    mv ./data/*pkl ./data/nuscenes/
    huggingface-cli download LMD0311/HERMES --include="*mask_cam_img.jpg" --local-dir ./
    mv ./data/*jpg ./data/nuscenes/
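The symlink step above fails if it is run twice or if the source path is mistyped. A minimal sketch of a rerun-safe version follows. `link_nuscenes` is a hypothetical helper and not part of HERMES. Pass the source as an absolute path, since `ln -s` resolves a relative target from the link's own directory.

```shell
# link_nuscenes is a hypothetical helper, not part of HERMES.
# Usage: link_nuscenes /abs/path/to/nuscenes [link_path]
link_nuscenes() {
    local src="$1"
    local dst="${2:-data/nuscenes}"
    # Refuse to create a link that would point at nothing.
    if [ ! -d "$src" ]; then
        echo "error: source directory not found: $src" >&2
        return 1
    fi
    # Leave an existing directory or link untouched so reruns are harmless.
    if [ -e "$dst" ] || [ -L "$dst" ]; then
        echo "skip: $dst already exists"
        return 0
    fi
    mkdir -p "$(dirname "$dst")"
    ln -s "$src" "$dst"
}
```

For example, `link_nuscenes /path/to/your/nuscenes` run from the HERMES root creates `data/nuscenes` only if it is not already there.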
2. Prepare Text Annotations
- Download and unzip the following files into the data directory:
    huggingface-cli download LMD0311/HERMES --include="*zip" --local-dir ./
    unzip data/omnidrive_nusc.zip -d data/
    unzip data/NuInteractCaption.zip -d data/
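If the setup is rerun, `unzip` prompts before overwriting existing files. The sketch below skips archives that have already been extracted. `unzip_if_missing` is a hypothetical helper, and it assumes that each archive unpacks into a directory named after the archive (as the directory structure in this guide suggests).

```shell
# unzip_if_missing is a hypothetical helper, not part of HERMES.
# Assumption: foo.zip unpacks into a directory called foo.
# Usage: unzip_if_missing path/to/archive.zip [dest_dir]
unzip_if_missing() {
    local archive="$1"
    local dest="${2:-data}"
    local name
    name="$(basename "$archive" .zip)"
    # Already extracted: nothing to do.
    if [ -d "$dest/$name" ]; then
        echo "skip: $dest/$name already exists"
        return 0
    fi
    # Fail early with a clear message if the download did not happen.
    if [ ! -f "$archive" ]; then
        echo "error: archive not found: $archive" >&2
        return 1
    fi
    unzip -q "$archive" -d "$dest"
}
```

Under that assumption, `unzip_if_missing data/omnidrive_nusc.zip` and `unzip_if_missing data/NuInteractCaption.zip` could stand in for the two `unzip` commands.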
3. Prepare Pretrained Weights
a) Download InternVL2 Pretrained Weights
cd projects/mmdet3d_plugin/models/internvl_chat
mkdir -p pretrained
cd pretrained
huggingface-cli download OpenGVLab/InternVL2-2B --local-dir InternVL2-2B
b) Download Project Checkpoints
Run this from the HERMES root directory so that ckpt/ is created at the top level:
huggingface-cli download LMD0311/HERMES --include="ckpt/*" --local-dir ./
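Step a) leaves the shell inside the pretrained directory, while the directory structure in this guide places ckpt at the top level of HERMES. One way to avoid losing track of the working directory is to run each download in a subshell. This is a sketch only, and `run_in` is a hypothetical helper that is not part of HERMES.

```shell
# run_in is a hypothetical helper, not part of HERMES.
# Usage: run_in <dir> <command> [args...]
# Runs the command inside <dir> (created if needed) without changing
# the caller's working directory.
run_in() {
    local dir="$1"
    shift
    mkdir -p "$dir" || return 1
    # The parentheses start a subshell, so the cd does not leak out.
    ( cd "$dir" && "$@" )
}

# Example, run from the HERMES root (commented out: needs network access):
# run_in projects/mmdet3d_plugin/models/internvl_chat/pretrained \
#     huggingface-cli download OpenGVLab/InternVL2-2B --local-dir InternVL2-2B
# huggingface-cli download LMD0311/HERMES --include="ckpt/*" --local-dir ./
```

Because the `cd` happens in a subshell, the second download still runs from the HERMES root.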
Directory Structure
Your project directory should look like this after setup:
HERMES
├── data
│ ├── nuscenes
│ │ ├── mask_cam_img.jpg
│ │ ├── nuscenes_advanced_12Hz_infos_train.pkl
│ │ ├── nuscenes_masked_only_infos_temporal_train.pkl
│ │ ├── nuscenes_infos_temporal_train.pkl
│ │ └── nuscenes_infos_temporal_val.pkl
│ ├── omnidrive_nusc
│ └── NuInteractCaption
├── ckpt
│ ├── open_clip_convnext_base_w-320_laion_aesthetic-s13B-b82k.bin
│ ├── hermes_stage1.pth
│ ├── hermes_stage2_1.pth
│ ├── hermes_stage2_2.pth
│ └── hermes_final.pth
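The layout above can be checked with a short script before starting training. The following is a minimal sketch. `check_hermes_layout` is a hypothetical helper, not part of HERMES, and it tests only that the paths listed in the tree exist, not that the files are complete.

```shell
# check_hermes_layout is a hypothetical helper, not part of HERMES.
# Usage: check_hermes_layout [project_root]
# Prints one "missing: <path>" line per absent entry and returns
# non-zero if anything is missing.
check_hermes_layout() {
    local root="${1:-.}"
    local missing=0
    local p
    for p in \
        data/nuscenes/mask_cam_img.jpg \
        data/nuscenes/nuscenes_advanced_12Hz_infos_train.pkl \
        data/nuscenes/nuscenes_masked_only_infos_temporal_train.pkl \
        data/nuscenes/nuscenes_infos_temporal_train.pkl \
        data/nuscenes/nuscenes_infos_temporal_val.pkl \
        data/omnidrive_nusc \
        data/NuInteractCaption \
        ckpt/open_clip_convnext_base_w-320_laion_aesthetic-s13B-b82k.bin \
        ckpt/hermes_stage1.pth \
        ckpt/hermes_stage2_1.pth \
        ckpt/hermes_stage2_2.pth \
        ckpt/hermes_final.pth
    do
        if [ ! -e "$root/$p" ]; then
            echo "missing: $p"
            missing=1
        fi
    done
    return "$missing"
}
```

Running `check_hermes_layout` from the HERMES root prints nothing and returns zero when every listed path is present.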
Please refer to Usage.md for instructions on HERMES training and evaluation.