Usage Guide
July 13, 2025 ยท View on GitHub
We have tested HERMES on NVIDIA H20 servers and recommend using GPUs with at least 80GB memory for smooth operation.
We plan to release DeepSpeed deployment support in the future to further reduce GPU memory usage.
If you are using NVIDIA H20 96GB, you may remove some checkpoint usage in hermes.py and hermes_future_render_head.py for improved efficiency.
1. Training
Multi-GPU / Multi-Node Training
torchrun --nproc_per_node $PROC_PER_NODE \
--master_addr $MASTER_ADDR \
--master_port $MASTER_PORT \
--nnodes $NODE_COUNT \
--node_rank $NODE_RANK \
extra_tools/train.py \
projects/configs/hermes/stage3.py \
--work-dir ./work-dir \
--launcher pytorch \
--no-validate
Single-GPU Debugging
python extra_tools/train.py projects/configs/hermes/stage3.py --work-dir ./work-dir --no-validate
You can modify different config files to train different stages as needed.
Checkpoint Cleanup (Optional)
Since we use mmcv for training pipeline management, the saved checkpoints include the frozen LLM pretrained parameters, which can be large.
You can use our provided script to remove unnecessary parts and save storage space:
python extra_tools/ckpt_convertor.py ./path/to/your_custom_trained.pth --delete_optimizer --save_path path/to/your_custom_trained_cleaned.pth
Both --delete_optimizer and --save_path are optional.
2. Inference (Testing)
torchrun --nproc_per_node $PROC_PER_NODE \
--master_addr $MASTER_ADDR \
--master_port $MASTER_PORT \
--nnodes $NODE_COUNT \
--node_rank $NODE_RANK \
extra_tools/test.py \
projects/configs/hermes/stage3.py \
ckpt/hermes_final.pth \
--launcher pytorch
- The results for each scenario will be saved under
outputs/stage3/hermes_eval/results. - As nuScenes contains a large number of scenes, we recommend using multi-GPU/multi-node inference.
3. Evaluation
We evaluate the results by reading the saved JSON files and calculating Chamfer Distance and text metrics.
Install necessary libraries:
pip install pycocoevalcap nltk openai pyfiglet
python -c "import nltk; nltk.download('punkt'); nltk.download('wordnet'); nltk.download('punkt_tab')"
Run evaluation
python extra_tools/eval_hermes_results.py ./outputs/stage3/hermes_eval/results
Due to the nature of LLMs, inference results may vary slightly each time.
For more details and custom configuration, please refer to the config files.