GC-VLN: Instruction as Graph Constraints for Training-free Vision-and-Language Navigation

September 16, 2025

Paper | Project Page | Video

Hang Yin*, Haoyu Wei*, Xiuwei Xu†, Wenxuan Guo, Jie Zhou, Jiwen Lu‡

* Equal contribution † Project leader ‡ Corresponding author

We propose a unified 3D graph representation for zero-shot vision-and-language navigation. By modeling the instruction as graph constraints, we can solve for the optimal navigation path. Graph-based backtracking handles wrong exploration.

News

  • [2025/09/16]: The arXiv paper and project page are available. The code will be released in a few weeks!
  • [2025/08/01]: GC-VLN is accepted to CoRL 2025!

Relevant Work

Check out our scene graph-based zero-shot navigation series:

  • SG-Nav for zero-shot object-goal navigation.
  • UniGoal for zero-shot goal-oriented navigation.

Citation

@article{yin2025gcvln, 
      title={GC-VLN: Instruction as Graph Constraints for Training-free Vision-and-Language Navigation}, 
      author={Hang Yin and Haoyu Wei and Xiuwei Xu and Wenxuan Guo and Jie Zhou and Jiwen Lu},
      journal={arXiv preprint arXiv:2509.10454},
      year={2025}
}