VLMLight: Traffic Signal Control via Vision-Language Meta-Control and Dual-Branch Reasoning
July 10, 2025

Official implementation of VLMLight: Traffic Signal Control via Vision-Language Meta-Control and Dual-Branch Reasoning.
News
- [June 2025] Codebase open-sourced.
- [May 2025] Initial VLMLight preprint released on arXiv.
Overview
VLMLight presents a novel vision-language multimodal framework for adaptive traffic signal control, featuring:
- The first vision-based traffic control system that uses visual foundation models for scene understanding
- A dual-branch architecture combining fast RL policies with deliberative LLM reasoning
- Enhanced handling of safety-critical scenarios through multi-agent collaboration
Key Features
Image-Based Traffic Simulation
First multi-view visual traffic simulator enabling context-aware decision making:
| BEV | North | East | South | West |
|---|---|---|---|---|
| ![]() | ![]() | ![]() | ![]() | ![]() |
Dual-Branch Architecture
- Fast RL Policy: Efficient handling of routine traffic
- Deliberative Reasoning: Structured analysis for complex scenarios
- Meta-Controller: Dynamic branch selection based on real-time context
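The branch selection described above can be sketched in a few lines. This is a minimal illustration, not VLMLight's implementation: `SceneSummary`, `select_branch`, `decide`, and both policy functions are hypothetical names, and a simple rule stands in for the meta-controller's context-based decision.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class SceneSummary:
    """Hypothetical structured summary of one intersection (not VLMLight's schema)."""
    queue_lengths: dict[str, int]              # queued vehicles per approach
    emergency_approach: Optional[str] = None   # approach carrying an emergency vehicle, if any


Policy = Callable[[SceneSummary], str]


def longest_queue_policy(scene: SceneSummary) -> str:
    """Stand-in for the fast RL policy: serve the approach with the longest queue."""
    return max(scene.queue_lengths, key=scene.queue_lengths.get)


def priority_policy(scene: SceneSummary) -> str:
    """Stand-in for deliberative reasoning: serve the emergency approach first."""
    return scene.emergency_approach or longest_queue_policy(scene)


def select_branch(scene: SceneSummary) -> str:
    """Assumed rule: safety-critical scenes go to the deliberative branch."""
    return "deliberative" if scene.emergency_approach is not None else "fast"


def decide(scene: SceneSummary,
           fast: Policy = longest_queue_policy,
           deliberative: Policy = priority_policy) -> tuple[str, str]:
    """Return the chosen branch and the approach that should receive green."""
    branch = select_branch(scene)
    policy = deliberative if branch == "deliberative" else fast
    return branch, policy(scene)


routine = SceneSummary({"north": 3, "east": 7, "south": 1, "west": 2})
urgent = SceneSummary({"north": 3, "east": 7, "south": 1, "west": 2},
                      emergency_approach="south")
```

Keeping the routing decision separate from the two policies means either branch can be swapped out without touching the other.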
Safety-Critical Event Handling
Specialized pipeline for emergency vehicle prioritization:
Deliberative Reasoning policy for complex traffic in Massy.
Installation
- Install TransSimHub:
git clone https://github.com/Traffic-Alpha/TransSimHub.git
cd TransSimHub
pip install -e ".[all]"
- Install Qwen-Agent:
pip install -U "qwen-agent[gui,rag,code_interpreter,mcp]"
# Or use `pip install -U qwen-agent` for the minimal requirements.
# The optional requirements, specified in square brackets, are:
# [gui] for Gradio-based GUI support;
# [rag] for RAG support;
# [code_interpreter] for Code Interpreter support;
# [mcp] for MCP support.
Getting Started
VLMLight provides both English and Chinese implementations. The examples below use the English version. For the Chinese version, replace vlm_tsc_en with vlm_tsc_zh in all paths and commands.
1. Model Configuration
Configure your LLM/VLM endpoints in vlm_tsc_en/vlmlight_decision.py:
llm_cfg = {
    'model': 'Qwen/Qwen2.5-72B-Instruct-AWQ',
    'model_type': 'oai',
    'model_server': 'http://localhost:5070/v1',
    'api_key': 'token-abc123',
    'generate_cfg': {
        'top_p': 0.8,
    }
}  # Language Model
llm_cfg_json = {
    'model': 'Qwen/Qwen2.5-72B-Instruct-AWQ',
    'model_type': 'oai',
    'model_server': 'http://localhost:5070/v1',
    'api_key': 'token-abc123',
    'generate_cfg': {
        'top_p': 0.8,
        'response_format': {"type": "json_object"},
    }
}  # Language Model
vlm_cfg = {
    'model': 'Qwen/Qwen2.5-VL-32B-Instruct-AWQ',
    'model_type': 'qwenvl_oai',
    'model_server': 'http://localhost:5030/v1',
    'api_key': 'token-abc123',
    'generate_cfg': {
        'top_p': 0.8,
    }
}  # Vision Language Model
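Before running the pipeline, it can help to confirm that each configured endpoint is reachable and serves the expected model. The sketch below assumes the servers are OpenAI-compatible and implement the standard model-listing route (`GET <model_server>/models`). The helper names `models_url`, `served_models`, and `check_endpoint` are hypothetical and not part of VLMLight or Qwen-Agent.

```python
import json
import urllib.request


def models_url(cfg: dict) -> str:
    """Build the model-listing URL from a config dict's 'model_server' entry."""
    return cfg["model_server"].rstrip("/") + "/models"


def served_models(cfg: dict, timeout: float = 5.0) -> list[str]:
    """Ask the server which model IDs it serves (assumes an OpenAI-style response)."""
    request = urllib.request.Request(
        models_url(cfg),
        headers={"Authorization": f"Bearer {cfg['api_key']}"},
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        payload = json.load(response)
    return [entry["id"] for entry in payload.get("data", [])]


def check_endpoint(cfg: dict) -> bool:
    """Return True only if the server responds and lists the configured model."""
    try:
        return cfg["model"] in served_models(cfg)
    except (OSError, ValueError, KeyError):
        # Unreachable server, HTTP error, or an unexpected response body.
        return False


# Example config mirroring the structure shown above.
example_cfg = {
    "model": "Qwen/Qwen2.5-72B-Instruct-AWQ",
    "model_server": "http://localhost:5070/v1",
    "api_key": "token-abc123",
}
```

Calling `check_endpoint(llm_cfg)` and `check_endpoint(vlm_cfg)` once at startup surfaces a wrong port or model name before any simulation step runs.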
2. RL Policy Training
Train RL policies for baseline control:
cd rl_tsc
python train_rl_tsc.py
Pretrained models are available in rl_tsc/results:
| Hongkong YMT | France Massy | SouthKorea Songdo |
|---|---|---|
| ![]() | ![]() | ![]() |
3. Run VLMLight
Execute the decision pipeline:
cd vlm_tsc_en
python vlmlight_decision.py
Repository Structure
.
├── assets/                      # Visual assets for documentation
├── result_analysis/             # Trip information analysis tools
│   └── analysis_tripinfo.py     # Performance metric calculation
├── rl_tsc/                      # Reinforcement learning components
│   ├── _config.py               # RL training configuration
│   ├── eval_rl_tsc.py           # RL policy evaluation
│   ├── train_rl_tsc.py          # RL policy training
│   └── utils/                   # RL helper functions
├── sim_envs/                    # Traffic simulation scenarios
│   ├── France_Massy/            # Massy, France intersection
│   ├── Hongkong_YMT/            # YMT, Hong Kong intersection
│   └── SouthKorea_Songdo/       # Songdo, South Korea intersection
├── vlm_tsc_en/                  # English version implementation
│   ├── _config.py               # English agent configuration
│   ├── utils/                   # English processing utilities
│   └── vlmlight_decision.py     # English decision pipeline
└── vlm_tsc_zh/                  # Chinese version implementation
    ├── _config.py               # Chinese agent configuration
    ├── utils/                   # Chinese processing utilities
    └── vlmlight_decision.py     # Chinese decision pipeline
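For the trip information analysis step, a minimal sketch of the kind of metrics involved is shown here. It assumes the simulation writes a SUMO-style tripinfo XML file, in which each `<tripinfo>` element carries `duration`, `waitingTime`, and `timeLoss` attributes in seconds. The function `summarize_tripinfo` is a hypothetical name, and this is not the code in result_analysis/analysis_tripinfo.py.

```python
import xml.etree.ElementTree as ET
from statistics import mean


def summarize_tripinfo(xml_text: str) -> dict[str, float]:
    """Average per-trip metrics from SUMO-style tripinfo XML (assumed format)."""
    root = ET.fromstring(xml_text)
    trips = root.findall("tripinfo")
    if not trips:
        raise ValueError("no <tripinfo> elements found")
    return {
        "trips": float(len(trips)),
        "mean_travel_time": mean(float(t.get("duration")) for t in trips),
        "mean_waiting_time": mean(float(t.get("waitingTime")) for t in trips),
        "mean_time_loss": mean(float(t.get("timeLoss")) for t in trips),
    }


# Two made-up trips, used only to illustrate the input shape.
sample_xml = """
<tripinfos>
    <tripinfo id="veh0" duration="60.0" waitingTime="10.0" timeLoss="15.0"/>
    <tripinfo id="veh1" duration="100.0" waitingTime="30.0" timeLoss="45.0"/>
</tripinfos>
"""

summary = summarize_tripinfo(sample_xml)
```

To read a file on disk instead of a string, `ET.parse(path).getroot()` can replace `ET.fromstring(xml_text)`.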
Citation
If you find this work useful, please cite our paper:
@article{wang2025vlmlight,
  title={VLMLight: Traffic Signal Control via Vision-Language Meta-Control and Dual-Branch Reasoning},
  author={Wang, Maonan and Chen, Yirong and Pang, Aoyu and Cai, Yuxin and Chen, Chung Shue and Kan, Yuheng and Pun, Man-On},
  journal={arXiv preprint arXiv:2505.19486},
  year={2025}
}
Acknowledgements
We thank our collaborators from SenseTime and Shanghai AI Lab (in alphabetical order):
- Yuheng Kan
- Zian Ma
- Chengcheng Xu
for their contributions to the TransSimHub simulator development.
Contact
If you have any questions, please open an issue in this repository. We will respond as soon as possible.







