PMVis

January 26, 2026

This repository contains the code implementation for the paper "Towards Reliable Agent-based Progressive Text-to-Visualization Translation with Explicit Interaction and Verification Rules".

🏗️ System Architecture


Core Components

| Component | File | Description |
| --- | --- | --- |
| User Agent | `agents/user_agent.py` | Simulates user multi-turn questioning behavior |
| System Agent | `agents/system_agent.py` | LLM-based VQL generation |
| Clarification Agent | `agents/react_clarification_agent.py` | ReAct-based VQL clarification and quality assurance |
| Interaction Manager | `agents/interaction_manager.py` | Coordinates the three-agent interaction loop |
| Tool Manager | `agents/clarification_tools_v2.py` | Clarification toolset management |
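
The loop that the Interaction Manager coordinates can be pictured roughly as follows. This is a minimal sketch under stated assumptions, not the repository's implementation: `run_interaction`, the three callable signatures, and `max_turns` are hypothetical names chosen for illustration, and the agents are replaced by toy functions so that no LLM call is needed.

```python
from typing import Callable, Optional

# All names below are hypothetical stand-ins, not the classes in agents/.
AskFn = Callable[[int], Optional[str]]       # user agent: turn index -> question (None = stop)
GenerateFn = Callable[[list[str]], str]      # system agent: dialogue history -> candidate VQL
ClarifyFn = Callable[[str, list[str]], str]  # clarification agent: (VQL, history) -> checked VQL


def run_interaction(ask: AskFn, generate: GenerateFn, clarify: ClarifyFn,
                    max_turns: int = 3) -> list[str]:
    """Run a user -> system -> clarification loop and return one VQL per turn."""
    history: list[str] = []
    results: list[str] = []
    for turn in range(max_turns):
        question = ask(turn)
        if question is None:  # the simulated user has nothing more to ask
            break
        history.append(question)
        candidate = generate(history)
        results.append(clarify(candidate, history))
    return results


# Toy stand-ins so the sketch runs without any model behind it.
questions = ["show sales by region", "only for 2024"]
demo = run_interaction(
    ask=lambda t: questions[t] if t < len(questions) else None,
    generate=lambda h: f"VQL({' | '.join(h)})",
    clarify=lambda vql, h: vql.strip(),
)
```

Passing the agents in as plain callables keeps the loop itself easy to test; the real agents presumably carry more state than this.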

🛠️ Clarification Toolset

The Clarification Agent includes four built-in verification tools:

| Tool Name | Function |
| --- | --- |
| `sql_executor` | Checks that the SQL portion of the VQL executes on the database |
| `syntax_checker` | Validates VQL syntax correctness and compliance |
| `schema_validator` | Verifies that table and column names exist in the database schema |
| `intent_matcher` | Clarifies the user's true intent and double-checks the correctness of the generated VQL |
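
To make the first and third checks concrete, here is a small sketch of how an executability check and a schema check could be written against SQLite. It assumes the SQL portion has already been extracted from the VQL; `check_sql_executable` and `missing_columns` are hypothetical helpers, not the functions in `agents/clarification_tools_v2.py`.

```python
import sqlite3


def check_sql_executable(conn: sqlite3.Connection, sql: str) -> tuple[bool, str]:
    """Return (ok, message) by asking SQLite to plan the query without running it."""
    try:
        conn.execute(f"EXPLAIN QUERY PLAN {sql}")
        return True, "ok"
    except sqlite3.Error as exc:
        return False, str(exc)


def missing_columns(conn: sqlite3.Connection, table: str, columns: list[str]) -> list[str]:
    """Return the requested columns that `table` lacks (all of them if the table is absent)."""
    rows = conn.execute("SELECT name FROM pragma_table_info(?)", (table,)).fetchall()
    known = {name.lower() for (name,) in rows}
    return [c for c in columns if c.lower() not in known]


# Tiny in-memory database used only for this demonstration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, amount REAL)")

ok, _ = check_sql_executable(conn, "SELECT region, SUM(amount) FROM sales GROUP BY region")
bad, message = check_sql_executable(conn, "SELECT price FROM sales")
absent = missing_columns(conn, "sales", ["region", "price"])
```

Planning the query with `EXPLAIN QUERY PLAN` catches unknown tables and columns without reading any rows, which may be preferable on large databases; a tool that also needs the result set would have to execute the query itself.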

📁 Project Structure

PMVis/
├── agents/                          # Agent modules
│   ├── __init__.py                  # Module initialization
│   ├── user_agent.py                # User Agent
│   ├── system_agent.py              # System Agent
│   ├── react_clarification_agent.py # ReAct Clarification Agent
│   ├── interaction_manager.py       # Interaction Manager
│   └── clarification_tools_v2.py    # Clarification toolset
├── eval/                            # Evaluation modules
│   ├── gemini/                      # Gemini model evaluation
│   ├── gpt-4o-mini/                 # GPT-4o-mini model evaluation
│   ├── qwen/                        # Qwen model evaluation
│   └── user_study/                  # User study evaluation
├── main.py                          # Sequential main program
├── main_parallel.py                 # Parallel main program
├── parallel_config.py               # Parallel configuration
├── util.py                          # Utility functions
├── logger_config.py                 # Logger configuration
├── logger_config_parallel.py        # Parallel logger configuration
└── dataset.py                       # Dataset processing

🚀 Quick Start

Requirements

Python 3.10
Dependencies: openai, pandas, psutil, nltk, tqdm (sqlite3 is also used, but it ships with the Python standard library and needs no separate install)

Install Dependencies

pip install openai pandas nltk tqdm psutil

Configure API Keys

Create a config.py file and configure your API keys:

# API Configuration

# OpenAI API Configuration
OPENAI_BASE_URL = 'your_openai_base_url'
OPENAI_API_KEY = 'your_openai_api_key'

# Qwen API Configuration (Alibaba Cloud)
QWEN_BASE_URL = 'your_qwen_api_url'
QWEN_API_KEY = 'your_qwen_api_key'

# Default Model Configuration
DEFAULT_MODEL = 'qwen-plus'  # Options: 'gpt-4o-mini', 'gemini-2.5-flash-lite', 'qwen-plus'

# Database Path Configuration
DATABASE_PATH = 'source_dataset/databases'
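
As an illustration of how these settings might be consumed, the sketch below picks a base URL and API key from the model name. The prefix rule (names starting with `qwen` use the Qwen settings, everything else uses the OpenAI settings) and the helper `resolve_endpoint` are assumptions made for this example; the repository may route models differently.

```python
# Placeholder values mirroring config.py; in the project they would be imported from it.
OPENAI_BASE_URL = "your_openai_base_url"
OPENAI_API_KEY = "your_openai_api_key"
QWEN_BASE_URL = "your_qwen_api_url"
QWEN_API_KEY = "your_qwen_api_key"
DEFAULT_MODEL = "qwen-plus"


def resolve_endpoint(model: str) -> tuple[str, str]:
    """Pick (base_url, api_key) for a model name. The prefix rule is an assumption."""
    if model.startswith("qwen"):
        return QWEN_BASE_URL, QWEN_API_KEY
    return OPENAI_BASE_URL, OPENAI_API_KEY


base_url, api_key = resolve_endpoint(DEFAULT_MODEL)

# With the openai package installed, a client could then be created like this:
#   from openai import OpenAI
#   client = OpenAI(base_url=base_url, api_key=api_key)
```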

Run the Program

Sequential Execution

python main.py

Parallel Execution (Recommended)

python main_parallel.py

Configuration Options

You can adjust the following parameters in parallel_config.py:

# Processing Mode Configuration
MODE = "single_turn"          # "single_turn" or "multi_turn"
ENABLE_CLARIFICATION = True   # Enable/disable clarification feature

# Parallel Parameters Configuration
MANUAL_CONFIG = {
    'max_workers': 6,    # Number of parallel processes
    'max_items': 0,      # Maximum number of items to process (0 = all)
}
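
The two parallel parameters can be read as "how many processes" and "how many inputs". The sketch below shows one plausible way to apply them with the standard library; `select_items` and `run_parallel` are hypothetical helpers, and `main_parallel.py` may be organized differently.

```python
from concurrent.futures import ProcessPoolExecutor
from typing import Any, Callable, Sequence

MANUAL_CONFIG = {"max_workers": 6, "max_items": 0}


def select_items(items: Sequence[Any], max_items: int) -> list[Any]:
    """Apply the max_items cap; 0 (or a negative value) means 'process everything'."""
    return list(items) if max_items <= 0 else list(items[:max_items])


def run_parallel(worker: Callable[[Any], Any], items: Sequence[Any],
                 config: dict = MANUAL_CONFIG) -> list[Any]:
    """Map `worker` over the selected items using a pool of processes."""
    selected = select_items(items, config["max_items"])
    with ProcessPoolExecutor(max_workers=config["max_workers"]) as pool:
        # pool.map preserves input order in the returned results.
        return list(pool.map(worker, selected))
```

When using a process pool, `worker` has to be a picklable top-level function, and `run_parallel` should be called from under an `if __name__ == "__main__":` guard so that child processes do not re-run the entry point.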

Results

The main results reported in the paper for each LLM backbone are in the eval/ folder.

📝 Logging System

The system provides detailed logging:

Sequential Mode: All logs are written to the console and to a single log file
Parallel Mode: Each worker process writes to its own log file, which makes debugging easier

Log files are saved in the logs/ directory.
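
A per-worker log file of the kind described for parallel mode can be set up with the standard `logging` module. This is only a sketch: `get_worker_logger` and the `worker_<id>.log` naming scheme are hypothetical and may not match `logger_config_parallel.py`.

```python
import logging
import os


def get_worker_logger(worker_id: int, log_dir: str = "logs") -> logging.Logger:
    """Return a logger that writes to its own file under log_dir."""
    os.makedirs(log_dir, exist_ok=True)
    logger = logging.getLogger(f"worker_{worker_id}")
    logger.setLevel(logging.INFO)
    if not logger.handlers:  # avoid attaching a second handler on repeated calls
        path = os.path.join(log_dir, f"worker_{worker_id}.log")
        handler = logging.FileHandler(path, encoding="utf-8")
        handler.setFormatter(logging.Formatter("%(asctime)s %(levelname)s %(message)s"))
        logger.addHandler(handler)
    return logger
```

Each worker would call this once at start-up, for example `log = get_worker_logger(3)` followed by `log.info("started")`, so that its messages never interleave with those of other processes.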