MARS-SQL: A MULTI-AGENT REINFORCEMENT LEARNING FRAMEWORK FOR TEXT-TO-SQL

December 4, 2025

arXiv License: MIT Python

This repository contains the official implementation of MARS-SQL.

🧭 Overview

[Figure: MARS-SQL pipeline]

📚 Citation

@article{yang2025mars,
  title={MARS-SQL: A multi-agent reinforcement learning framework for Text-to-SQL},
  author={Yang, Haolin and Zhang, Jipeng and He, Zhitao and Fung, Yi R},
  journal={arXiv preprint arXiv:2511.01008},
  year={2025}
}

🚀 Implementation

1. Training

Environment Setup

Please refer to Mars-train/Install.md for detailed environment installation instructions using uv and ray.

Dataset Preparation

  1. Download the BIRD dataset (dev/train databases) from the official BIRD benchmark page.
  2. Unzip the dataset and note the absolute path to the database directory.
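Before moving on, it can help to confirm that the unzipped directory actually contains the databases. The following is a minimal sketch, assuming the BIRD databases are stored as `.sqlite` files somewhere under the database directory; the function name `list_sqlite_databases` is hypothetical and not part of this repository.

```python
from pathlib import Path


def list_sqlite_databases(db_root: str) -> list[str]:
    """Return the sorted names (without extension) of all .sqlite files under db_root.

    Assumption: each BIRD database ships as a single .sqlite file,
    possibly nested in a per-database subdirectory.
    """
    root = Path(db_root).expanduser().resolve()
    if not root.is_dir():
        raise FileNotFoundError(f"Database directory not found: {root}")
    # rglob searches recursively, so nested per-database folders are covered.
    return sorted(p.stem for p in root.rglob("*.sqlite"))
```

Calling `list_sqlite_databases("/absolute/path/to/databases")` with your own path should return a non-empty list; an empty list suggests the archive was unzipped to a different location than expected.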

โš™๏ธ Configuration (Crucial Step)

Before running the training script, you must update the BIRD database paths in the following three files to match your local setup:

  1. mars-train.sh
  2. Mars-train/verl/workers/agentic/llm_sql_agent/sqlact.py
  3. Mars-train/verl/workers/reward_manager/sql.py

Warning

Failure to update the database paths in all three locations will result in execution errors.
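Because a missed path only shows up once training is running, a quick check before launching may save time. The sketch below assumes that the three files are addressed relative to the repository root and that your database path appears in each of them as a literal string; both are assumptions, and the helper `files_missing_path` is hypothetical rather than part of the repository.

```python
from pathlib import Path

# The three files listed above, assumed to be relative to the repository root.
CONFIG_FILES = [
    "mars-train.sh",
    "Mars-train/verl/workers/agentic/llm_sql_agent/sqlact.py",
    "Mars-train/verl/workers/reward_manager/sql.py",
]


def files_missing_path(repo_root: str, db_path: str) -> list[str]:
    """Return the config files that do not exist or do not mention db_path."""
    missing = []
    for rel in CONFIG_FILES:
        candidate = Path(repo_root) / rel
        if not candidate.is_file():
            missing.append(rel)
            continue
        text = candidate.read_text(encoding="utf-8", errors="ignore")
        # Plain substring match: this only detects paths written literally.
        if db_path not in text:
            missing.append(rel)
    return missing
```

An empty return value means all three files mention the path. If a file builds the path from several pieces, the substring match will report it as missing even though it is configured correctly, so treat the result as a hint and not a guarantee.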

Run Training

Once configured, execute the training script:

bash mars-train.sh

2. Inference

We recommend running inference in a separate environment to avoid dependency conflicts.

Environment Setup

# (Optional, but recommended) Create and activate a new virtual environment
conda create -n mars-infer python=3.10 -y
conda activate mars-infer

# Install all required packages
cd MARS-SQL/Mars-inference
pip install -r requirements.txt

💾 Using Pre-trained Models

Our trained MARS-SQL models (based on Qwen-7B) are publicly available on Hugging Face. You can download and use these weights directly for inference by updating the model path in the inference script:

| Model Name | Description | Hugging Face Link |
| --- | --- | --- |
| Qwen-SQL-7B-bird_5turns_80step | Trained with 5 turns | Yanghl0526/Qwen-SQL-7B-bird_5turns_80step |
| Qwen-SQL-7B-bird_10turn | Trained with 10 turns | Yanghl0526/Qwen-SQL-7B-bird_10turn |

Run Inference

The following command will generate 16 trajectories for each question in the dataset:

bash inference.sh

The output will be saved as step80_bird_@16_turn5_test_result.parquet

📊 Evaluation

After generating the inference results (the parquet file), use the evaluation script to calculate metrics, replacing Bird_DB_PATH with the path to your BIRD database directory.

python evaluate_sql.py --input_file step80_bird_@16_turn5_test_result.parquet --db_path Bird_DB_PATH
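The internals of evaluate_sql.py are not reproduced here. As an illustration of the kind of check a Text-to-SQL evaluation on BIRD typically performs, the sketch below computes an execution match: the predicted and gold queries are run against the same SQLite database and their result sets are compared. The function names `run_query` and `execution_match` are hypothetical, and whether the repository's script uses exactly this comparison is an assumption.

```python
import sqlite3
from typing import Optional


def run_query(db_file: str, sql: str) -> Optional[set]:
    """Execute sql against db_file; return the rows as a set, or None on error."""
    conn = sqlite3.connect(db_file)
    try:
        return set(conn.execute(sql).fetchall())
    except sqlite3.Error:
        # Invalid SQL, missing tables, and similar failures count as no result.
        return None
    finally:
        conn.close()


def execution_match(db_file: str, predicted_sql: str, gold_sql: str) -> bool:
    """True if both queries run and return the same set of rows.

    Rows are compared as sets, so row order and duplicates are ignored.
    """
    gold = run_query(db_file, gold_sql)
    pred = run_query(db_file, predicted_sql)
    return gold is not None and pred is not None and pred == gold
```

Averaging `execution_match` over all questions gives an execution accuracy figure. Comparing as sets is a deliberate simplification: it tolerates different row orderings but also hides differences in duplicate rows.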