🦜 PARROT

November 11, 2025

Practical And Realistic BenchmaRk for crOss-system SQL Translation

The first comprehensive benchmark for evaluating cross-system SQL translation systems

Leaderboard • Documentation • Submit Results • Paper


📢 News

  • 09/2025: Our paper "PARROT: A Benchmark for Evaluating LLMs in Cross-System SQL Translation" has been accepted by NeurIPS 2025! 🎉
  • 05/2025: We have released PARROT-1.0 (28,003 translation pairs from 38 open-source benchmarks for extensive syntax testing) and published the leaderboard.

✨ Key Features

| 🎯 Comprehensive | 🔧 Production-Ready | 🧪 Well-Tested | 🌐 Multi-Dialect |
|---|---|---|---|
| 598 curated pairs from 38+ benchmarks | Real-world workloads & production data | Built-in validators & parsers | 10+ SQL dialects supported |

🌟 Why PARROT?

  • ✅ 598 Translation Pairs from 38+ public benchmarks and production-derived workloads
  • 🧠 Broad Dialect Coverage: PostgreSQL, MySQL, SQLite, Oracle, SQL Server, Db2, DuckDB, Trino, Hive, Snowflake, and more
  • 🧪 Built-in Validators: Comprehensive parsers and executability checks for multiple engines
  • 🛠️ Complete Toolkit: Preprocessing utilities and baseline translation tools included
  • 📊 Rigorous Evaluation: Multi-dimensional scoring (syntax and execution)
  • 🏆 Live Leaderboard: Track your progress and compete with the community

📤 Submissions

🏆 Ready to compete? Submit your system now!

Submission Process

  1. 📋 Prepare Outputs

    • Follow the example in Submission_Example/20250928_LLMTranslator_ExampleTeam.zip
    • Ensure proper folder structure and file formats
  2. 📖 Read Guidelines

    • Review Submission_Example/PARROT Submission Guidelines.md
    • Check format requirements and naming conventions
  3. 📝 Include System Description

    • Approach and methodology
    • Models and versions used
    • Rules and heuristics applied
    • Training data sources
    • Compute resources
  4. 🚀 Submit

    • Upload via the leaderboard site
    • Wait for evaluation results

📋 Requirements Checklist

  • Consistent model versions and random seeds
  • Clear indication of supported dialect pairs
  • Valid UTF-8 text file outputs
  • Exact versions of LLM prompts/rule files included
  • System description document included
  • Reproducibility instructions provided

⚠️ Important: Include exact versions of all dependencies, prompts, and rule files for reproducibility.
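
The UTF-8 requirement in the checklist can be checked locally before uploading. The snippet below is a minimal sketch and is not part of the PARROT toolkit: the function names and the `*.sql` glob are assumptions made for illustration, and the actual file layout is whatever the submission guidelines specify.

```python
from pathlib import Path


def is_valid_utf8(data: bytes) -> bool:
    """Return True if the byte string decodes cleanly as UTF-8."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True


def find_invalid_files(output_dir: str, pattern: str = "*.sql") -> list[Path]:
    """List files under output_dir (recursively) that are not valid UTF-8.

    The default pattern is a hypothetical choice; adjust it to the
    file types your submission actually contains.
    """
    return sorted(
        path
        for path in Path(output_dir).rglob(pattern)
        if path.is_file() and not is_valid_utf8(path.read_bytes())
    )
```

An empty list from `find_invalid_files("path/to/outputs")` means every matching file decoded without errors.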


🏁 Leaderboard Rules

| Rule | Description |
|---|---|
| ⏱️ Frequency | One submission per team per month (TBD) |
| 📝 Transparency | Disclose all training data and public resources |
| 🏷️ Documentation | Clearly mark manual rules or prompts |
| 🚫 Fairness | No test set contamination or hand-tuning |
| ✅ Verification | Results may be verified; additional materials may be requested |

🧱 Baselines

We recommend CrackSQL as an LLM-based baseline.

CrackSQL is a SQL dialect translation tool that combines rule-based strategies with LLMs to improve accuracy. It converts between dialects (e.g., PostgreSQL → MySQL) and can be used through a Python API, a command line, or a web interface.


🧪 Task Definition

Goal: Translate SQL from one database dialect to another while preserving semantic equivalence.

Input:  (source_dialect, target_dialect, source_sql)
Output: target_sql

Example

-- Source (PostgreSQL)
SELECT EXTRACT(YEAR FROM created_at) AS year, COUNT(*) 
FROM users 
WHERE age > 25 
GROUP BY EXTRACT(YEAR FROM created_at);

-- Target (MySQL)
SELECT YEAR(created_at) AS year, COUNT(*) 
FROM users 
WHERE age > 25 
GROUP BY YEAR(created_at);
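
To make the input/output contract concrete, here is a minimal sketch of the task interface in Python. It is not how CrackSQL or any PARROT baseline works: `TranslationTask` and `translate` are hypothetical names, and the single regex rule only covers the `EXTRACT(YEAR FROM ...)` rewrite from the example above.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class TranslationTask:
    """One benchmark input: (source_dialect, target_dialect, source_sql)."""
    source_dialect: str
    target_dialect: str
    source_sql: str


# Toy rule: EXTRACT(YEAR FROM expr) -> YEAR(expr).
# Handles only simple expressions without nested parentheses.
_EXTRACT_YEAR = re.compile(
    r"EXTRACT\s*\(\s*YEAR\s+FROM\s+([^()]+?)\s*\)", re.IGNORECASE
)


def translate(task: TranslationTask) -> str:
    """Return target_sql for the one dialect pair sketched here."""
    if (task.source_dialect, task.target_dialect) != ("postgresql", "mysql"):
        raise NotImplementedError("only postgresql -> mysql is sketched here")
    return _EXTRACT_YEAR.sub(r"YEAR(\1)", task.source_sql)


task = TranslationTask(
    source_dialect="postgresql",
    target_dialect="mysql",
    source_sql="SELECT EXTRACT(YEAR FROM created_at) AS year FROM users;",
)
print(translate(task))  # SELECT YEAR(created_at) AS year FROM users;
```

A real system has to handle far more than one function rewrite (types, quoting, pagination, procedural code), which is what the benchmark is designed to test.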

📊 Benchmark Statistics

| Metric | Count |
|---|---|
| Translation Pairs | 598 |
| Source Benchmarks | 38+ |
| SQL Dialects | 10+ |
| Supported Engines | 15+ |
| Domain Types | Single & Cross-domain |

📦 Benchmark Contents

PARROT/
├── 📁 benchmark/          # Source datasets from 38+ benchmarks
│   ├── Spider/           # Cross-domain SQL queries
│   ├── SParC/            # Multi-turn conversations
│   ├── BIRD/             # Complex real-world queries
│   ├── TPC-H FROID/      # UDF-heavy workloads
│   └── ...               # 34+ more benchmarks
├── 🔍 validator/         # Grammar parsers & validators
│   ├── pg_parser/        # PostgreSQL parser
│   ├── mysql_parser/     # MySQL parser
│   ├── oracle_parser/    # Oracle parser
│   └── ...               # 10+ more dialect parsers
├── ⚙️ processor/         # Preprocessing utilities
├── 🔄 translator/        # Baseline translation tools
└── 📤 Submission_Example/ # Submission templates

Supported Benchmarks

| Benchmark | Year | SQL Dialects | Language | Domain Type | Turn Round | Collection |
|---|---|---|---|---|---|---|
| ATIS | 1994 | SQLite, MySQL | English | Single-domain | Single | Manual |
| GeoQuery | 1996 | MySQL, SQLite | English | Single-domain | Single | Manual |
| Restaurants | 2000 | SQLite | English | Single-domain | Single | Manual |
| Academic | 2014 | Unspecified | English | Single-domain | Single | Manual |
| IMDb | 2017 | Unspecified | English | Single-domain | Single | Manual |
| Yelp | 2017 | Unspecified | English | Single-domain | Single | Manual |
| Scholar | 2017 | Unspecified | English | Single-domain | Single | Manual |
| WikiSQL | 2017 | SQLite | English | Cross-domain | Single | Manual |
| Advising | 2018 | SQLite, MySQL | English | Single-domain | Single | Manual |
| Spider | 2018 | SQLite | English | Cross-domain | Single | Manual |
| SParC | 2019 | SQLite | English | Cross-domain | Multiple | Manual |
| CoSQL | 2019 | SQLite | English | Cross-domain | Multiple | Manual |
| CSpider | 2019 | SQLite | Chinese | Cross-domain | Single | Manual |
| MIMICSQL | 2020 | SQLite | English | Single-domain | Single | Hybrid† |
| SQUALL | 2020 | SQLite | English | Cross-domain | Single | Manual |
| FIBEN | 2020 | IBM Db2, PostgreSQL | English | Single-domain | Single | Manual |
| ViText2SQL | 2020 | General SQL | Vietnamese | Cross-domain | Single | Manual |
| DuSQL | 2020 | Unspecified | Chinese | Cross-domain | Single | Hybrid† |
| PortugueseSpider | 2021 | SQLite | Portuguese | Cross-domain | Single | Hybrid† |
| CHASE | 2021 | SQLite | Chinese | Cross-domain | Multiple | Manual |
| Spider-Syn | 2021 | SQLite | English | Cross-domain | Single | Manual |
| Spider-DK | 2021 | SQLite | English | Cross-domain | Single | Manual |
| Spider-Realistic | 2021 | SQLite | English | Cross-domain | Single | Manual |
| KaggleDBQA | 2021 | SQLite | English | Cross-domain | Single | Manual |
| SEDE | 2021 | T-SQL | English | Single-domain | Single | Manual |
| MT-TEQL | 2021 | SQLite | English | Cross-domain | Single | Automatic |
| PAUQ | 2022 | SQLite | Russian | Cross-domain | Single | Manual |
| knowSQL | 2022 | Unspecified | Chinese | Cross-domain | Single | Manual |
| Dr.Spider | 2023 | SQLite | English | Cross-domain | Single | Hybrid† |
| BIRD | 2023 | SQLite | English | Cross-domain | Single | Manual |
| AmbiQT | 2023 | SQLite | English | Cross-domain | Single | LLM-aided |
| ScienceBenchmark | 2024 | General SQL | English | Single-domain | Single | Hybrid† |
| BookSQL | 2024 | SQLite | English | Single-domain | Single | Manual |
| Archer | 2024 | SQLite | English / Chinese | Cross-domain | Single | Manual |
| BULL | 2024 | SQLite | English / Chinese | Single-domain | Single | Manual |
| Spider2 | 2024 | SQLite, DuckDB, PostgreSQL | English | Cross-domain | Single | Manual |
| TPC-H FROID | 2018 | T-SQL, PostgreSQL | English | Cross-domain | Single | Hybrid† |
| DSB | 2021 | T-SQL, PostgreSQL | English | Decision Support | Single | Hybrid† |
| TPC-DS | 2005 | T-SQL, PostgreSQL | English | Decision Support | Single | Hybrid† |
| SQL-ProcBench | 2021 | SQL Server, PostgreSQL, IBM Db2 | English | Single-domain | Single | Production-derived |

† Hybrid means the dataset was created using both automatic generation and manual annotation.


🧮 Evaluation & Scoring

PARROT evaluates systems across two key dimensions:

| Dimension | Description |
|---|---|
| 🔍 Syntax Validity | Can the SQL be parsed by the target dialect? |
| ⚡ Execution Checks | Are the results equivalent when data is available? |
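
The two checks can be illustrated with Python's built-in `sqlite3` module. This is a sketch under simplifying assumptions, not PARROT's scoring code: SQLite stands in for both the source and the target engine, `is_parseable` and `results_equivalent` are hypothetical names, and result equivalence is approximated by comparing rows as multisets so that row order is ignored.

```python
import sqlite3
from collections import Counter


def is_parseable(conn: sqlite3.Connection, sql: str) -> bool:
    """Syntax check: ask the engine to plan the statement without running it.

    EXPLAIN also fails on unknown tables or columns, so this is a
    parse-and-bind check against the connection's schema.
    """
    try:
        conn.execute(f"EXPLAIN {sql}")
    except (sqlite3.Error, sqlite3.Warning):
        return False
    return True


def results_equivalent(
    source_conn: sqlite3.Connection,
    target_conn: sqlite3.Connection,
    source_sql: str,
    target_sql: str,
) -> bool:
    """Execution check: compare result rows as multisets (order ignored)."""
    source_rows = Counter(source_conn.execute(source_sql).fetchall())
    target_rows = Counter(target_conn.execute(target_sql).fetchall())
    return source_rows == target_rows


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER, age INTEGER)")
conn.executemany("INSERT INTO users VALUES (?, ?)", [(1, 30), (2, 20), (3, 40)])

print(is_parseable(conn, "SELECT id FROM users WHERE age > 25"))
print(results_equivalent(
    conn, conn,
    "SELECT id FROM users WHERE age > 25",
    "SELECT id FROM users WHERE NOT age <= 25",
))
```

In a cross-system setting the two connections would point at different engines loaded with the same data, and queries whose semantics depend on `ORDER BY` would need an order-sensitive comparison instead.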

📚 Citation

If you use PARROT in your research, please cite:

@inproceedings{zhou2025parrot,
  author       = {Wei Zhou and Guoliang Li and Haoyu Wang and Yuxing Han and Xufei Wu and Fan Wu and Xuanhe Zhou},
  title        = {PARROT: A Benchmark for Evaluating LLMs in Cross-System SQL Translation},
  booktitle    = {Advances in Neural Information Processing Systems (NeurIPS)},
  year         = {2025}
}

@article{zhou2025cracksql,
  author       = {Wei Zhou and Yuyang Gao and Xuanhe Zhou and Guoliang Li},
  title        = {Cracking SQL Barriers: An LLM-based Dialect Translation System},
  journal      = {Proceedings of the ACM on Management of Data},
  volume       = {3},
  number       = {3 (SIGMOD)},
  year         = {2025}
}

@article{zhou2025cracksqldemo,
  author       = {Wei Zhou and Yuyang Gao and Xuanhe Zhou and Guoliang Li},
  title        = {CrackSQL: A Hybrid SQL Dialect Translation System Powered by Large Language Models},
  journal      = {arXiv Preprint},
  url          = {https://arxiv.org/abs/2504.00882},
  year         = {2025}
}

📄 License

This project is released under the MIT License. See LICENSE file for details.


📬 Contact & Support

Questions? Feedback? Want to submit?

📧 Email: weizhoudb@sjtu.edu.cn

💬 Contributions: Issues and PRs are welcome!


🙏 Acknowledgments

Made with ❤️ by

Shanghai Jiao Tong University • Tsinghua University • Bytedance Team


โญ Star us on GitHub if you find this project useful!