XiYan-SQL

May 5, 2026

This is the new official Alibaba repository for XiYan-SQL, and the previous collection can be accessed here.

Latest News 🔥

  • Nov. 21, 2025 🌟 New SOTA on BIRD-CRITIC-Open: XiYan-SQL-CRITIC technique has achieved an impressive 44.37% success rate on BIRD-CRITIC-Open, a highly challenging multi-dialect benchmark, securing the top position with SOTA performance!

  • Oct. 30, 2025 🌟 We are excited to release the XiYan-SQL training framework XiYan-SQLTraining !!! This framework is primarily designed for the training of SQL/general LLMs and includes capabilities such as SQL data processing, model training, and evaluation as proposed by XiYan. We will continue to enhance the framework in the future.

  • Oct. 20, 2025 🌟 New SOTA on BIRD-CRITIC: XiYan-SQL-CRITIC technique has achieved a remarkable 44.53% success rate on the BIRD-CRITIC-PG benchmark, securing the top position with SOTA performance! Additionally, it recorded an impressive 48.5% success rate on the BIRD-CRITIC-Flash benchmark, also establishing a new SOTA performance.

  • Oct. 20, 2025 🌟 The training framework of XiYan-SQL, XiYan-SQLTraining, will soon be released in this official Alibaba repository. Stay tuned!

  • ...

🚀 Join Our Team

Our team is hiring interns and class-of-2027 graduates in Deep Research, LLM post-training, AI Agents, and NL2SQL. With solid technical expertise and cutting-edge algorithm R&D projects, we encourage academic research and top-tier conference publications. Candidates passionate about LLMs are welcome to send their resumes to zhencang.lyf@alibaba-inc.com.

Introduction

XiYan-SQL is an innovative natural language to SQL conversion framework designed to address the performance challenges of large language models in SQL generation tasks. This framework introduces a multi-generator ensemble strategy, enhancing SQL generation capabilities by integrating various SQL LLMs. XiYan-SQL employs multi-task and multi-format training strategies, producing high-quality and diverse SQL models. It also incorporates algorithms such as M-Schema, Schema Filter, and SQL candidate selection to achieve optimal SQL generation performance.
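
To make the ensemble idea concrete, here is a minimal sketch of how several generators can each propose a SQL candidate and one candidate can then be selected. It is an illustration only, not the XiYan-SQL implementation: this README does not describe the actual selection algorithm, so the sketch swaps in a simple majority vote over execution results. The names `Generator`, `execute`, and `select_candidate`, the toy `singer` table, and the lambda "generators" standing in for SQL LLMs are all hypothetical.

```python
import sqlite3
from collections import Counter
from typing import Callable, List, Optional, Tuple

# Hypothetical generator type: (question, schema text) -> one SQL string.
# In a real system each generator would be a different SQL LLM.
Generator = Callable[[str, str], str]


def execute(conn: sqlite3.Connection, sql: str) -> Optional[Tuple]:
    """Run a candidate and return an order-insensitive result, or None on error."""
    try:
        rows = conn.execute(sql).fetchall()
    except sqlite3.Error:
        return None
    return tuple(sorted(rows, key=repr))


def select_candidate(
    conn: sqlite3.Connection,
    question: str,
    schema: str,
    generators: List[Generator],
) -> Optional[str]:
    """Pick the first candidate whose execution result is the most common one.

    Assumption: majority voting stands in for the selection step here.
    """
    candidates = [generate(question, schema) for generate in generators]
    executed = [(sql, execute(conn, sql)) for sql in candidates]
    valid = [(sql, result) for sql, result in executed if result is not None]
    if not valid:
        return None
    votes = Counter(result for _, result in valid)
    majority_result, _ = votes.most_common(1)[0]
    return next(sql for sql, result in valid if result == majority_result)


# Toy database and stand-in generators (hypothetical data).
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE singer (id INTEGER PRIMARY KEY, name TEXT, age INTEGER);
    INSERT INTO singer VALUES (1, 'Ann', 30), (2, 'Bo', 41), (3, 'Cy', 25);
    """
)
schema = "singer(id, name, age)"
generators: List[Generator] = [
    lambda q, s: "SELECT COUNT(*) FROM singer WHERE age > 28",
    lambda q, s: "SELECT COUNT(id) FROM singer WHERE age >= 29",
    lambda q, s: "SELECT COUNT(*) FROM singers",  # fails: no such table
    lambda q, s: "SELECT COUNT(*) FROM singer",   # runs, but ignores the filter
]

chosen = select_candidate(
    conn, "How many singers are older than 28?", schema, generators
)
print(chosen)
```

Two of the four stand-in generators agree on the result, one fails to execute, and one returns a different count, so the vote picks the first of the two agreeing queries. A learned selection model could replace the vote without changing the surrounding structure.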

XiYan-SQL has achieved top rankings in several internationally recognized benchmarks, including BIRD-2023, BIRD-Critic, and Spider, demonstrating its robustness and effectiveness across different scenarios.

For developers, XiYan-SQL offers multiple models and corresponding source code, facilitating further research and application. Contributions to the XiYan-SQL project are welcome!

Timeline

Major events:

| Date | Event |
| --- | --- |
| 2025-11 | XiYan-SQL-CRITIC technique has achieved an impressive 44.37% success rate on BIRD-CRITIC-Open, a highly challenging real-world multi-dialect benchmark, securing the top position with SOTA performance! |
| 2025-10 | We are excited to release the XiYan-SQL training framework XiYan-SQLTraining !!! This framework is primarily designed for the training of SQL/general LLMs and includes capabilities such as SQL data processing, model training, and evaluation as proposed by XiYan. We will continue to enhance the framework in the future. |
| 2025-10 | XiYan-SQL-CRITIC technique has achieved a remarkable 44.53% success rate on the BIRD-CRITIC-PG benchmark, securing the top position with SOTA performance! Additionally, it recorded an impressive 48.5% success rate on the BIRD-CRITIC-Flash benchmark, also establishing a new SOTA performance. |
| 2025-09 | The download count for the XiYanSQL-QwenCoder series models on ModelScope has exceeded 100k, making it the most influential SQL model in the field. |
| 2025-05 | XiYanSQL-CRITIC algorithm achieves a 41% Pass Rate score on the BIRD-CRITIC-Flash benchmark, setting a new SOTA performance. |
| 2025-04 | We have released version 2504 of the XiYanSQL-QwenCoder series models, which features enhanced performance compared to the previous version. It still includes four different parameter sizes: 3B, 7B, 14B, and 32B. We encourage everyone to utilize these models. |
| 2025-02 | We have released the XiYanSQL-QwenCoder series model, which includes four different sizes: 3B, 7B, 14B, and 32B parameters, to meet the needs of different developers. |
| | XiYanSQL-QwenCoder-32B has been released |
| 2025-01 | XiYanSQL-QwenCoder-32B achieves an EX score of 69.03% on BIRD test, new SOTA using only single fine-tuned model |
| 2024-12 | Reaching the top of Bird leaderboard with an EX score of 75.63% and R-VES of 71.41 (new SOTA) |
| 2024-11 | Proposing XiYanSQL technology A Multi-Generator Ensemble Framework for Text-to-SQL |
| | Achieving 41.20% on NL2GQL, and a competitive score of 72.23% on Bird dev (bird) |
| | Achieving 89.65% on Spider test set (new SOTA), 69.86% on SQL-Eval (new SOTA) |
| 2024-10 | Proposing an SQL MoE model MoMQ |
| 2024-09 | Proposing DateSolver module |
| 2024-05 | Proposing M-schema, involving ICL in SQL generation |
| | Achieving 86.98% on Spider test set (SOTA 86.6%) |

XiYan-SQL Collection

...

Application

We invite everyone to try XiYan GBI, an intelligent data querying solution based on XiYan-SQL. Feedback on the product experience and suggestions for improvement are welcome.

For product introduction, please visit: https://help.aliyun.com/zh/model-studio/user-guide/brief-introduction-of-gbi-products

To try the product, please visit: https://bailian.console.aliyun.com/xiyan

Product DingTalk Group: 94725009401

Contact us:

If you are interested in our research or products, please feel free to contact us.

Contact Information:

Yifu Liu, zhencang.lyf@alibaba-inc.com

Join Our DingTalk Group

Ding Group

Star History

Star History Chart

Citation

If you find our work helpful, please consider citing it.

@ARTICLE{XiYanSQL,
  author={Liu, Yifu and Zhu, Yin and Gao, Yingqi and Luo, Zhiling and Li, Xiaoxia and Shi, Xiaorong and Hong, Yuntao and Gao, Jinyang and Li, Yu and Ding, Bolin and Zhou, Jingren},
  journal={IEEE Transactions on Knowledge and Data Engineering}, 
  title={XiYan-SQL: A Novel Multi-Generator Framework for Text-to-SQL}, 
  year={2026},
  volume={},
  number={},
  pages={1-14},
  doi={10.1109/TKDE.2026.3657851}}
@article{xiyansql_pre,
      title={A Preview of XiYan-SQL: A Multi-Generator Ensemble Framework for Text-to-SQL}, 
      author={Yingqi Gao and Yifu Liu and Xiaoxia Li and Xiaorong Shi and Yin Zhu and Yiming Wang and Shiqi Li and Wei Li and Yuntao Hong and Zhiling Luo and Jinyang Gao and Liyu Mou and Yu Li},
      year={2024},
      journal={arXiv preprint arXiv:2411.08599},
      url={https://arxiv.org/abs/2411.08599},
      primaryClass={cs.AI}
}