Fused: Improving Demonstration Diversity by Human-Free Fusing for Text-to-SQL
July 11, 2024
This repository contains code for the paper "Improving Demonstration Diversity by Human-Free Fusing for Text-to-SQL".
If you use Fused in your work, please cite it as follows:
@article{wang2024improving,
title={Improving Demonstration Diversity by Human-Free Fusing for Text-to-SQL},
author={Wang, Dingzirui and Dou, Longxu and Zhang, Xuanliang and Zhu, Qingfu and Che, Wanxiang},
journal={arXiv preprint arXiv:2402.10663},
year={2024}
}
Build Environment
conda create -n fused python=3.9 -y
conda activate fused
pip install -r requirements.txt
Download and put the Spider databases in ./dataset/Spider/database
Set your OpenAI API key in utils/generator.py if you want to use the OpenAI API to generate demonstrations.
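Before synthesizing, it can help to confirm that the databases are where the scripts expect them. The following is a minimal sketch, not part of this repository: the helper name `find_spider_databases` is hypothetical, and it assumes the standard Spider layout in which each database lives at `<db_id>/<db_id>.sqlite` under the database directory.

```python
from pathlib import Path


def find_spider_databases(root="./dataset/Spider/database"):
    """Map each db_id to its .sqlite file.

    Assumes the usual Spider layout: <root>/<db_id>/<db_id>.sqlite.
    Returns an empty dict if the root directory does not exist.
    """
    root_path = Path(root)
    found = {}
    if not root_path.is_dir():
        return found
    for db_dir in sorted(p for p in root_path.iterdir() if p.is_dir()):
        sqlite_file = db_dir / f"{db_dir.name}.sqlite"
        if sqlite_file.is_file():
            found[db_dir.name] = sqlite_file
    return found


if __name__ == "__main__":
    databases = find_spider_databases()
    print(f"Found {len(databases)} databases")
```

If the count is zero, the databases are probably in a different directory or still nested inside the downloaded archive's folder.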
Synthesize Demonstrations
Run generate/slurm/generate.bash to synthesize demonstrations with transformers, or generate/slurm/generate.35turbo.bash to synthesize them with the OpenAI API.
The synthesized demonstrations are saved in "./generate/examples/<model>/<scale>/Spider/<turn>/example.filt.json" in the following format:
[
...,
{
"reference": "List[Dict[str, Any]]: demonstrations used for fusing",
"table": "Dict[str, Any]: database used",
"query": "str: synthesized SQL query",
"question": "str: synthesized question"
},
...
]
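To inspect a synthesized file, a short loader along these lines may be useful. This is only a sketch based on the four fields listed above; `load_demonstrations` and `REQUIRED_KEYS` are hypothetical names, not functions provided by this repository.

```python
import json

# Field names taken from the format description above.
REQUIRED_KEYS = ("reference", "table", "query", "question")


def load_demonstrations(path):
    """Load a synthesized demonstration file and keep only well-formed entries.

    An entry is kept if it is a JSON object containing all four expected keys.
    """
    with open(path, encoding="utf-8") as f:
        entries = json.load(f)
    valid = []
    for entry in entries:
        if isinstance(entry, dict) and all(key in entry for key in REQUIRED_KEYS):
            valid.append(entry)
    return valid
```

For example, `load_demonstrations(path)` returns a list of dictionaries, and `demo["question"]` and `demo["query"]` give the synthesized question and SQL pair for each `demo` in that list.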
Text-to-SQL
Use text_to_sql/preprocess.py to convert the synthesized demonstrations into the demonstration pool format of ODIS.
You can then use ODIS to convert user questions into SQL.
Evaluate
It is recommended to evaluate the results with https://github.com/taoyds/test-suite-sql-eval.
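The sketch below shows one way to prepare input files for that evaluator. It assumes the file formats described in the test-suite-sql-eval README: one predicted SQL query per line in the prediction file, and one gold SQL query followed by a tab and the db_id per line in the gold file. The helper name `write_eval_files` is hypothetical, and how predictions are collected from ODIS is not covered here.

```python
def write_eval_files(predictions, golds, pred_path, gold_path):
    """Write prediction and gold files, one example per line.

    predictions: list of predicted SQL strings.
    golds: list of (gold_sql, db_id) pairs, in the same order as predictions.
    Whitespace inside each query is collapsed so every query stays on one line.
    """
    if len(predictions) != len(golds):
        raise ValueError("predictions and golds must have the same length")
    with open(pred_path, "w", encoding="utf-8") as pred_file:
        for sql in predictions:
            pred_file.write(" ".join(sql.split()) + "\n")
    with open(gold_path, "w", encoding="utf-8") as gold_file:
        for sql, db_id in golds:
            gold_file.write(" ".join(sql.split()) + "\t" + db_id + "\n")
```

The two files can then be passed to that repository's `evaluation.py` through its `--pred` and `--gold` options, together with `--db` pointing at the Spider database directory; check its README for the remaining options.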