Neural GEC Systems with Unsupervised Pre-Training on Synthetic Data
September 7, 2019
This repository contains the models, system configurations, and outputs of our winning GEC systems from the BEA 2019 shared task, described in R. Grundkiewicz, M. Junczys-Dowmunt, K. Heafield: Neural Grammatical Error Correction Systems with Unsupervised Pre-training on Synthetic Data, BEA 2019.
Citation
@inproceedings{grundkiewicz-etal-2019-neural,
title = "Neural Grammatical Error Correction Systems with Unsupervised Pre-training on Synthetic Data",
author = "Grundkiewicz, Roman and
Junczys-Dowmunt, Marcin and
Heafield, Kenneth",
booktitle = "Proceedings of the Fourteenth Workshop on Innovative Use of NLP for Building Educational Applications",
month = aug,
year = "2019",
address = "Florence, Italy",
publisher = "Association for Computational Linguistics",
url = "https://www.aclweb.org/anthology/W19-4427",
pages = "252--263",
}
Content
systems - original GEC systems developed for and submitted to the restricted and low-resource tracks
outputs - corrected outputs and evaluation scores for common GEC test sets
training - updated training scripts reproducing our GEC system
See the README files in each subdirectory for more information. If you have any questions, please open an issue or send me (Roman) an email.