Improved Word Sense Disambiguation with Enhanced Sense Representations
February 24, 2022 ยท View on GitHub
This repository contains codes and scripts to build enhanced sense representations for word sense disambiguation.
If you use this code for your work, please cite this paper:
@inproceedings{song-etal-2021-improved-word,
title = "Improved Word Sense Disambiguation with Enhanced Sense Representations",
author = "Song, Yang and
Ong, Xin Cai and
Ng, Hwee Tou and
Lin, Qian",
booktitle = "Findings of the Association for Computational Linguistics: EMNLP 2021",
year = "2021",
url = "https://aclanthology.org/2021.findings-emnlp.365",
pages = "4311--4320"
}
Requirements
- python==3.8.8
- pytorch==1.9.0
- transformers==4.6.1
- nltk==3.6.2
Downloading Datasets
You need to download the following datasets:
Setting up variables
You need to modify script/config.sh according to your environment.
Set data variable to the top directory where all the datasets are stored.
Processing FEWS
bash experiment/fews/run.sh
Using trained models
You can train the models from scratch. Alternatively, you can use our trained models.
Running Experiments
For ESR on SemCor with roberta-base:
bash experiment/esr/roberta-base/dataset_semcor/sd_42/run.sh
For ESR on SemCor with roberta-large:
bash experiment/esr/roberta-large/dataset_semcor/sd_42/run.sh
For ESR on SemCor and WNGC with roberta-base:
bash experiment/esr/roberta-base/dataset_semcor_wngc/sd_42/run.sh
For ESR on SemCor and WNGC with roberta-large:
bash experiment/esr/roberta-large/dataset_semcor_wngc/sd_42/run.sh
For ESR on FEWS with roberta-base:
bash experiment/esr/roberta-base/dataset_fews/sd_42/run.sh
For ESR on FEWS with roberta-large:
bash experiment/esr/roberta-large/dataset_fews/sd_42/run.sh