A PyTorch implementation of the R-BERT relation classification model

April 20, 2020

In addition to the SemEval-2010 dataset tested in the original paper, I also test the implementation on the more recent TACRED dataset.

Requirements:

$ git clone https://github.com/mickeystroller/R-BERT
$ cd R-BERT

The SemEval-2010 dataset is already included in this repo and you can directly run:

CUDA_VISIBLE_DEVICES=0 python r_bert.py --config config.ini

You first need to download the TACRED dataset from LDC; due to licensing restrictions, I cannot include it in this repo. Then you can run:

CUDA_VISIBLE_DEVICES=0 python r_bert.py --config config_tacred.ini

We use the official scoring script for SemEval-2010 Task 8:

$ cd eval
$ bash test.sh
$ cat res.txt

First, we generate the prediction file tac_res.txt:

$ python eval_tacred.py

You may change the test file and model paths in eval_tacred.py.

Then, we use the official scoring script for the TACRED dataset:

$ python ./eval/score.py -gold_file <TACRED_DIR/data/gold/test.gold> -pred_file ./eval/tac_res.txt

Below is the Macro-F1 score on SemEval-2010 Task 8:

Model	Original Paper	Ours
BERT-uncased-base	----	88.40
BERT-uncased-large	89.25	90.16
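
For readers unfamiliar with the metric, the sketch below shows roughly how a macro-averaged F1 of this kind can be computed. It is a simplified illustration, not the official scoring script: it assumes labels of the form `Cause-Effect(e1,e2)`, assumes the negative class is named `Other`, averages per-relation F1 over all relation types except `Other`, and counts a prediction as correct only when the direction matches as well. The function names `base_relation` and `macro_f1` are hypothetical, and the result is a fraction rather than a percentage.

```python
from collections import Counter

OTHER = "Other"  # assumed name of the negative class


def base_relation(label):
    """Strip the direction suffix: 'Cause-Effect(e1,e2)' -> 'Cause-Effect'."""
    return label.split("(", 1)[0]


def macro_f1(gold, pred, negative=OTHER):
    """Macro-averaged F1 over relation types, excluding the negative class.

    A prediction is correct only if the full directed label matches.
    """
    if len(gold) != len(pred):
        raise ValueError("gold and pred must have the same length")

    correct, guessed, actual = Counter(), Counter(), Counter()
    for g, p in zip(gold, pred):
        actual[base_relation(g)] += 1
        guessed[base_relation(p)] += 1
        if g == p:
            correct[base_relation(g)] += 1

    # Average over every relation type seen in gold or pred, except the negative one.
    relations = sorted((set(actual) | set(guessed)) - {negative})
    if not relations:
        return 0.0

    f1_scores = []
    for rel in relations:
        precision = correct[rel] / guessed[rel] if guessed[rel] else 0.0
        recall = correct[rel] / actual[rel] if actual[rel] else 0.0
        total = precision + recall
        f1_scores.append(2 * precision * recall / total if total else 0.0)
    return sum(f1_scores) / len(f1_scores)


if __name__ == "__main__":
    gold = ["Cause-Effect(e1,e2)", "Cause-Effect(e2,e1)", "Component-Whole(e1,e2)", "Other"]
    pred = ["Cause-Effect(e1,e2)", "Cause-Effect(e1,e2)", "Other", "Other"]
    print(f"Macro-F1: {macro_f1(gold, pred):.2f}")
```

Because every relation type contributes equally to the average, a rare relation that the model handles poorly lowers the score as much as a frequent one would.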

Below are the evaluation results on TACRED:

Model	Precision (Micro)	Recall (Micro)	F1 (Micro)
BERT-uncased-base	72.99	62.50	67.34
BERT-cased-base	71.27	64.84	67.91
BERT-uncased-large	72.91	66.20	69.39
BERT-cased-large	70.86	65.96	68.32
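
As a rough guide to how micro-averaged precision, recall, and F1 like those in the table above are typically obtained, here is a minimal sketch. It is not the official scoring script: it assumes one label per example, assumes the negative class is named `no_relation`, and pools counts over all other relations before computing the three scores. The function name `micro_prf` is hypothetical, edge-case handling (for example, when nothing is predicted) may differ from the official script, and the results are fractions rather than percentages.

```python
NO_RELATION = "no_relation"  # assumed name of the negative class


def micro_prf(gold, pred, negative=NO_RELATION):
    """Micro-averaged precision, recall, and F1 over non-negative labels."""
    if len(gold) != len(pred):
        raise ValueError("gold and pred must have the same length")

    correct = guessed = actual = 0
    for g, p in zip(gold, pred):
        if p != negative:
            guessed += 1          # the model predicted some relation
            if p == g:
                correct += 1      # and it was the right one
        if g != negative:
            actual += 1           # a relation was actually present

    precision = correct / guessed if guessed else 0.0
    recall = correct / actual if actual else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


if __name__ == "__main__":
    gold = ["per:title", "no_relation", "org:founded", "per:title"]
    pred = ["per:title", "per:title", "no_relation", "org:founded"]
    p, r, f = micro_prf(gold, pred)
    print(f"Precision: {p:.4f}  Recall: {r:.4f}  F1: {f:.4f}")
```

Since counts are pooled across relations before dividing, frequent relations dominate the micro-averaged score, unlike a macro average where each relation type weighs equally.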