A PyTorch Implementation of the R-BERT Relation Classification Model

April 20, 2020


This is an unofficial PyTorch implementation of the R-BERT model described in the paper Enriching Pre-trained Language Model with Entity Information for Relation Classification.
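For readers new to R-BERT: the paper prepares each input sentence by wrapping the first entity in `$` markers and the second entity in `#` markers, and by prepending `[CLS]`. Below is a minimal sketch of that step for SemEval-style `<e1>`/`<e2>` tagged sentences. The function name `mark_entities` is hypothetical and is not taken from this repo; the actual preprocessing in `r_bert.py` may differ.

```python
import re


def mark_entities(sentence: str) -> str:
    """Replace SemEval-style entity tags with R-BERT marker tokens.

    Hypothetical helper for illustration only, not part of this repo.
    """
    # Entity 1 is wrapped in '$' and entity 2 in '#'. Spaces are added
    # so that each marker ends up as a separate token.
    marked = re.sub(r"<e1>(.*?)</e1>", r"$ \1 $", sentence)
    marked = re.sub(r"<e2>(.*?)</e2>", r"# \1 #", marked)
    # [CLS] is prepended; its final hidden state represents the sentence.
    return "[CLS] " + marked


if __name__ == "__main__":
    print(mark_entities(
        "The <e1>kitchen</e1> is the last renovated part of the <e2>house</e2>."
    ))
```

In the paper, the hidden states between each pair of markers are averaged to form the two entity representations, which are then concatenated with the `[CLS]` representation before the final classifier.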

In addition to the SemEval-2010 dataset tested in the original paper, I also test this implementation on the more recent TACRED dataset.

Requirements:

Install

$ git clone https://github.com/mickeystroller/R-BERT
$ cd R-BERT

Train

SemEval-2010

The SemEval-2010 dataset is already included in this repo and you can directly run:

CUDA_VISIBLE_DEVICES=0 python r_bert.py --config config.ini

TACRED

You first need to download the TACRED dataset from LDC; due to licensing restrictions, I cannot include it in this repo. Then you can run:

CUDA_VISIBLE_DEVICES=0 python r_bert.py --config config_tacred.ini

Eval

SemEval-2010

We use the official scoring script for SemEval-2010 Task 8:

$ cd eval
$ bash test.sh
$ cat res.txt

TACRED

First, generate the prediction file tac_res.txt:

$ python eval_tacred.py

You can change the test file and model paths in eval_tacred.py.

Then, use the official scoring script for the TACRED dataset:

$ python ./eval/score.py -gold_file <TACRED_DIR/data/gold/test.gold> -pred_file ./eval/tac_res.txt
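To make the reported numbers easier to interpret, here is a minimal sketch of micro-averaged precision, recall and F1 in which `no_relation` is treated as the negative class. It is written from my understanding of how TACRED is usually scored and is not a copy of `score.py`; the function name `micro_prf` and the handling of empty denominators are assumptions.

```python
def micro_prf(gold, pred, negative="no_relation"):
    """Micro precision/recall/F1 over all labels except `negative`.

    Hypothetical helper for illustration; use the official script for
    reported results.
    """
    if len(gold) != len(pred):
        raise ValueError("gold and pred must have the same length")

    # A prediction counts as correct only if it is a real relation
    # (not the negative class) and matches the gold label.
    correct = sum(1 for g, p in zip(gold, pred) if p != negative and g == p)
    guessed = sum(1 for p in pred if p != negative)   # predicted relations
    actual = sum(1 for g in gold if g != negative)    # gold relations

    # Assumed conventions for empty denominators.
    precision = correct / guessed if guessed else 1.0
    recall = correct / actual if actual else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


if __name__ == "__main__":
    gold = ["per:title", "no_relation", "org:founded", "per:title"]
    pred = ["per:title", "per:title", "no_relation", "org:founded"]
    print(micro_prf(gold, pred))
```

Because `no_relation` predictions are never counted as correct, a model that predicts `no_relation` everywhere gets zero recall under this scheme.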

Results

SemEval-2010

Below is the Macro-F1 score

| Model | Original Paper | Ours |
| --- | --- | --- |
| BERT-uncased-base | ---- | 88.40 |
| BERT-uncased-large | 89.25 | 90.16 |
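As a reminder of what Macro-F1 means, the sketch below computes a per-class F1 and averages it over all classes except `Other`. It is a simplification: it treats every label string as its own class, whereas the official SemEval-2010 Task 8 scorer has its own handling of relation directionality. The function name `macro_f1` is hypothetical; use the official script in `eval` for reported numbers.

```python
from collections import Counter


def macro_f1(gold, pred, exclude=("Other",)):
    """Unweighted mean of per-class F1, skipping classes in `exclude`.

    Simplified illustration, not the official SemEval scorer.
    """
    tp, fp, fn = Counter(), Counter(), Counter()
    for g, p in zip(gold, pred):
        if g == p:
            tp[g] += 1
        else:
            fp[p] += 1  # predicted class p, but it was wrong
            fn[g] += 1  # gold class g was missed

    labels = sorted((set(gold) | set(pred)) - set(exclude))
    scores = []
    for label in labels:
        p_denom = tp[label] + fp[label]
        r_denom = tp[label] + fn[label]
        precision = tp[label] / p_denom if p_denom else 0.0
        recall = tp[label] / r_denom if r_denom else 0.0
        total = precision + recall
        scores.append(2 * precision * recall / total if total else 0.0)
    return sum(scores) / len(scores) if scores else 0.0


if __name__ == "__main__":
    gold = ["A", "A", "B", "Other"]
    pred = ["A", "B", "B", "Other"]
    print(macro_f1(gold, pred))
```

Unlike a micro average, every class contributes equally here, so rare relations weigh as much as frequent ones.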

TACRED

Below is the evaluation result

| Model | Precision (Micro) | Recall (Micro) | F1 (Micro) |
| --- | --- | --- | --- |
| BERT-uncased-base | 72.99 | 62.50 | 67.34 |
| BERT-cased-base | 71.27 | 64.84 | 67.91 |
| BERT-uncased-large | 72.91 | 66.20 | 69.39 |
| BERT-cased-large | 70.86 | 65.96 | 68.32 |
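The F1 column can be cross-checked against the precision and recall columns, since F1 is their harmonic mean. The short sketch below recomputes it from the values in the table; the helper name `f1_from_pr` is hypothetical. Because the table values are already rounded to two decimals, the recomputed F1 may differ from the reported one in the last digit.

```python
def f1_from_pr(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (0.0 if both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (precision, recall, reported F1) copied from the table above.
ROWS = {
    "BERT-uncased-base": (72.99, 62.50, 67.34),
    "BERT-cased-base": (71.27, 64.84, 67.91),
    "BERT-uncased-large": (72.91, 66.20, 69.39),
    "BERT-cased-large": (70.86, 65.96, 68.32),
}

if __name__ == "__main__":
    for name, (p, r, reported) in ROWS.items():
        print(f"{name}: recomputed {f1_from_pr(p, r):.2f}, reported {reported:.2f}")
```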

Reference

  1. https://github.com/wang-h/bert-relation-classification

  2. Enriching Pre-trained Language Model with Entity Information for Relation Classification.