Chinese Entity Tagging

July 27, 2020

Background

Entity tagging identifies pieces of text (“mentions”) and marks them with types such as Person, Organization, Geo-political Entity, Location, etc. In addition to proper names (“Bob”), mentions may also include nominals (“the player”).

Example

Input:

美国国防部长马蒂斯说，与首尔举行的名为“秃鹫”的军事演习每年春天在韩国进行，但2019年将“缩小规模”。

Output:

[美国]GPE国防部长[马蒂斯]PER说，与[首尔]GPE举行的名为“秃鹫”的军事演习每年春天在[韩国]GPE进行，但[2019年]TMP将“缩小规模”。
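The bracketed output above can be converted into character-offset mentions. The following is a minimal sketch, assuming that every mention is written as `[text]TYPE` with an uppercase ASCII type label, as in the example; the function name `parse_tagged` and the regular expression are illustrative and not part of any official tool.

```python
import re

# Assumption: a mention is "[text]" followed directly by an uppercase
# ASCII type label, e.g. "[美国]GPE". This mirrors the example only.
MENTION = re.compile(r"\[([^\[\]]+)\]([A-Z]+)")

def parse_tagged(tagged):
    """Return (plain_text, mentions).

    mentions is a list of (start, end, type) tuples holding character
    offsets into plain_text, with end exclusive.
    """
    pieces = []    # fragments of the untagged text
    mentions = []
    pos = 0        # read position in the tagged string
    length = 0     # number of plain-text characters emitted so far
    for m in MENTION.finditer(tagged):
        before = tagged[pos:m.start()]
        pieces.append(before)
        length += len(before)
        text, label = m.group(1), m.group(2)
        mentions.append((length, length + len(text), label))
        pieces.append(text)
        length += len(text)
        pos = m.end()
    pieces.append(tagged[pos:])
    return "".join(pieces), mentions
```

Because the type label is matched greedily, this sketch would misread a mention that is immediately followed by uppercase Latin text; a real annotation format would need an unambiguous delimiter.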

Standard Metrics

F-score for selecting the correct piece of text (“mention”) and assigning it the correct type.
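As a rough illustration of this metric, the sketch below computes exact-match precision, recall, and F-score over `(start, end, type)` tuples. It assumes a mention counts as correct only when both the span and the type match a gold mention; the official scorers for the individual tasks may differ in details, and the function name `span_f1` is hypothetical.

```python
def span_f1(gold, pred):
    """Exact-match precision, recall and F1 over (start, end, type) tuples.

    Assumption: a predicted mention is correct only if its span AND type
    both match a gold mention; partial overlaps earn no credit.
    """
    gold_set, pred_set = set(gold), set(pred)
    correct = len(gold_set & pred_set)
    precision = correct / len(pred_set) if pred_set else 0.0
    recall = correct / len(gold_set) if gold_set else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1
```

For example, if a system finds all three gold spans but mislabels one of them, two of three predictions are correct, so precision, recall, and F-score all equal 2/3.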

TAC-KBP / EDL Track (2015-2017).

The NIST TAC Knowledge Base Population (KBP) Entity Discovery and Linking (EDL) track includes Chinese entity tagging for 5 types: person (PER), geo-political entity (GPE), location (LOC), organization (ORG) and facility (FAC).

Shared task sites: http://nlp.cs.rpi.edu/kbp/2017 (likewise for 2015 and 2016)
Shared task writeups: http://nlp.cs.rpi.edu/paper/kbp2017.pdf (likewise for 2015 and 2016)
Note that KBP-EDL 2018 includes thousands of entity types, from the YAGO ontology, but test data is provided only in English.

Data for this evaluation is available from the Linguistic Data Consortium (LDC).

Test set	Size (documents)	Genre
TAC-KBP-EDL 2015	313 (train + eval)	News
TAC-KBP-EDL 2016	166	News
TAC-KBP-EDL 2017	167	News

Metrics

NERC F-score

Requires identifying both text-span and type of entity mention
2016 and 2017 tasks include both name and nominal mentions
Scoring code: http://nlp.cs.rpi.edu/kbp/2017/scoring.html (likewise for 2015 and 2016)

Results

System	TAC-KBP / EDL 2015 Names	TAC-KBP / EDL 2016 Names and nominals	TAC-KBP / EDL 2017 Names and nominals
Best anonymous system in shared task writeup	79.9	80.8	72.2

Resources

Ontonotes 5.0 (https://catalog.ldc.upenn.edu/LDC2013T19) from the Linguistic Data Consortium includes Chinese entity tagging.

698 articles Xinhua (1994-1998)
55 articles Information Services Department of HKSAR (1997)
132 articles Sinorama magazine, Taiwan (1996-1998 & 2000-2001)

ACE 2005.

ACE 2005 evaluates on seven entity types: Facility (FAC), Geopolitical Entity (GPE), Location (LOC), Organization (ORG), Person (PER), Vehicle (VEH), and Weapon (WEA).

Data for this evaluation was prepared by the Linguistic Data Consortium (LDC).

https://catalog.ldc.upenn.edu/LDC2006T06

A standard train/dev/test split does not seem to be available. Authors frequently split randomly 8:1:1 (Ju et al., 2018).
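A random 8:1:1 split can be sketched as follows. This is only an illustration: it assumes the split is made over document identifiers, and the function name `split_811` and the default seed are hypothetical. The exact procedure and seed used in any given paper are not specified here, so results from different random splits are not strictly comparable.

```python
import random

def split_811(doc_ids, seed=0):
    """Randomly split document ids into train/dev/test at roughly 8:1:1.

    Assumption: splitting is done per document; the seed is arbitrary.
    """
    ids = sorted(doc_ids)              # fixed order before shuffling
    random.Random(seed).shuffle(ids)   # reproducible for a given seed
    n_train = len(ids) * 8 // 10
    n_dev = len(ids) // 10
    train = ids[:n_train]
    dev = ids[n_train:n_train + n_dev]
    test = ids[n_train + n_dev:]       # remainder goes to test
    return train, dev, test
```

Sorting before shuffling keeps the result independent of the input order, so the same seed yields the same split across runs.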

Train + test set	Size (characters)	Genre
ACE 2005	325,834	Newswire, Broadcast News, Weblog

Results

System	F-score
Wang et al. (2020)	81.7
Huang et al. (2020)	81.7
Wang & Lu (2018)	73.00
Ju et al. (2018)	72.25

SIGHAN bakeoff 2006 NER MSRA.

This bakeoff evaluates entity taggers on three types of entities: Person (PER), Location (LOC), and Organization (ORG).

Paper summarizing the bakeoff:

Levow (2006)

Test set	Size (words)	Genre
SIGHAN 2006 NER MSRA	100,000	Newswire, Broadcast News, Weblog

Results

System	F-score
Liu et al. (2020)	95.7
Meng et al. (2019)	95.5
Ma et al. (2020)	95.4
Sun et al. (2020)	95.0
Yan et al. (2020)	94.1
Liu et al. (2019)	93.74
Sui et al. (2019)	93.47
Gui et al. (2019)	93.46
Zhang & Yang (2018)	93.18

Resources

The “closed” task restricts participants to use only the following training material:

Train set	Size (words)	Genre
SIGHAN 2006 NER MSRA	1.3M	Newswire, Broadcast News, Weblog

Weibo NER.

This social media entity tagging task includes GPE, ORG, LOC, and PER. It was introduced by

Peng & Dredze (2015)

Using the test split by http://www.aclweb.org/anthology/E17-2113:

Test set	Size (name mentions)	Size (nominal mentions)	Genre
Weibo NER	209	196	Social media (Weibo)

Results

System	F-score (name mentions)	F-score (nominal mentions)	F-score (Overall)
Ma et al. (2020)	70.9	67.0	70.5
Meng et al. (2019)	67.6
Hu and Zheng (2020)	56.4
Sui et al. (2019)	56.45	68.32	63.09
Gui et al. (2019)	55.34	64.98	60.21
Liu et al. (2019)	52.55	67.41	59.84
Zhu (2019)	55.38	62.98	59.31
Zhang & Yang (2018)	53.04	62.25	58.79
Peng & Dredze (2015)	55.28	62.97	58.99

Resources

Train & Dev data	Size (name mentions)	Size (nominal mentions)	Genre
Weibo NER train	--	--	Social media (Weibo)
Weibo NER dev	153	226	Social media (Weibo)

Also included are 112M unlabeled Weibo messages.

Other Resources

This paper presents an NER-annotated corpus in the genres of social media, human-computer interaction, and e-commerce:

Lu et al. (2018)

Suggestions? Changes? Please send email to chinesenlp.xyz@gmail.com