Format.md
October 6, 2020
Quick Start: Format Description
- The template of each line is as follows: `[KEYWORD] ':' [VALUE] [OPTIONS]`
- Check out the sample;
- Check out the reader example, written in Python.
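As a minimal sketch of the line template above, the keyword can be separated from the rest of the line at the first colon. The function name `split_line` is hypothetical and is not taken from the reader example mentioned above.

```python
def split_line(line):
    """Split an annotation line into (keyword, rest).

    Assumption: the keyword itself never contains ':'.
    """
    keyword, sep, rest = line.partition(":")
    if not sep:
        raise ValueError("line has no ':' separator: %r" % line)
    return keyword.strip(), rest.strip()


keyword, rest = split_line("Object: 'сша' b:(3,1) oi:[1] si:{0} d:ner <AUTH>")
print(keyword)  # Object
```

The remaining part (`rest`) holds the value and the options, whose layout depends on the keyword.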
Detailed Options Description
Options are present for the following keywords:
- Object;
- Attitude;
- FrameVariant.
For Object keyword
Object: 'сша' b:(3,1) oi:[1] si:{0} d:ner <AUTH>
- `b:(position, length)` -- beginning position description. `position` -- index of the word, starting from 0; `length` -- length of the related object in words.
- `oi:[line_index]` -- Object Index. `line_index` -- an unsigned int value, which is a reference to the Object;
- `si:{index}` -- Synonym Index. `index` -- index of the synonym group, if a synonym group exists; otherwise the value `-1` is used.
- `d:type` -- the method by which the named entity has been extracted, where `type` denotes:
  - `ner` -- the entity has been found using a NER tool (Bi-LSTM+CRF model) [paper];
  - `restored` -- the entity was missed by NER but restored using a list of authorized objects (see NOTE);
- `t:[TYPE]` -- entity type. `TYPE` -- one of the following: LOC, PER, ORG;
- `<AUTH>` -- optional; if present, denotes that the related object belongs to the list of authorized objects (see NOTE).
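The Object options could be read with a regular expression along these lines. This is a sketch, not the project's reader: the name `parse_object` and the dictionary keys are hypothetical. It assumes the options appear in the order of the sample line, and that `t:[TYPE]`, when present, sits between `d:` and `<AUTH>`, a placement the sample does not show.

```python
import re

# Assumption: options follow the order of the sample line;
# d:, t:[...] and <AUTH> are treated as optional.
OBJECT_RE = re.compile(
    r"^Object:\s*'(?P<value>[^']*)'"
    r"\s+b:\((?P<position>\d+),\s*(?P<length>\d+)\)"
    r"\s+oi:\[(?P<line_index>\d+)\]"
    r"\s+si:\{(?P<syn_index>-?\d+)\}"
    r"(?:\s+d:(?P<method>\w+))?"
    r"(?:\s+t:\[(?P<type>\w+)\])?"
    r"(?:\s+(?P<auth><AUTH>))?\s*$"
)


def parse_object(line):
    """Parse an 'Object' line into a dictionary of its fields."""
    m = OBJECT_RE.match(line.strip())
    if m is None:
        raise ValueError("not a valid Object line: %r" % line)
    return {
        "value": m.group("value"),
        "position": int(m.group("position")),      # word index, starts with 0
        "length": int(m.group("length")),          # length in words
        "line_index": int(m.group("line_index")),  # reference to the Object
        "syn_index": int(m.group("syn_index")),    # -1 if no synonym group
        "method": m.group("method"),               # 'ner' or 'restored'
        "type": m.group("type"),                   # LOC, PER, ORG or None
        "authorized": m.group("auth") is not None,
    }


obj = parse_object("Object: 'сша' b:(3,1) oi:[1] si:{0} d:ner <AUTH>")
print(obj["position"], obj["length"], obj["authorized"])  # 3 1 True
```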
NOTE: The list of authorized objects is presented in the form of relations, where each relation has the following format: `source->target`.
An object `x` belongs to the list of authorized objects when there is at least one relation with `source == x` or `target == x`.
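The membership rule from the NOTE can be sketched as follows. The function name `authorized_objects` and the example relations are hypothetical, and the relations are assumed to be plain `source->target` strings.

```python
def authorized_objects(relations):
    """Collect every object that occurs as a source or a target."""
    authorized = set()
    for relation in relations:
        source, sep, target = relation.partition("->")
        if not sep:
            raise ValueError("relation has no '->': %r" % relation)
        authorized.add(source.strip())
        authorized.add(target.strip())
    return authorized


# Hypothetical relations, for illustration only.
auth = authorized_objects(["сша->украина", "россия->сша"])
print("украина" in auth)  # True
print("китай" in auth)    # False
```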
For Attitude keyword
Attitude: 'сша'->'украина' b:(1) oi:[1, 2] si:{0,180}
- `b:(score)` -- sentiment score of the related attitude. `score` -- integer value; `1` -- positive, `-1` -- negative.
- `oi:[source, target]` -- Object Indices. `source` -- unsigned int value, which is a reference to the Object; `target` -- unsigned int value, which is a reference to the Object;
- `si:{source, target}` -- synonym indices of `source` and `target`. `source` -- index of the synonym group; `target` -- index of the synonym group.
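An Attitude line could be parsed in a similar way. This is only a sketch: `parse_attitude` and the dictionary keys are hypothetical names, and the options are assumed to follow the order of the sample line.

```python
import re

# Assumption: all three options are present, in the order of the sample.
ATTITUDE_RE = re.compile(
    r"^Attitude:\s*'(?P<source>[^']*)'->'(?P<target>[^']*)'"
    r"\s+b:\((?P<score>-?\d+)\)"
    r"\s+oi:\[(?P<oi_source>\d+),\s*(?P<oi_target>\d+)\]"
    r"\s+si:\{(?P<si_source>-?\d+),\s*(?P<si_target>-?\d+)\}\s*$"
)


def parse_attitude(line):
    """Parse an 'Attitude' line into a dictionary of its fields."""
    m = ATTITUDE_RE.match(line.strip())
    if m is None:
        raise ValueError("not a valid Attitude line: %r" % line)
    return {
        "source": m.group("source"),
        "target": m.group("target"),
        "score": int(m.group("score")),  # 1 -- positive, -1 -- negative
        "object_indices": (int(m.group("oi_source")), int(m.group("oi_target"))),
        "synonym_indices": (int(m.group("si_source")), int(m.group("si_target"))),
    }


att = parse_attitude("Attitude: 'сша'->'украина' b:(1) oi:[1, 2] si:{0,180}")
print(att["score"], att["object_indices"], att["synonym_indices"])
# 1 (1, 2) (0, 180)
```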
For FrameVariant keyword
FrameVariant: помогать (6, 1) b:[a0->a1[pos]] id:(0_8)
- `(position, length)` -- position description. `position` -- index of the word, starting from 0; `length` -- length of the related object in words.
- `b:[a0->a1[score]]` -- sentiment score of the relation `A0->A1` (according to RuSentiFrames). The `score` parameter could have one of the following values:
  - `pos` -- denotes a positive sentiment;
  - `neg` -- denotes a negative sentiment.
- `id:(id)` -- identifier in the RuSentiFrames lexicon. `id` -- key in the json dictionary of the RuSentiFrames lexicon.
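A FrameVariant line might be parsed as sketched below. The name `parse_frame_variant` and the dictionary keys are hypothetical. The sketch assumes the value is unquoted (as in the sample) and may span several words, so everything up to the `(position, length)` pair is taken as the value.

```python
import re

# Assumption: the value is unquoted and ends right before "(position, length)".
FRAME_VARIANT_RE = re.compile(
    r"^FrameVariant:\s*(?P<value>.+?)"
    r"\s+\((?P<position>\d+),\s*(?P<length>\d+)\)"
    r"\s+b:\[a0->a1\[(?P<score>pos|neg)\]\]"
    r"\s+id:\((?P<id>[^)]+)\)\s*$"
)


def parse_frame_variant(line):
    """Parse a 'FrameVariant' line into a dictionary of its fields."""
    m = FRAME_VARIANT_RE.match(line.strip())
    if m is None:
        raise ValueError("not a valid FrameVariant line: %r" % line)
    return {
        "value": m.group("value"),
        "position": int(m.group("position")),  # word index, starts with 0
        "length": int(m.group("length")),      # length in words
        "score": m.group("score"),             # 'pos' or 'neg' for A0->A1
        "frame_id": m.group("id"),             # key in the RuSentiFrames json
    }


fv = parse_frame_variant("FrameVariant: помогать (6, 1) b:[a0->a1[pos]] id:(0_8)")
print(fv["position"], fv["score"], fv["frame_id"])  # 6 pos 0_8
```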