Format.md

October 6, 2020

Quick Start: Format Description

  • The template of each line is as follows:
    [KEYWORD] ':' [VALUE] [OPTIONS]
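
As a minimal sketch of the template above, a line can be split at its first colon into the keyword and the remaining value-plus-options text. The function name `split_line` is hypothetical and is not part of the original tooling.

```python
def split_line(line):
    """Split "[KEYWORD]: [VALUE] [OPTIONS]" into (keyword, remainder).

    Hypothetical helper: only the first ':' is treated as the separator,
    because option names such as b:, oi: and si: also contain colons.
    """
    keyword, sep, rest = line.partition(":")
    if not sep:
        raise ValueError("no ':' separator in line: %r" % line)
    return keyword.strip(), rest.strip()


keyword, rest = split_line("Object: 'сша' b:(3,1) oi:[1] si:{0} d:ner <AUTH>")
print(keyword)
print(rest)
```

The remainder is left unparsed here because its layout depends on the keyword.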

Detailed Options Description

Options are described for the following keywords:

  1. Object;
  2. Attitude;
  3. FrameVariant.

For Object keyword
Object: 'сша' b:(3,1) oi:[1] si:{0} d:ner <AUTH>
  • b:(position, length) -- beginning position description.
    • position -- index of word, starts with 0;
    • length -- length of related object in words.
  • oi:[line_index] -- Object Index.
    • line_index -- unsigned int value that serves as a reference to the Object;
  • si:{index} -- Synonym Index.
    • index -- index of the synonym group if one exists; otherwise -1 is used.
  • d:type -- the method by which the named entity was extracted, where type denotes the method (ner in the example above);
  • t:[TYPE] -- entity type:
    • TYPE -- could be one of the following: LOC, PER, ORG;
  • <AUTH> -- optional; if present, denotes that the related object belongs to the list of authorized objects (see NOTE).
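
The Object options above can be read with a regular expression. The following is a sketch, not the original parser: it assumes the options appear in the order shown in the example line, that the value is wrapped in apostrophes, and that d:, t: and <AUTH> may each be absent. The names `OBJECT_RE` and `parse_object` are hypothetical.

```python
import re

# Assumed option order: b, oi, si, then optional d, t and <AUTH>.
OBJECT_RE = re.compile(
    r"^Object:\s*'(?P<value>[^']*)'"
    r"\s+b:\((?P<position>\d+),\s*(?P<length>\d+)\)"
    r"\s+oi:\[(?P<oi>\d+)\]"
    r"\s+si:\{(?P<si>-?\d+)\}"
    r"(?:\s+d:(?P<method>\S+))?"
    r"(?:\s+t:\[(?P<type>LOC|PER|ORG)\])?"
    r"(?P<auth>\s+<AUTH>)?\s*$"
)


def parse_object(line):
    """Return the fields of an Object line as a dict (hypothetical helper)."""
    m = OBJECT_RE.match(line)
    if m is None:
        raise ValueError("not a valid Object line: %r" % line)
    return {
        "value": m.group("value"),
        "position": int(m.group("position")),  # word index, starts with 0
        "length": int(m.group("length")),      # length in words
        "oi": int(m.group("oi")),              # object index
        "si": int(m.group("si")),              # synonym group index, -1 if none
        "method": m.group("method"),           # None when d: is absent
        "type": m.group("type"),               # None when t: is absent
        "auth": m.group("auth") is not None,   # True when <AUTH> is present
    }


obj = parse_object("Object: 'сша' b:(3,1) oi:[1] si:{0} d:ner <AUTH>")
print(obj)
```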

NOTE: The list of authorized objects is presented as a set of relations, where each relation has the following format: source->target.

Object x belongs to the list of authorized objects when there is at least one relation with source == x or target == x.
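
The membership rule in the NOTE can be sketched as follows. The function names and the sample relation are illustrative assumptions; the document does not say how the relation list itself is stored.

```python
def parse_relation(text):
    """Split a 'source->target' relation into a (source, target) pair."""
    source, sep, target = text.partition("->")
    if not sep:
        raise ValueError("expected 'source->target', got %r" % text)
    return source.strip(), target.strip()


def is_authorized(x, relations):
    """True when at least one relation has source == x or target == x."""
    return any(x == source or x == target for source, target in relations)


# Illustrative data only, not taken from a real list of authorized objects.
relations = [parse_relation("сша->украина")]
print(is_authorized("сша", relations))
print(is_authorized("франция", relations))
```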

For Attitude keyword
Attitude: 'сша'->'украина' b:(1) oi:[1, 2] si:{0,180}
  • b:(score) -- sentiment score of the related attitude.

    • score -- integer value; 1 -- positive, -1 -- negative.
  • oi:[source, target] -- Object Index.

    • source -- unsigned int value that serves as a reference to the source Object;
    • target -- unsigned int value that serves as a reference to the target Object;
  • si:{source, target} -- synonym indices of source and target.

    • source -- index of the synonym group of the source;
    • target -- index of the synonym group of the target.
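
An Attitude line can be read in the same way. This sketch assumes the option order of the example line and that the score is always 1 or -1; `ATTITUDE_RE` and `parse_attitude` are hypothetical names.

```python
import re

# Assumed layout: 'source'->'target' b:(score) oi:[s, t] si:{s,t}
ATTITUDE_RE = re.compile(
    r"^Attitude:\s*'(?P<source>[^']*)'->'(?P<target>[^']*)'"
    r"\s+b:\((?P<score>-?1)\)"
    r"\s+oi:\[(?P<oi_s>\d+),\s*(?P<oi_t>\d+)\]"
    r"\s+si:\{(?P<si_s>-?\d+),\s*(?P<si_t>-?\d+)\}\s*$"
)


def parse_attitude(line):
    """Return the fields of an Attitude line as a dict (hypothetical helper)."""
    m = ATTITUDE_RE.match(line)
    if m is None:
        raise ValueError("not a valid Attitude line: %r" % line)
    return {
        "source": m.group("source"),
        "target": m.group("target"),
        "score": int(m.group("score")),  # 1 = positive, -1 = negative
        "oi": (int(m.group("oi_s")), int(m.group("oi_t"))),
        "si": (int(m.group("si_s")), int(m.group("si_t"))),
    }


att = parse_attitude("Attitude: 'сша'->'украина' b:(1) oi:[1, 2] si:{0,180}")
print(att)
```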

For FrameVariant keyword
FrameVariant: помогать (6, 1) b:[a0->a1[pos]] id:(0_8)
  • (position, length) -- position description.

    • position -- index of word, starts with 0;
    • length -- length of related object in words.
  • b:[a0->a1[score]] -- sentiment score of the relation A0->A1 (according to RuSentiFrames). The score parameter takes one of the following values:

    • pos -- denotes a positive sentiment
    • neg -- denotes a negative sentiment
  • id:(id) -- identifier in the RuSentiFrames lexicon (0_8 in the example above).
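
A FrameVariant line differs in that its value is not quoted and its position has no b: prefix. The sketch below assumes the layout of the example line and treats the identifier as an opaque string; `FRAME_RE` and `parse_frame_variant` are hypothetical names.

```python
import re

# Assumed layout: value (position, length) b:[a0->a1[score]] id:(id)
FRAME_RE = re.compile(
    r"^FrameVariant:\s*(?P<value>.+?)"
    r"\s+\((?P<position>\d+),\s*(?P<length>\d+)\)"
    r"\s+b:\[a0->a1\[(?P<score>pos|neg)\]\]"
    r"\s+id:\((?P<id>[^)]+)\)\s*$"
)


def parse_frame_variant(line):
    """Return the fields of a FrameVariant line as a dict (hypothetical helper)."""
    m = FRAME_RE.match(line)
    if m is None:
        raise ValueError("not a valid FrameVariant line: %r" % line)
    return {
        "value": m.group("value"),
        "position": int(m.group("position")),  # word index, starts with 0
        "length": int(m.group("length")),      # length in words
        "score": m.group("score"),             # "pos" or "neg"
        "id": m.group("id"),                   # kept as a string
    }


frame = parse_frame_variant("FrameVariant: помогать (6, 1) b:[a0->a1[pos]] id:(0_8)")
print(frame)
```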