Neural Networks Applications in Sentiment Attitude Extraction
October 13, 2022
UPD January 10th, 2021: These scripts have mostly become part of the AREkit-0.22.0 demo and examples! [demo-readme]
This repository applies the neural networks of the AREkit framework to the sentiment attitude extraction task [initial-paper] at the level of document contexts:

Figure: Example of a context with attitudes mentioned in it; the named entities «Russia» and «NATO» have a negative attitude towards each other, and other named entities are also marked.
It provides applications for:
- Data serialization;
- Training neural networks for the models listed below.
Models List
- Aspect-based Attentive encoders:
  - Multilayer Perceptron (MLP) [code] / [github:nicolay-r];
- Self-based Attentive encoders:
  - P. Zhou et al. [code] / [github:SeoSangwoo];
  - Z. Yang et al. [code] / [github:ilivans];
- Single Sentence Based Architectures:
  - CNN [code] / [github:roomylee];
  - CNN + Aspect-based MLP Attention [code];
  - PCNN [code] / [github:nicolay-r];
  - PCNN + Aspect-based MLP Attention [code];
  - RNN (LSTM/GRU/RNN) [code] / [github:roomylee];
  - IAN (frames based) [code] / [github:lpq29743];
  - RCNN (BiLSTM + CNN) [code] / [github:roomylee];
  - RCNN + Self Attention [code];
  - BiLSTM [code] / [github:roomylee];
  - Bi-LSTM + Aspect-based MLP Attention [code];
  - Bi-LSTM + Self Attention [code] / [github:roomylee];
- Multi Sentence Based Encoders Architectures:
Dependencies
- Python-2.7
- AREkit == 0.20.5
Installation
AREkit repository:
# Clone the repository into a folder next to the current project.
git clone -b 0.20.5-rc https://github.com/nicolay-r/AREkit ../arekit
# Install dependencies.
pip install -r ../arekit/requirements.txt
Prepare the data
We use the RusVectores news-2015 embedding:
mkdir -p data
curl http://rusvectores.org/static/models/rusvectores2/news_mystem_skipgram_1000_20_2015.bin.gz -o "data/news_rusvectores2.bin.gz"
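A failed download can leave an HTML error page or an empty file under the expected name. The sketch below is a minimal sanity check, not part of this repository: `looks_like_gzip` and `check_embedding` are hypothetical helper names, and the check only inspects the gzip magic bytes, so it does not validate the Word2Vec content itself.

```python
import os


def looks_like_gzip(path):
    """Return True if the file starts with the gzip magic bytes 0x1f 0x8b."""
    with open(path, "rb") as f:
        return f.read(2) == b"\x1f\x8b"


def check_embedding(path):
    """Raise a descriptive error if the download is missing, empty, or not gzip."""
    if not os.path.isfile(path):
        raise IOError("embedding file not found: " + path)
    if os.path.getsize(path) == 0:
        raise IOError("embedding file is empty: " + path)
    if not looks_like_gzip(path):
        raise IOError("not a gzip archive (failed download?): " + path)
    return path
```

Calling `check_embedding("data/news_rusvectores2.bin.gz")` before serialization would surface a broken download early.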
Application #1. Data Serialization
Use run_serialization.sh to prepare data for a particular experiment:
python run_serialization.py \
    --cv-count 3 --frames-version v2_0 \
    --experiment rsr+ra --labels-count 3 --ra-ver v1_0 \
    --emb-filepath data/news_rusvectores2.bin.gz \
    --entity-fmt rus-simple --balance-samples True
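To make the shape of these arguments concrete, here is a hypothetical `argparse` declaration that accepts the command above. It is a sketch and not the repository's actual parser: the defaults, the `choices` lists (taken from the Script Arguments Manual) and the `str2bool` helper are assumptions made for illustration.

```python
import argparse


def str2bool(value):
    """Parse 'True'/'False' style strings, since the command passes booleans as text."""
    lowered = value.lower()
    if lowered in ("true", "1", "yes"):
        return True
    if lowered in ("false", "0", "no"):
        return False
    raise argparse.ArgumentTypeError("expected a boolean, got %r" % value)


def build_parser():
    # Hypothetical parser: flag names follow the command above, defaults are assumptions.
    parser = argparse.ArgumentParser(description="Data serialization (sketch)")
    parser.add_argument("--experiment", choices=["rsr", "ra", "rsr+ra"], required=True)
    parser.add_argument("--cv-count", type=int, default=1)
    parser.add_argument("--labels-count", type=int, default=3)
    parser.add_argument("--frames-version", default="v2_0")
    parser.add_argument("--ra-ver", default=None)
    parser.add_argument("--emb-filepath", default=None)
    parser.add_argument("--entity-fmt", choices=["rus-simple", "sharp-simple"],
                        default="rus-simple")
    parser.add_argument("--balance-samples", type=str2bool, default=False)
    return parser


# Parse the same arguments as in the serialization command.
args = build_parser().parse_args([
    "--cv-count", "3", "--frames-version", "v2_0",
    "--experiment", "rsr+ra", "--labels-count", "3", "--ra-ver", "v1_0",
    "--emb-filepath", "data/news_rusvectores2.bin.gz",
    "--entity-fmt", "rus-simple", "--balance-samples", "True",
])
print(args.cv_count, args.experiment, args.balance_samples)
```

Note that `argparse` exposes hyphenated flags with underscores, so `--cv-count` is read as `args.cv_count`.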
Application #2. Training
Use run_train_classifier.sh to run an experiment:
CUDA_VISIBLE_DEVICES=0 python run_training.py --do-eval \
    --bags-per-minibatch 32 --dropout-keep-prob 0.80 --cv-count 3 \
    --labels-count 3 --experiment rsr+ra --model-input-type ctx --ra-ver v1_0 \
    --model-name cnn --test-every-k-epoch 5 --learning-rate 0.1 \
    --balanced-input True --train-acc-limit 0.99 --epochs 100
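The flags `--epochs`, `--test-every-k-epoch` and `--train-acc-limit` are not described in the manual below. Assuming they mean what their names suggest (an epoch budget, an evaluation interval, and a training-accuracy threshold that stops training early), the control flow could look like the following sketch. `run_training`, `train_one_epoch` and `evaluate` are hypothetical names, not the repository's API.

```python
def run_training(train_one_epoch, evaluate, epochs=100,
                 test_every_k_epoch=5, train_acc_limit=0.99):
    """Train for at most `epochs` epochs; return the epochs at which evaluation ran.

    `train_one_epoch(epoch)` must return the training accuracy of that epoch.
    `evaluate(epoch)` is called every `test_every_k_epoch` epochs, and also at
    the epoch where training stops early. Both callables are placeholders.
    """
    evaluated_at = []
    for epoch in range(1, epochs + 1):
        train_acc = train_one_epoch(epoch)
        reached_limit = train_acc >= train_acc_limit
        if epoch % test_every_k_epoch == 0 or reached_limit:
            evaluate(epoch)
            evaluated_at.append(epoch)
        if reached_limit:
            break  # training accuracy limit reached: stop early
    return evaluated_at


# Toy run: accuracy grows by 0.1 per epoch, so the limit is reached at epoch 10.
history = run_training(
    train_one_epoch=lambda epoch: min(1.0, 0.1 * epoch),
    evaluate=lambda epoch: None,
)
print(history)  # [5, 10]
```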
Script Arguments Manual
Common flags:
- --experiment -- the experiment to run, one of:
  - rsr -- supervised learning + evaluation within the RuSentRel collection;
  - ra -- pretraining with the RuAttitudes collection;
  - rsr+ra -- combined training within RuSentRel and RuAttitudes, and evaluation;
- --cv-count -- data folding mode:
  - 1 -- predefined separation of documents into TRAIN/TEST (RuSentRel);
  - k -- CV-based folding into k folds (k=3 supported);
- --frames-version -- RuSentiFrames collection version:
  - v2.0 -- RuSentiFrames-2.0;
- --ra-ver -- RuAttitudes version, if the collection is applicable (ra or rsr+ra experiments):
  - v1_2 -- RuAttitudes-1.0 paper;
  - v2_0_base;
  - v2_0_large;
  - v2_0_base_neut;
  - v2_0_large_neut;
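As an illustration of the CV-based folding mode, the sketch below splits a list of document ids into k (train, test) pairs. It is a simplified round-robin assignment and an assumption on our side: AREkit's actual folding may distribute documents differently. The function name `make_folds` and the document ids are invented for this example, and the predefined TRAIN/TEST mode (value 1) is not covered.

```python
def make_folds(doc_ids, cv_count):
    """Split `doc_ids` into `cv_count` (train, test) pairs using round-robin assignment."""
    if cv_count < 2:
        raise ValueError("cv_count must be at least 2 for cross-validation")
    # Fold i receives every cv_count-th document, starting at position i.
    folds = [doc_ids[i::cv_count] for i in range(cv_count)]
    splits = []
    for i, test in enumerate(folds):
        train = [d for j, fold in enumerate(folds) if j != i for d in fold]
        splits.append((train, test))
    return splits


# Made-up document ids, used only to illustrate the 3-fold case.
for train, test in make_folds(list(range(1, 10)), cv_count=3):
    print("train: %s test: %s" % (train, test))
```

Each document lands in exactly one test fold, so every document is evaluated once across the k runs.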
Training specific flags:
- --model-name -- model to train (see [list]);
- --do-eval -- activates evaluation during the training process;
- --bags-per-minibatch -- number of bags per minibatch;
- --balanced-input -- flag indicating that a balanced collection is used for model training;
Serialization specific flags:
- --emb-filepath -- path to the Word2Vec model;
- --entity-fmt -- entities formatting type:
  - rus-simple -- using Russian masks: объект, субъект, сущость;
  - sharp-simple -- using BERT-related notation for meta tokens: #O (object), #S (subject), #E (entities);
- --balance-samples -- activates sample balancing;
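To show what the entity formatting types do to an input context, here is a minimal sketch that replaces entity mentions with the mask tokens listed above. The helper `mask_entities`, the role names and the example sentence are hypothetical; the formatter in AREkit may work differently.

```python
# -*- coding: utf-8 -*-

# Mask tokens per formatting type, copied from the flag description above.
ENTITY_MASKS = {
    "rus-simple": {"object": u"объект", "subject": u"субъект", "entity": u"сущость"},
    "sharp-simple": {"object": u"#O", "subject": u"#S", "entity": u"#E"},
}


def mask_entities(tokens, roles, entity_fmt):
    """Replace each entity token with the mask for its role.

    `roles` maps a token index to "subject", "object" or "entity";
    tokens whose index is absent from `roles` are kept unchanged.
    """
    masks = ENTITY_MASKS[entity_fmt]
    return [masks[roles[i]] if i in roles else token
            for i, token in enumerate(tokens)]


# Invented example sentence; indices 0, 2 and 5 are named entities.
tokens = [u"Russia", u"criticized", u"NATO", u"after", u"the", u"Brussels", u"summit"]
roles = {0: "subject", 2: "object", 5: "entity"}
print(u" ".join(mask_entities(tokens, roles, "sharp-simple")))
# #S criticized #O after the #E summit
```

Masking the attitude participants this way lets a model focus on the context around the pair instead of memorizing specific entity names.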