Morpheus

October 3, 2020 · View on GitHub

This repository contains code for the paper "It's Morphin' Time! Combating Linguistic Discrimination with Inflectional Perturbations" (to be presented at ACL 2020).

Authors: Samson Tan, Shafiq Joty, Min-Yen Kan, and Richard Socher

UPDATE: pip installable library here!

Usage

To generate adversarial examples for one of the implemented models, run the corresponding run_morpheus_* script.

Custom tasks, datasets, or models

Morpheus can be easily implemented for a custom task, dataset, or model by following the structure of existing classes: MorpheusBase implements the methods common to all Morpheus implementations; Morpheus<Task> implements methods common to a specific task/dataset, Morpheus<Model><Task> implements methods specific to a particular model (usually the init and morph methods).

Generating adversarial training data

Use random_inflect/random_inflect.py to generate adversarial training data. You will need to pass in a dictionary of inflection counts for it to work in the weighted sampling mode, otherwise a uniform distribution will be used. The dictionary should be in the form

{
  "inflection tag": int,
}

E.g.,

{
  "VB": 150,
  "VBD": 100,
  ...
}

Adversarially fine-tuned models

Transformer-big for WMT'14 English-French: Compatible with fairseq
BERT-base for MNLI: Compatible with transformers
BERT-base for SQuAD 2: Compatible with transformers

Citation

@inproceedings{tan-etal-2020-morphin,
    title = "It{'}s Morphin{'} Time! {C}ombating Linguistic Discrimination with Inflectional Perturbations",
    author = "Tan, Samson  and
      Joty, Shafiq  and
      Kan, Min-Yen  and
      Socher, Richard",
    booktitle = "Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics",
    month = jul,
    year = "2020",
    address = "Online",
    publisher = "Association for Computational Linguistics",
    url = "https://www.aclweb.org/anthology/2020.acl-main.263",
    pages = "2920--2935",
}