Probability Weighted Word Saliency (PWWS)

July 28, 2019

This repository contains a Keras implementation of the ACL 2019 paper "Generating Natural Language Adversarial Examples through Probability Weighted Word Saliency".

Overview

  • data_set/aclImdb/, data_set/ag_news_csv/ and data_set/yahoo_10 are placeholder directories for the IMDB Review, AG's News and Yahoo! Answers datasets, respectively.
  • word_level_process.py and char_level_process.py contain the two dataset preprocessing methods, one for word-level and one for char-level models, respectively.
  • neural_networks.py contains implementations of the four neural networks (word-based CNN, bi-directional LSTM, char-based CNN, LSTM) used in the paper.
  • Use training.py to train the four NNs in neural_networks.py.
  • fool.py, evaluate_word_saliency.py, get_NE_list.py, adversarial_tools.py and paraphrase.py build the experiment pipeline.
  • Use evaluate_fool_results.py to evaluate the classification accuracy and word replacement rate of adversarial examples generated by PWWS.

Dependencies

  • Python 3.7.1.
  • The versions of all required libraries are specified in requirements.txt. To reproduce the reported results, please make sure that the specified versions are installed.
  • If you have not downloaded WordNet (a lexical database for the English language), use nltk.download('wordnet') to do so. (Uncomment line 14 in paraphrase.py.)

Usage

  • Download the dataset files from Google Drive, which include:
    • IMDB: aclImdb.zip. Decompress it and place the folder aclImdb in data_set/.
    • AG's News: ag_news_csv.zip. Decompress it and place the folder ag_news_csv in data_set/.
    • Yahoo! Answers: yahoo_10.zip. Decompress it and place the folder yahoo_10 in data_set/.
  • Download glove.6B.100d.txt from Google Drive and place the file in /.
  • Run training.py, or use a command like python3 training.py --model word_cnn --dataset imdb --level word. You can change the model hyperparameters in neural_networks.py and config.py. Note that neither this repository nor the paper provides an implementation of char_cnn on the IMDB and Yahoo! Answers datasets.
  • Run fool.py, or use a command like python3 fool.py --model word_cnn --dataset imdb --level word, to generate adversarial examples using PWWS.
  • Run evaluate_fool_results.py to evaluate the adversarial examples.
  • If you want to train or fool different models, change the arguments in training.py and fool.py.
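The train-then-attack steps above can be scripted. The following is a minimal sketch, not part of this repository: the helper names build_command and pipeline_commands are hypothetical, and only the flags and values shown in the commands above (--model, --dataset imdb, --level word) are used. The dataset keys for AG's News and Yahoo! Answers are not given in this README, so the sketch does not guess them.

```python
import subprocess
import sys


def build_command(script, model, dataset, level, python=sys.executable):
    """Return the argv list for one run of training.py or fool.py."""
    return [python, script,
            "--model", model,
            "--dataset", dataset,
            "--level", level]


def pipeline_commands(model, dataset, level):
    """Train the model first, then attack the trained model with PWWS."""
    return [build_command(script, model, dataset, level)
            for script in ("training.py", "fool.py")]


if __name__ == "__main__":
    # Word-level models listed for IMDB in the results table.
    for model in ("word_cnn", "word_bdlstm", "word_lstm"):
        for cmd in pipeline_commands(model, "imdb", "word"):
            print(" ".join(cmd))
            # Uncomment to actually run each step from the repository root:
            # subprocess.run(cmd, check=True)
```

As written, the script only prints the commands, so it is safe to run before committing to a long training job.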

Result on pretrained model

runs/ contains some pretrained NN models. Information about these models is shown in the table below.

We use these pretrained models to generate 1000 adversarial examples with PWWS.

  • test_set means classification accuracy on the test set.
  • clean_1000 means classification accuracy on the 1000 clean samples (from the test set).
  • adv_1000 means classification accuracy on the adversarial examples corresponding to the 1000 clean samples.
  • sub_rate means the word replacement rate defined in Section 4.4 of the paper.
  • NE_rate means (number of NE_adv) / (number of substitute words).
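
To make the column definitions concrete, here is a minimal sketch of how the three kinds of metrics could be computed. It is not the code in evaluate_fool_results.py. It assumes that sub_rate is the number of substituted words divided by the total number of words, and that both sub_rate and NE_rate are aggregated over all examples rather than averaged per example. The record fields n_words, n_substituted and n_ne_substituted are hypothetical names chosen for illustration.

```python
def accuracy(predictions, labels):
    """Fraction of examples whose predicted label equals the true label."""
    if not labels or len(predictions) != len(labels):
        raise ValueError("predictions and labels must be non-empty and equal in length")
    return sum(p == y for p, y in zip(predictions, labels)) / len(labels)


def sub_rate(records):
    """Substituted words divided by total words (assumed corpus-level ratio)."""
    total_words = sum(r["n_words"] for r in records)
    total_subs = sum(r["n_substituted"] for r in records)
    return total_subs / total_words


def ne_rate(records):
    """Named-entity substitutions divided by all substitutions."""
    total_subs = sum(r["n_substituted"] for r in records)
    total_ne = sum(r["n_ne_substituted"] for r in records)
    return total_ne / total_subs


if __name__ == "__main__":
    # Toy values for illustration only; these are not results from the paper.
    records = [
        {"n_words": 10, "n_substituted": 1, "n_ne_substituted": 0},
        {"n_words": 30, "n_substituted": 3, "n_ne_substituted": 1},
    ]
    print(f"accuracy = {accuracy([1, 0, 1, 1], [1, 1, 1, 0]):.1%}")
    print(f"sub_rate = {sub_rate(records):.3%}")
    print(f"NE_rate  = {ne_rate(records):.3%}")
```

The same accuracy function applies to test_set, clean_1000 and adv_1000; only the inputs differ (full test set, 1000 clean samples, or their adversarial counterparts).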

If you want to use these models, rename them or modify the model paths in the .py files.

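| data_set       | neural_network | test_set | clean_1000 | adv_1000 | sub_rate | NE_rate |
|----------------|----------------|----------|------------|----------|----------|---------|
| IMDB           | word_cnn       | 88.792%  | 86.2%      | 5.7%     | 3.933%   | 21.395% |
| IMDB           | word_bdlstm    | 87.472%  | 86.8%      | 2.0%     | 4.206%   | 11.094% |
| IMDB           | word_lstm      | 88.420%  | 89.8%      | 10.4%    | 6.816%   | 6.548%  |
| AG's News      | word_cnn       | 90.526%  | 89.0%      | 13.2%    | 12.308%  | 30.877% |
| AG's News      | word_bdlstm    | 90.711%  | 89.3%      | 12.9%    | 13.494%  | 27.227% |
| AG's News      | word_lstm      | 91.829%  | 91.4%      | 18.1%    | 18.102%  | 27.374% |
| AG's News      | char_cnn       | 88.224%  | 88.5%      | 20.0%    | 11.979%  | 23.241% |
| Yahoo! Answers | word_cnn       | 88.427%  | 96.1%      | 8.7%     | 33.067%  | 12.768% |
| Yahoo! Answers | word_bdlstm    | 88.876%  | 94.4%      | 9.4%     | 20.752%  | 7.016%  |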

Contact

  • If you have any questions regarding the code, please create an issue or contact the owner of this repository.

Acknowledgments