InterpretationFragility

May 1, 2024

Code implementing the paper "Interpretation of Neural Networks Is Fragile".
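As a rough illustration of what "fragile" means here, the sketch below perturbs an input by a small amount, keeps the predicted label fixed, and measures how much the set of most important features changes. It is not the code in this repository. The toy two-layer softplus network, the simple-gradient saliency, the random sign perturbation search, and every name in it (`score`, `saliency`, `top_k`, `random_sign_attack`) are assumptions chosen to keep the example dependency-free.

```python
import math
import random


def softplus(z):
    # Numerically stable log(1 + exp(z)).
    return max(z, 0.0) + math.log1p(math.exp(-abs(z)))


def sigmoid(z):
    # Derivative of softplus, written to avoid overflow for large |z|.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def score(W, v, x):
    # Toy model (hypothetical): f(x) = sum_j v_j * softplus(W_j . x).
    return sum(vj * softplus(sum(w * xi for w, xi in zip(row, x)))
               for row, vj in zip(W, v))


def saliency(W, v, x):
    # Simple-gradient feature importance: |d f / d x_i|.
    grads = [0.0] * len(x)
    for row, vj in zip(W, v):
        s = sigmoid(sum(w * xi for w, xi in zip(row, x)))
        for i, w in enumerate(row):
            grads[i] += vj * s * w
    return [abs(g) for g in grads]


def top_k(values, k):
    # Indices of the k largest entries.
    order = sorted(range(len(values)), key=lambda i: values[i], reverse=True)
    return set(order[:k])


def random_sign_attack(W, v, x, eps=0.1, k=3, trials=200, seed=0):
    # Try random +/- eps perturbations; keep the one that preserves the
    # predicted label and shares the fewest top-k features with the original.
    rng = random.Random(seed)
    label = score(W, v, x) > 0
    base = top_k(saliency(W, v, x), k)
    best_x, best_overlap = list(x), k
    for _ in range(trials):
        cand = [xi + eps * rng.choice((-1.0, 1.0)) for xi in x]
        if (score(W, v, cand) > 0) != label:
            continue  # the prediction changed, so this is not a valid attack
        overlap = len(base & top_k(saliency(W, v, cand), k))
        if overlap < best_overlap:
            best_x, best_overlap = cand, overlap
    return best_x, best_overlap


# Random toy weights and input, for illustration only.
rng = random.Random(1)
d, h = 10, 8
W = [[rng.gauss(0, 1) for _ in range(d)] for _ in range(h)]
v = [rng.gauss(0, 1) for _ in range(h)]
x = [rng.gauss(0, 1) for _ in range(d)]

x_adv, overlap = random_sign_attack(W, v, x)
print(f"top-3 overlap after perturbation: {overlap}/3")
```

A lower overlap means the same prediction is being explained by different features. Random search is used here only for simplicity; a gradient-based attack would replace it with an optimization over the perturbation.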

Please cite the following work if you use this benchmark or the provided tools or implementations:

@inproceedings{ghorbani2019interpretation,
  title={Interpretation of neural networks is fragile},
  author={Ghorbani, Amirata and Abid, Abubakar and Zou, James},
  booktitle={Proceedings of the AAAI Conference on Artificial Intelligence},
  volume={33},
  pages={3681--3688},
  year={2019}
}

Authors

Amirata Ghorbani - Website
Abubakar Abid - Website
James Zou - Website

License

This project is licensed under the MIT License. See the LICENSE.md file for details.

Large-scale results of the attack methods against four well-known feature-attribution methods


Examples of targeted attacks that produce semantically meaningful changes in feature importance


Attack examples on Deep Taylor Decomposition
