MALMEN: MAssive Language Model Editing Network
January 26, 2024
This is the official repository for our ICLR'24 paper *Massive Editing for Large Language Models via Meta Learning*.
Feel free to email chenmien.tan@ed.ac.uk with any issues.
Setup
You can create a virtual environment and install the dependencies via Anaconda.
$ conda create -n malmen
$ conda activate malmen
(malmen)$ pip install -r requirements.txt
The datasets for all experiments presented in the manuscript are available at this Google Drive link.
You need to specify the paths to the JSON files in config.data.train_path and config.data.valid_path.
You should also specify an empty folder in config.editor.cache_dir to store the cache files generated while running the code.
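For reference, the relevant config fields might look roughly like the sketch below. This is a minimal illustration only, assuming a Hydra-style YAML layout; the file layout and paths shown are our assumptions, not taken from the repo.

```yaml
# Illustrative sketch of the relevant config fields (layout and paths are assumptions).
data:
  train_path: /path/to/zsre_train.json   # training JSON downloaded from the Google Drive link
  valid_path: /path/to/zsre_valid.json   # validation JSON
editor:
  cache_dir: /path/to/empty/cache/dir    # empty folder for cache files written at runtime
```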
Running
You can set all hyper-parameters by modifying the YAML files in the config folder.
You can run the code by executing main.py.
You can also specify hyper-parameters on the command line:
(malmen)$ python main.py \
data=zsre \
model=gpt-j \
editor=malmen
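If the config follows a Hydra-style layout (an assumption on our part, suggested by the override syntax above), the data, model, and editor options simply select YAML files of the same name under the config folder, roughly like:

```yaml
# Hypothetical top-level config showing how command-line choices map to config groups.
defaults:
  - data: zsre      # e.g. config/data/zsre.yaml
  - model: gpt-j    # e.g. config/model/gpt-j.yaml
  - editor: malmen  # e.g. config/editor/malmen.yaml
```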
Acknowledgement
We thank the authors of MEND and MEMIT; their implementations inspired some of the code in this repo.
Citation
@inproceedings{tan23malmen,
  title={Massive Editing for Large Language Models via Meta Learning},
  author={Chenmien Tan and Ge Zhang and Jie Fu},
  booktitle={International Conference on Learning Representations},
  year={2024},
  url={https://openreview.net/pdf?id=L6L1CJQ2PE}
}