April 1, 2026
FGVEdit
Official code and data release for
Visual-Oriented Fine-Grained Knowledge Editing for Multimodal Large Language Models
About This Project
FGVEdit is a benchmark and codebase for fine-grained multimodal knowledge editing. The current release contains data processing and experiment entrypoints for four editing methods:
- IKE
- MEND
- SERAC
- MSCKE
The released code currently supports two multimodal backbones:
- BLIP-2
- MiniGPT-4
The default data split used by this repository is:
- train_data.json: 8334 samples
- test_data.json: 2778 samples
Each sample contains the edit target, rephrase query, textual locality query, fine-grained generality query, fine-grained locality query, and the associated image path.
Getting Started
Download Data
The released dataset is available on Hugging Face at ZhenZeng/FGVEdit.
You can download it with the Hugging Face CLI:
huggingface-cli download ZhenZeng/FGVEdit --repo-type dataset --local-dir ./tmp/FGVEdit_download
After download, arrange the files into the layout expected by the current code:
data/
├── FGVEdit/
│   ├── train_data.json
│   └── test_data.json
└── images/
    ├── train2014/
    └── val2014/
The dataset loader reads:
- data/FGVEdit/train_data.json
- data/FGVEdit/test_data.json
- image paths under data/images/
The image field inside each JSON record is a relative path such as train2014/xxx.jpg or val2014/xxx.jpg, so the image root must be data/images.
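A quick sanity check of this layout can catch misplaced files before a long run. The following is a minimal sketch, not part of the released code: it assumes each JSON file is a list of dict records and that the image path is stored under a key literally named image. The helper names check_split and check_dataset are hypothetical.

```python
import json
from pathlib import Path

# Sample counts stated in this README for the default split.
EXPECTED_COUNTS = {"train_data.json": 8334, "test_data.json": 2778}


def check_split(json_path, image_root, image_key="image"):
    """Return (number of records, image paths that do not exist on disk).

    Assumption: the file holds a list of dict records, and `image_key`
    stores a path relative to `image_root`.
    """
    records = json.loads(Path(json_path).read_text(encoding="utf-8"))
    missing = []
    for record in records:
        rel_path = record.get(image_key)
        if rel_path is None or not (Path(image_root) / rel_path).is_file():
            missing.append(rel_path)
    return len(records), missing


def check_dataset(data_root="data"):
    """Print one report line per split found under `data_root`."""
    root = Path(data_root)
    for name, expected in EXPECTED_COUNTS.items():
        json_path = root / "FGVEdit" / name
        if not json_path.is_file():
            print(f"{name}: not found at {json_path}")
            continue
        count, missing = check_split(json_path, root / "images")
        print(f"{name}: {count} records (expected {expected}), "
              f"{len(missing)} missing images")
```

Calling check_dataset() from the repository root prints one line per split. A non-zero missing-image count usually means the image root is not data/images or the train2014/ and val2014/ folders are nested one level too deep.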
Environment Setup
This repository currently provides a single Python dependency file: requirements.txt.
We recommend using conda:
conda create -n fgvedit python=3.10 -y
conda activate fgvedit
python -m pip install --upgrade pip
python -m pip install -r requirements.txt
Download Pre-trained Models
The current hparams/ files reference pre-trained model caches and checkpoints with the following layout:
hugging_cache/
├── all-MiniLM-L6-v2/
├── bert-base-uncased/
├── clip-vit-large-patch14/
├── distilbert-base-cased/
├── opt-2.7b/
├── opt-125m/
├── Vicuna/
│
├── blip2_pretrained_flant5xxl.pth
├── blip2_pretrained_opt2.7b.pth
├── eva_vit_g.pth
└── pretrained_minigpt4_7b.pth
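The layout above can be checked with a few lines of Python before launching an experiment. This is a minimal sketch under the assumption that all entries sit directly under hugging_cache/ in the working directory. The function name missing_cache_entries is hypothetical, and a given backbone may not need every entry.

```python
from pathlib import Path

# Entries copied from the layout above; a trailing "/" marks a directory.
EXPECTED_ENTRIES = [
    "all-MiniLM-L6-v2/",
    "bert-base-uncased/",
    "clip-vit-large-patch14/",
    "distilbert-base-cased/",
    "opt-2.7b/",
    "opt-125m/",
    "Vicuna/",
    "blip2_pretrained_flant5xxl.pth",
    "blip2_pretrained_opt2.7b.pth",
    "eva_vit_g.pth",
    "pretrained_minigpt4_7b.pth",
]


def missing_cache_entries(cache_root="hugging_cache"):
    """Return the expected entries that are absent under `cache_root`."""
    root = Path(cache_root)
    missing = []
    for entry in EXPECTED_ENTRIES:
        path = root / entry.rstrip("/")
        # Directories and checkpoint files are checked with the matching test.
        present = path.is_dir() if entry.endswith("/") else path.is_file()
        if not present:
            missing.append(entry)
    return missing
```

An empty return value means every listed directory and checkpoint file was found. The check looks at names only and does not verify file contents.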
Links are in the following:
Usage
All experiment entrypoints are at the repository root:
- edit_IKE.py
- edit_MEND.py
- edit_SERAC.py
- edit_MSCKE.py
All hyper-parameters are stored in hparams/.
IKE
Configs:
- hparams/IKE/blip2.yaml
- hparams/IKE/minigpt4.yaml
Run embedding generation only:
python edit_IKE.py --model blip2 --mode embed
python edit_IKE.py --model minigpt4 --mode embed
Run evaluation:
python edit_IKE.py --model blip2 --mode eval
python edit_IKE.py --model minigpt4 --mode eval
The script will build train-set embeddings and then evaluate on data/FGVEdit/test_data.json.
MEND
Training configs:
- hparams/TRAINING/MEND/blip2.yaml
- hparams/TRAINING/MEND/minigpt4.yaml
Evaluation configs:
- hparams/MEND/blip2.yaml
- hparams/MEND/minigpt4.yaml
Run training:
python edit_MEND.py --model blip2 --mode train
python edit_MEND.py --model minigpt4 --mode train
Run evaluation:
python edit_MEND.py --model blip2 --mode eval
python edit_MEND.py --model minigpt4 --mode eval
Important:
- --mode eval requires a trained checkpoint.
- Before evaluation, update the archive field in the chosen evaluation YAML so that it points to a concrete .pt checkpoint file produced during training.
- Training outputs are saved under results_dir/models/<ALG>/.
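To find a candidate checkpoint for the archive field, one option is to list the .pt files under the training output directory and pick the most recently modified one. The sketch below is an illustration, not part of the released code: newest_checkpoint is a hypothetical helper, and the directory passed to it depends on the results_dir value in your training YAML.

```python
from pathlib import Path


def newest_checkpoint(models_dir):
    """Return the most recently modified .pt file under `models_dir`, or None.

    Subdirectories are searched too, since the exact output layout
    inside `models_dir` is not assumed here.
    """
    candidates = Path(models_dir).rglob("*.pt")
    return max(candidates, key=lambda p: p.stat().st_mtime, default=None)
```

For example, newest_checkpoint("results/models/MEND") would search a results directory named results, which is an assumed value. The most recent file is not necessarily the best checkpoint, so confirm the choice against the training log before copying the path into the evaluation YAML.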
SERAC
Training configs:
- hparams/TRAINING/SERAC/blip2.yaml
- hparams/TRAINING/SERAC/minigpt4.yaml
Evaluation configs:
- hparams/SERAC/blip2.yaml
- hparams/SERAC/minigpt4.yaml
Run training:
python edit_SERAC.py --model blip2 --mode train
python edit_SERAC.py --model minigpt4 --mode train
Run evaluation:
python edit_SERAC.py --model blip2 --mode eval
python edit_SERAC.py --model minigpt4 --mode eval
Important:
- --mode eval requires a trained checkpoint.
- Before evaluation, update the archive field in the chosen evaluation YAML to the actual checkpoint file path.
MSCKE
Training configs:
- hparams/TRAINING/MSCKE/blip2.yaml
- hparams/TRAINING/MSCKE/minigpt4.yaml
Evaluation configs:
- hparams/MSCKE/blip2.yaml
- hparams/MSCKE/minigpt4.yaml
Run training:
python edit_MSCKE.py --model blip2 --mode train
python edit_MSCKE.py --model minigpt4 --mode train
Run evaluation:
python edit_MSCKE.py --model blip2 --mode eval
python edit_MSCKE.py --model minigpt4 --mode eval
Important:
- --mode eval requires a trained checkpoint.
- Before evaluation, update the archive field in the chosen evaluation YAML to the actual checkpoint file path produced during training.
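Across the four methods, the commands and config paths listed in this section follow one pattern, which can be written down as a small lookup. The sketch below only restates that pattern. The helper names config_path and command are hypothetical, and the mapping from mode to config file is inferred from the lists above rather than read from the scripts.

```python
# Modes accepted per method, as listed in the usage sections above.
MODES = {
    "IKE": ("embed", "eval"),
    "MEND": ("train", "eval"),
    "SERAC": ("train", "eval"),
    "MSCKE": ("train", "eval"),
}
MODELS = ("blip2", "minigpt4")


def _validate(method, model, mode):
    """Reject combinations that the README does not list."""
    if method not in MODES:
        raise ValueError(f"unknown method: {method}")
    if model not in MODELS:
        raise ValueError(f"unknown model: {model}")
    if mode not in MODES[method]:
        raise ValueError(f"{method} does not list mode {mode}")


def config_path(method, model, mode):
    """Config file the README associates with this run (inferred mapping)."""
    _validate(method, model, mode)
    if mode == "train":
        return f"hparams/TRAINING/{method}/{model}.yaml"
    return f"hparams/{method}/{model}.yaml"


def command(method, model, mode):
    """Argument list for the matching entrypoint at the repository root."""
    _validate(method, model, mode)
    return ["python", f"edit_{method}.py", "--model", model, "--mode", mode]
```

The list returned by command("MSCKE", "blip2", "eval") can be passed to subprocess.run from the repository root, and config_path with the same arguments names the YAML file whose archive field needs a real checkpoint path first.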
Practical Notes
- The current scripts read device from the selected YAML file, so change that field before running on your machine.
- If you move data or checkpoints, update the corresponding paths in the YAML file instead of relying on implicit defaults.
- MEND, SERAC, and MSCKE evaluation configs currently contain placeholder archive paths. They must be replaced with real checkpoint filenames.
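Editing these fields by hand is fine for a single run. For repeated runs, a small text-level helper can rewrite one field while leaving the rest of the file untouched. This is a minimal sketch using only the standard library. It assumes device and archive appear as top-level "key: value" lines in the YAML, which has not been verified against the released configs, and set_top_level_field is a hypothetical name.

```python
import re


def set_top_level_field(yaml_text, key, value):
    """Replace the value on the first top-level `key: ...` line.

    Only unindented lines are matched, so nested keys with the same name
    are left alone. Any trailing comment on the matched line is dropped.
    `value` is inserted as-is; quote it yourself if YAML requires it.
    Raises KeyError when no matching line exists.
    """
    pattern = re.compile(rf"^{re.escape(key)}[ \t]*:[^\r\n]*", flags=re.MULTILINE)
    # A function replacement avoids backslash processing in Windows paths.
    new_text, count = pattern.subn(lambda _m: f"{key}: {value}", yaml_text, count=1)
    if count == 0:
        raise KeyError(key)
    return new_text
```

A typical use is to read the chosen YAML with Path.read_text, call set_top_level_field(text, "device", 0) or set_top_level_field(text, "archive", checkpoint_path), and write the result back. Keeping the edit at text level preserves comments and key order, which a load-and-dump round trip through a YAML library may not.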
Acknowledgments
This repository builds on the multimodal editing ecosystem around EasyEdit, and uses pretrained components or model implementations from LAVIS / BLIP-2, MiniGPT-4, Transformers, Sentence-Transformers, and CLIP.
We thank the authors and maintainers of these projects for making their code and models publicly available.