ACL 2025 (SAC Highlights) - AntiLeakBench: Preventing Data Contamination by Automatically Constructing Benchmarks with Updated Real-World Knowledge
January 25, 2026
This repo contains the data and code for our work, AntiLeakBench.
The test samples used in our paper are provided at ./releases.
Benchmark Building Workflow
Install the requirements:
ujson
pyyaml-include==1.3.2
# The requirements below are only needed for LLM evaluation. Skip them if you are only building benchmarks.
torch==2.4.0
transformers==4.43.2
pyyaml-include==1.3.2
einops==0.8.0
accelerate==0.33.0
protobuf==3.20.0
sentencepiece==0.2.0
flash_attn==2.6.3
fastchat==0.1.0
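To confirm that an environment matches the pins above before a long run, a small check along these lines may help. This is a minimal sketch, not part of the repo: `find_pin_mismatches` and `EVAL_PINS` are hypothetical names, only a subset of the pins is listed, and local build suffixes (such as the `+cu121` that some `torch` wheels carry) are ignored when comparing.

```python
from importlib.metadata import PackageNotFoundError, version


def find_pin_mismatches(pins: dict) -> dict:
    """Return {package: (wanted, installed_or_None)} for each unsatisfied pin."""
    mismatches = {}
    for name, wanted in pins.items():
        try:
            installed = version(name)
        except PackageNotFoundError:
            installed = None  # package is not installed at all
        # Compare without a local build suffix, e.g. "2.4.0+cu121" -> "2.4.0".
        comparable = installed.split("+")[0] if installed else None
        if comparable != wanted:
            mismatches[name] = (wanted, installed)
    return mismatches


# A subset of the evaluation pins listed above (hypothetical helper constant).
EVAL_PINS = {
    "torch": "2.4.0",
    "transformers": "4.43.2",
    "einops": "0.8.0",
    "accelerate": "0.33.0",
    "protobuf": "3.20.0",
    "sentencepiece": "0.2.0",
}

if __name__ == "__main__":
    for package, (wanted, found) in find_pin_mismatches(EVAL_PINS).items():
        print(f"{package}: wanted {wanted}, found {found}")
```

An empty result means every listed pin is satisfied.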
Follow the steps below to build a benchmark:
- Download a Wikidata dump.
  wget https://dumps.wikimedia.org/wikidatawiki/entities/latest-all.json.bz2 -P raw_data
  latest-all.json.bz2 is the latest Wikidata dump. More dumps can be found at Wikidata.
  We note that in our paper we use the dump wikidata-20240805-all.json.bz2, but it's inaccessible now since Wikidata regularly cleans up old dumps. Thus, the produced test samples with latest-all.json.bz2 may differ slightly from those at ./releases with wikidata-20240805-all.json.bz2.
- Extract claims, relations, and qualifiers from the Wikidata dump.
  ./scripts/process_rawdata.sh ./raw_data/latest-all.json.bz2
  This step takes about 15 hours.
- Construct test samples.
  ./scripts/build.sh ./raw_data/latest-all.json.bz2 ./data 2022-01-01 2023-01-01
  The constructed samples will be under ./data/en_2022-01-01_2023-01-01.
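As a rough illustration of the kind of data these steps operate on, the sketch below streams entities from a Wikidata JSON dump and keeps claims whose start time falls inside a window such as 2022-01-01 to 2023-01-01. It is not the code behind process_rawdata.sh or build.sh. The helper names (`iter_entities`, `parse_wikidata_date`, `claims_started_in_window`) are hypothetical, and selecting claims by the P580 (start time) qualifier is an assumption about how updated knowledge could be identified. The sketch relies on the standard dump layout, in which the file is one JSON array with one entity per line.

```python
import bz2
import json
from datetime import date
from typing import Iterator, Optional, Tuple


def iter_entities(dump_path: str) -> Iterator[dict]:
    """Yield one entity dict per line of a Wikidata JSON dump (.json.bz2)."""
    with bz2.open(dump_path, mode="rt", encoding="utf-8") as handle:
        for line in handle:
            line = line.strip().rstrip(",")
            if line in ("", "[", "]"):  # array brackets wrap the entity lines
                continue
            yield json.loads(line)


def parse_wikidata_date(time_value: str) -> Optional[date]:
    """Parse a Wikidata time string such as '+2022-03-01T00:00:00Z'."""
    try:
        year, month, day = time_value.lstrip("+").split("T")[0].split("-")
        # Low-precision values use 00 for an unknown month or day; clamp to 01.
        return date(int(year), max(int(month), 1), max(int(day), 1))
    except ValueError:
        return None  # BCE years, out-of-range years, or malformed strings


def claims_started_in_window(
    entity: dict, start: date, end: date
) -> Iterator[Tuple[str, str, date]]:
    """Yield (entity id, property id, start date) for claims starting in [start, end)."""
    for prop, statements in entity.get("claims", {}).items():
        for statement in statements:
            # Assumption: P580 ("start time") marks when a claim became true.
            for snak in statement.get("qualifiers", {}).get("P580", []):
                value = snak.get("datavalue", {}).get("value", {})
                started = parse_wikidata_date(value.get("time", ""))
                if started is not None and start <= started < end:
                    yield entity["id"], prop, started
```

Reading the full dump this way is slow, so trying the functions on a small excerpt first is advisable.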
Evaluate LLMs
We provide a shell script to evaluate LLMs. For example,
./scripts/run.sh ./releases/en_20220101_20230101/singlehop-gold.json ./configs/llama-2-7b-chat.yaml
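Before launching an evaluation, it can be useful to look at what a sample file contains. The sketch below is a hypothetical helper (`summarize_samples` is not part of the repo) that counts the records in a file and the keys they use. The file schema is not documented here, so the code makes no assumption about field names and accepts a top-level list, a top-level dict of records, or one JSON object per line.

```python
import json
from collections import Counter


def summarize_samples(path: str) -> dict:
    """Report how many records a sample file holds and which keys they use."""
    with open(path, encoding="utf-8") as handle:
        text = handle.read()
    try:
        data = json.loads(text)
    except json.JSONDecodeError:
        # Fall back to JSON Lines: one JSON object per non-empty line.
        data = [json.loads(line) for line in text.splitlines() if line.strip()]
    # Accept either a list of records or a dict mapping ids to records.
    records = list(data.values()) if isinstance(data, dict) else list(data)
    key_counts = Counter()
    for record in records:
        if isinstance(record, dict):
            key_counts.update(record.keys())
    return {"num_records": len(records), "key_counts": dict(key_counts)}


if __name__ == "__main__":
    print(summarize_samples("./releases/en_20220101_20230101/singlehop-gold.json"))
```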
Contact
- We welcome your contributions to this project. Please feel free to submit pull requests.
- If you encounter any issues, please either contact Xiaobao Wu (xiaobao002@e.ntu.edu.sg) directly or open an issue in the GitHub repo.
Citation
@inproceedings{wu2025antileak,
title = "{A}nti{L}eak{B}ench: Preventing Data Contamination by Automatically Constructing Benchmarks with Updated Real-World Knowledge",
author = "Wu, Xiaobao and Pan, Liangming and Xie, Yuxi and Zhou, Ruiwen and Zhao, Shuai and Ma, Yubo and Du, Mingzhe and Mao, Rui and Luu, Anh Tuan and Wang, William Yang",
booktitle = "Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)",
month = jul,
year = "2025",
address = "Vienna, Austria",
publisher = "Association for Computational Linguistics",
url = "https://aclanthology.org/2025.acl-long.901/",
doi = "10.18653/v1/2025.acl-long.901",
pages = "18403--18419",
ISBN = "979-8-89176-251-0"
}