curate.md
January 24, 2025
🐳 Dataset Curation
How was our dataset curated? How are the benchmark instances created?
To obtain testable real-world repositories from GitHub, we propose a fully automated curation pipeline that utilizes GitHub Actions CI and LLM assistance, eliminating the need for human involvement in benchmark construction.
GitHub crawling
python -m dibench.curate.crawling --help
- Searches GitHub for repositories in `star_range` for `language` (10-star batches).
- Checks each repo for workflows; if any are found, dumps the repo instance into JSONL.
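The 10-star batching can be sketched as follows. This is a minimal illustration, not the code in `dibench.curate.crawling`: the helper names `star_batches` and `search_query` are hypothetical, and the query string assumes GitHub's standard `language:` and `stars:` search qualifiers. Small windows are presumably used because a single GitHub search query returns at most 1,000 results.

```python
from typing import Iterator, Tuple


def star_batches(low: int, high: int, step: int = 10) -> Iterator[Tuple[int, int]]:
    """Yield inclusive (start, end) star windows that cover [low, high]."""
    start = low
    while start <= high:
        # Clamp the last window so it never exceeds the requested upper bound.
        end = min(start + step - 1, high)
        yield start, end
        start = end + 1


def search_query(language: str, start: int, end: int) -> str:
    """Build a search string for one star window (hypothetical helper)."""
    return f"language:{language} stars:{start}..{end}"


if __name__ == "__main__":
    # Print one query per window; the real crawler would send each to GitHub.
    for start, end in star_batches(100, 125):
        print(search_query("python", start, end))
```

Each query string would then be passed to the repository search endpoint, and every hit checked for workflow files before being written out as one JSONL record.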
Test CI locating
python -m dibench.curate.curate --help
- Locate the test CI file
- Locate the test job in the CI file
- Get the ACT command
- Sanitize & Mask
- Get the gold patch
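To make the last two steps concrete, here is a rough sketch of masking a dependency section and deriving a gold patch from it. It assumes the gold patch is the diff that restores the original build file from its masked version. The marker-based masking, the function names, and the `pyproject.toml` example are all illustrative assumptions rather than the pipeline's actual logic.

```python
import difflib
from typing import List


def mask_dependencies(lines: List[str], start_marker: str, end_marker: str) -> List[str]:
    """Drop every line strictly between the start and end marker lines."""
    masked: List[str] = []
    inside = False
    for line in lines:
        if inside and line.strip() == end_marker:
            inside = False  # the closing marker itself is kept
        if not inside:
            masked.append(line)
        if line.strip() == start_marker:
            inside = True  # subsequent lines are dependency entries
    return masked


def gold_patch(masked: List[str], original: List[str], path: str) -> str:
    """Return a unified diff that turns the masked file back into the original."""
    diff = difflib.unified_diff(
        masked, original, fromfile=f"a/{path}", tofile=f"b/{path}", lineterm=""
    )
    return "\n".join(diff)


if __name__ == "__main__":
    # Hypothetical build file used only for illustration.
    original = [
        "[project]",
        'name = "demo"',
        "dependencies = [",
        '    "requests>=2.0",',
        '    "numpy",',
        "]",
    ]
    masked = mask_dependencies(original, "dependencies = [", "]")
    print(gold_patch(masked, original, "pyproject.toml"))
```

The masked file is what a model under evaluation would see, and the patch serves as the reference answer.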
Execution verifying
python -m dibench.curate.verify --help
Expected:
- Tests pass when dependencies are unmasked
- Tests fail when dependencies are masked
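The acceptance rule above amounts to a two-run check, sketched below. `RunResult` and `is_valid_instance` are hypothetical names, and the sketch assumes a zero exit code means the test job passed. How `dibench.curate.verify` actually records outcomes is not shown here.

```python
from dataclasses import dataclass


@dataclass
class RunResult:
    """Outcome of one CI test run (hypothetical container)."""

    exit_code: int

    @property
    def passed(self) -> bool:
        # Assumption: the test job signals success with exit code 0.
        return self.exit_code == 0


def is_valid_instance(unmasked: RunResult, masked: RunResult) -> bool:
    """Keep an instance only if tests pass unmasked and fail masked."""
    return unmasked.passed and not masked.passed


if __name__ == "__main__":
    print(is_valid_instance(RunResult(0), RunResult(1)))  # expected behavior
    print(is_valid_instance(RunResult(0), RunResult(0)))  # masking had no effect
```

An instance whose tests still pass with dependencies masked would not exercise dependency inference, and one that fails even when unmasked has a broken environment, so both are rejected.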