RunBugRun

May 23, 2025

Note


This is the revised and extended version of RunBugRun. The older version can be found in the legacy branch.

What is RunBugRun

RunBugRun is an automated program repair (APR) dataset of over 700,000 executable buggy/fixed pairs of short programs taken from IBM Project CodeNet, written in nine languages (C++, C, Python, Java, Ruby, JavaScript, Go, PHP, C#).

It can be used to evaluate APR tools, that is, tools that automatically find and repair bugs in source code.

RunBugRun comes with tests, bug labels and infrastructure to execute programs. To ensure safe execution, it uses Bubblewrap as a sandbox.

RunBugRun has pre-defined training, validation and test sets. APR tools can use the training set as they please. For evaluation, they are given a test set of buggy programs that do not pass all tests. A tool's performance is measured as the percentage of these programs that the tool can fix such that the fixed program passes all tests.
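The metric described above can be sketched in a few lines of Python. This is a minimal illustration, not the project's evaluation code: the function name `fix_rate` and the input layout (one list of per-test outcomes per bug, for a single candidate fix) are assumptions made for the example.

```python
from typing import Dict, List


def fix_rate(results: Dict[int, List[bool]]) -> float:
    """Return the percentage of bugs whose candidate fix passes every test.

    `results` maps a bug ID to the pass/fail outcome of each test that was
    run against the tool's candidate fix (hypothetical layout). A bug with
    no recorded outcomes is counted as not fixed.
    """
    if not results:
        return 0.0
    fixed = sum(1 for outcomes in results.values() if outcomes and all(outcomes))
    return 100.0 * fixed / len(results)


# Made-up outcomes for three bugs: only bug 1 passes all of its tests.
example = {1: [True, True], 2: [True, False], 3: []}
print(f"{fix_rate(example):.1f}%")  # 33.3%
```

A single failing test is enough for a bug to count as unfixed, which matches the "passes all tests" criterion stated above.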

Obtaining the Dataset

RunBugRun is distributed as an lrzip-compressed SQLite3 database dump. After downloading the compressed dump (see the Releases page of this project), follow these steps to restore the database.

Install lrzip and sqlite

sudo apt-get install lrzip sqlite3

Decompress

lrunzip -z runbugrun.sql.lrz

Restore

sqlite3 runbugrun.db < runbugrun.sql
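After the restore step, a quick sanity check is to list the tables in the database. The sketch below uses Python's standard `sqlite3` module; the helper name `list_tables` is hypothetical, and the demo runs against an in-memory stand-in so that it works without the dataset. For the real data, connect to `runbugrun.db` instead of `:memory:`.

```python
import sqlite3
from typing import List


def list_tables(conn: sqlite3.Connection) -> List[str]:
    """Return the names of all user tables, skipping SQLite's internal ones."""
    rows = conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    ).fetchall()
    return [name for (name,) in rows if not name.startswith("sqlite_")]


# In-memory stand-in with two empty tables; replace ":memory:" with
# "runbugrun.db" to inspect the restored dataset.
conn = sqlite3.connect(":memory:")
conn.executescript(
    "CREATE TABLE bugs (id INTEGER PRIMARY KEY);"
    "CREATE TABLE tests (id INTEGER PRIMARY KEY);"
)
print(list_tables(conn))  # ['bugs', 'tests']
```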

Data Sources

RunBugRun is a curated collection of data drawn from the following sources. For terms of use and license information, please consult the corresponding project or website.

| Source | URL | Data |
| --- | --- | --- |
| IBM CodeNet | https://github.com/IBM/Project_CodeNet | Code submissions |
| AlphaCode/CodeContests | https://github.com/google-deepmind/code_contests | Tests |
| AtCoder | https://atcoder.jp/posts/21 | Tests |
| PIE4Perf | https://github.com/madaan/pie-perf | Problem description translations (e.g., Japanese to English) |

Database Schema Documentation

RunBugRun is distributed as a SQLite database, containing code, tests and metadata. It contains the following tables:

tests

Stores test cases.

| Column | Type | Description |
| --- | --- | --- |
| id | integer | Primary key |
| problem_id | varchar | Reference to problems.problem_id |
| test_id | integer | Test identifier |
| input | text | Test input |
| output | text | Expected output |
| created_at | datetime(6) | Creation timestamp |
| updated_at | datetime(6) | Last update timestamp |
| origin | integer | Origin of the test (0: unknown, 1: codenet, 2: manual, 3: alphacode, 4: atcoder) (default: 0) |
| active | boolean | Whether test is active (default: true) |
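As an illustration of how the tests table might be queried, the sketch below fetches the active tests of one problem and decodes the `origin` code using the mapping from the table above. The function name `active_tests` and the problem ID `p00000` are hypothetical, and the code assumes that booleans are stored as 0/1, which may not hold for the actual dump. The demo uses a small in-memory table rather than the real database.

```python
import sqlite3
from typing import List, Tuple

# Origin codes as listed in the schema documentation.
TEST_ORIGINS = {0: "unknown", 1: "codenet", 2: "manual", 3: "alphacode", 4: "atcoder"}


def active_tests(conn: sqlite3.Connection, problem_id: str) -> List[Tuple[int, str, str, str]]:
    """Return (test_id, input, expected output, origin name) for active tests."""
    rows = conn.execute(
        "SELECT test_id, input, output, origin FROM tests "
        "WHERE problem_id = ? AND active = 1 ORDER BY test_id",  # assumes 0/1 booleans
        (problem_id,),
    ).fetchall()
    return [(tid, inp, out, TEST_ORIGINS.get(origin, "unknown")) for tid, inp, out, origin in rows]


# In-memory stand-in with made-up rows for a hypothetical problem "p00000".
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE tests (id INTEGER PRIMARY KEY, problem_id VARCHAR, test_id INTEGER, "
    "input TEXT, output TEXT, origin INTEGER DEFAULT 0, active BOOLEAN DEFAULT 1)"
)
conn.executemany(
    "INSERT INTO tests (problem_id, test_id, input, output, origin, active) "
    "VALUES (?, ?, ?, ?, ?, ?)",
    [
        ("p00000", 0, "1 2\n", "3\n", 1, 1),
        ("p00000", 1, "5 7\n", "12\n", 4, 1),
        ("p00000", 2, "0 0\n", "0\n", 2, 0),  # inactive, so it is skipped
    ],
)
for test in active_tests(conn, "p00000"):
    print(test)
```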

problems

Stores problem descriptions.

| Column | Type | Description |
| --- | --- | --- |
| id | integer | Primary key |
| problem_id | varchar | IBM CodeNet problem identifier |
| text | text | Problem description |
| similar_problems | jsonb | JSON array of similar problems |

bugs

Stores bugs (full code).

| Column | Type | Description |
| --- | --- | --- |
| id | integer | Primary key |
| buggy_code | text | The buggy code |
| fixed_code | text | The fixed code |
| problem_id | varchar | Reference to problems.problem_id |
| user_id | varchar | ID of user who submitted |
| buggy_submission_id | varchar | ID of buggy submission (IBM CodeNet ID) |
| fixed_submission_id | varchar | ID of fixed submission (IBM CodeNet ID) |
| language | integer | Programming language (0: c, 1: cpp, 2: javascript, 3: java, 4: ruby, 5: python, 6: php, 7: go, 8: c_sharp) |
| label_ids | json | JSON array of label IDs |
| runtime_errors | json | Runtime errors |
| change_count | integer | Number of changes |
| split | integer | Data split (0: train, 1: valid, 2: test, 3: unfiltered) |
| buggy_main_class | varchar | Main class for buggy code (Java only) |
| fixed_main_class | varchar | Main class for fixed code (Java only) |
| created_at | datetime(6) | Creation timestamp |
| updated_at | datetime(6) | Last update timestamp |
| token_count | integer | Number of tokens |
| active | boolean | Whether bug is active (default: true) |
| hunk_count | integer | Number of hunks |
| buggy_locs | integer | Number of lines of code in the buggy version (logical) |
| fixed_locs | integer | Number of lines of code in the fixed version (logical) |
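The integer codes for `language` and `split` can be decoded on the client side. The sketch below selects active bugs for a given language and split and parses `label_ids` from JSON. The helper name `load_bugs`, the 0/1 boolean encoding, and the demo rows are assumptions; the demo table contains only the columns the query needs.

```python
import json
import sqlite3
from typing import Any, Dict, List

# Enum codes as listed in the schema documentation.
LANGUAGES = {0: "c", 1: "cpp", 2: "javascript", 3: "java", 4: "ruby",
             5: "python", 6: "php", 7: "go", 8: "c_sharp"}
SPLITS = {0: "train", 1: "valid", 2: "test", 3: "unfiltered"}
LANGUAGE_CODES = {name: code for code, name in LANGUAGES.items()}
SPLIT_CODES = {name: code for code, name in SPLITS.items()}


def load_bugs(conn: sqlite3.Connection, language: str, split: str,
              limit: int = 10) -> List[Dict[str, Any]]:
    """Return up to `limit` active bugs for the given language and split."""
    rows = conn.execute(
        "SELECT id, problem_id, buggy_code, fixed_code, label_ids FROM bugs "
        "WHERE language = ? AND split = ? AND active = 1 "  # assumes 0/1 booleans
        "ORDER BY id LIMIT ?",
        (LANGUAGE_CODES[language], SPLIT_CODES[split], limit),
    ).fetchall()
    return [
        {"id": bug_id, "problem_id": pid, "buggy_code": buggy, "fixed_code": fixed,
         "label_ids": json.loads(labels) if labels else []}
        for bug_id, pid, buggy, fixed, labels in rows
    ]


# In-memory stand-in with made-up rows.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE bugs (id INTEGER PRIMARY KEY, buggy_code TEXT, fixed_code TEXT, "
    "problem_id VARCHAR, language INTEGER, label_ids JSON, split INTEGER, "
    "active BOOLEAN DEFAULT 1)"
)
conn.executemany(
    "INSERT INTO bugs (id, buggy_code, fixed_code, problem_id, language, label_ids, split, active) "
    "VALUES (?, ?, ?, ?, ?, ?, ?, ?)",
    [
        (1, "print(a - b)", "print(a + b)", "p00000", 5, "[3, 7]", 2, 1),
        (2, "puts a - b", "puts a + b", "p00000", 4, "[3]", 2, 1),
        (3, "print(a * b)", "print(a + b)", "p00000", 5, None, 0, 1),
    ],
)
for bug in load_bugs(conn, "python", "test"):
    print(bug["id"], bug["label_ids"])
```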

evaluations

Tracks evaluation runs (initially empty).

| Column | Type | Description |
| --- | --- | --- |
| id | integer | Primary key |
| name | varchar | Name of the evaluation |
| started_at | datetime(6) | When the evaluation started |
| ended_at | datetime(6) | When the evaluation ended |
| created_at | datetime(6) | Creation timestamp |
| updated_at | datetime(6) | Last update timestamp |

runs

Records individual test runs (initially empty).

| Column | Type | Description |
| --- | --- | --- |
| id | integer | Primary key |
| evaluation_id | integer | Reference to evaluations.id |
| status | integer | Status of the run (0: pass, 1: fail, 2: error, 3: timeout, 4: compilation_error) |
| bug_id | integer | Reference to bugs.id |
| bug_version | integer | Version of the bug (0: buggy, 1: fixed, 2: candidate) |
| bug_variant | integer | Variant of the bug (0: default) |
| candidate_index | integer | Index of candidate |
| test_id | integer | Reference to tests.id |
| error_output | text | Error output if any |
| output | text | Program output |
| created_at | datetime(6) | Creation timestamp |
| updated_at | datetime(6) | Last update timestamp |
| wall_time | decimal | Execution time in seconds |
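Once an evaluation has populated the runs table, the recorded statuses could be aggregated per bug. The sketch below is one possible aggregation, not the project's own tooling: it treats a bug as fixed if at least one candidate (`bug_version = 2`) has only passing runs (`status = 0`). The function name `fixed_bug_ids` and the "any candidate suffices" rule are assumptions, and the query only sees runs that were actually recorded, so it cannot tell whether every test of a problem was executed.

```python
import sqlite3
from typing import Set


def fixed_bug_ids(conn: sqlite3.Connection, evaluation_id: int) -> Set[int]:
    """Return IDs of bugs with at least one candidate whose recorded runs all pass."""
    rows = conn.execute(
        """
        SELECT DISTINCT bug_id
        FROM (
            SELECT bug_id, candidate_index,
                   -- A NULL or non-zero status counts as a failure.
                   SUM(CASE WHEN status = 0 THEN 0 ELSE 1 END) AS failures
            FROM runs
            WHERE evaluation_id = ? AND bug_version = 2
            GROUP BY bug_id, candidate_index
        )
        WHERE failures = 0
        """,
        (evaluation_id,),
    ).fetchall()
    return {bug_id for (bug_id,) in rows}


# In-memory stand-in with made-up runs for evaluation 1.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE runs (id INTEGER PRIMARY KEY, evaluation_id INTEGER, status INTEGER, "
    "bug_id INTEGER, bug_version INTEGER, candidate_index INTEGER, test_id INTEGER)"
)
conn.executemany(
    "INSERT INTO runs (evaluation_id, status, bug_id, bug_version, candidate_index, test_id) "
    "VALUES (?, ?, ?, ?, ?, ?)",
    [
        (1, 0, 10, 2, 0, 100),  # bug 10, candidate 0: one pass ...
        (1, 1, 10, 2, 0, 101),  # ... and one fail
        (1, 0, 10, 2, 1, 100),  # bug 10, candidate 1: passes both tests
        (1, 0, 10, 2, 1, 101),
        (1, 3, 11, 2, 0, 200),  # bug 11, candidate 0: timeout
        (1, 0, 11, 1, 0, 200),  # reference fix passes, but it is not a candidate
    ],
)
print(sorted(fixed_bug_ids(conn, 1)))  # [10]
```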