Commit0

October 24, 2024 · View on GitHub

Updates

Sep 28, 2024:

If you want to use agent with the OpenAI o1 models, please run these installation commands to update packages pip install git+https://github.com/wenting-zhao/aider.git.

Commit0

Commit0 is a from scratch AI coding challenge. Can you create a library from commit 0?

The benchmark consists of 57 core Python libraries. The challenge is to rebuild these libraries and pass their unit tests. All libraries have:

Significant test coverage
Detailed specification and documentation
Lint and type checking

Commit0 is an interactive environment that makes it easy to design and test new agents. You can:

Efficiently run tests in isolated environments
Distribute testing and development across cloud systems
Track and log all changes made throughout.

To install Commit0, run:

pip install commit0

Commit0 provides several commands to facilitate the process of cloning, building, testing, and evaluating repositories. Here's an overview of the available commands:

Setup

Use commit0 setup [OPTIONS] REPO_SPLIT to clone a repository split. Available options include:

Argument	Type	Description	Default
`repo_split`	str	Split of repositories to clone
`--dataset-name`	str	Name of the Huggingface dataset	`wentingzhao/commit0_combined`
`--dataset-split`	str	Split of the Huggingface dataset	`test`
`--base-dir`	str	Base directory to clone repos to	`repos/`
`--commit0-config-file`	str	Storing path for stateful commit0 configs	`.commit0.yaml`

Build

Use commit0 build [OPTIONS] to build the Commit0 split chosen in the Setup stage. Available options include:

Argument	Type	Description	Default
`--num-workers`	int	Number of workers	`8`
`--commit0-config-file`	str	Path to the commit0 dot file	`.commit0.yaml`
`--verbose`	int	Verbosity level (1 or 2)	`1`

Get Tests

Use commit0 get-tests REPO_NAME to get tests for a Commit0 repository.

Argument	Type	Description	Default
`repo_name`	str	Name of the repository to get tests for

Test

Use commit0 test [OPTIONS] REPO_OR_REPO_PATH [TEST_IDS] to run tests on a Commit0 repository. Available options include:

Argument	Type	Description	Default
`repo_or_repo_path`	str	Directory of the repository to test
`test_ids`	str	Test IDs to run
`--branch`	str	Branch to test
`--backend`	str	Backend to use for testing	`modal`
`--timeout`	int	Timeout for tests in seconds	`1800`
`--num-cpus`	int	Number of CPUs to use	`1`
`--reference`	bool	Test the reference commit	`False`
`--coverage`	bool	Get coverage information	`False`
`--rebuild`	bool	Rebuild an image	`False`
`--commit0-config-file`	str	Path to the commit0 dot file	`.commit0.yaml`
`--verbose`	int	Verbosity level (1 or 2)	`1`
`--stdin`	bool	Read test names from stdin	`False`

Evaluate

Use commit0 evaluate [OPTIONS] to evaluate the Commit0 split chosen in the Setup stage. Available options include:

Argument	Type	Description	Default
`--branch`	str	Branch to evaluate
`--backend`	str	Backend to use for evaluation	`modal`
`--timeout`	int	Timeout for evaluation in seconds	`1800`
`--num-cpus`	int	Number of CPUs to use	`1`
`--num-workers`	int	Number of workers to use	`8`
`--reference`	bool	Evaluate the reference commit	`False`
`--coverage`	bool	Get coverage information	`False`
`--commit0-config-file`	str	Path to the commit0 dot file	`.commit0.yaml`
`--rebuild`	bool	Rebuild images	`False`

Lint

Use commit0 lint [OPTIONS] REPO_OR_REPO_DIR to lint files in a repository. Available options include:

Argument	Type	Description	Default
`repo_or_repo_dir`	str	Directory of the repository to test
`--files`	List[Path]	Files to lint (optional)
`--commit0-config-file`	str	Path to the commit0 dot file	`.commit0.yaml`
`--verbose`	int	Verbosity level (1 or 2)	`1`

Save

Use commit0 save [OPTIONS] OWNER BRANCH to save the Commit0 split to GitHub. Available options include:

Argument	Type	Description	Default
`owner`	str	Owner of the repository
`branch`	str	Branch to save
`--github-token`	str	GitHub token for authentication
`--commit0-config-file`	str	Path to the commit0 dot file	`.commit0.yaml`

Agent

Config

Use agent config [OPTIONS] AGENT_NAME to set up the configuration for an agent. Available options include:

Argument	Type	Description	Default
`agent_name`	str	Agent to use, we only support aider for now.	`aider`
`--model-name`	str	LLM model to use, check here for all supported models.	`claude-3-5-sonnet-20240620`
`--use-user-prompt`	bool	Use a custom prompt instead of the default prompt.	`False`
`--user-prompt`	str	The prompt sent to agent.	See code for details.
`--run-tests`	bool	Run tests after code modifications for feedback. You need to set up `docker` or `modal` before running tests, refer to commit0 docs.	`False`
`--max-iteration`	int	Maximum number of agent iterations.	`3`
`--use-repo-info`	bool	Include the repository information.	`False`
`--max-repo-info-length`	int	Maximum length of the repository information to use.	`10000`
`--use-unit-tests-info`	bool	Include the unit tests information.	`False`
`--max-unit-tests-info-length`	int	Maximum length of the unit tests information to use.	`10000`
`--use-spec-info`	bool	Include the spec information.	`False`
`--max-spec-info-length`	int	Maximum length of the spec information to use.	`10000`
`--use-lint-info`	bool	Include the lint information.	`False`
`--max-lint-info-length`	int	Maximum length of the lint information to use.	`10000`
`--pre-commit-config-path`	str	Path to the pre-commit config file. This is needed for running `lint`.	`.pre-commit-config.yaml`
`--agent-config-file`	str	Path to write the agent config.	`.agent.yaml`

Running

Use agent run [OPTIONS] BRANCH to execute an agent on a specific branch. Available options include:

Argument	Type	Description	Default
`branch`	str	Branch to run the agent on, you can specific the name of the branch
`--backend`	str	Test backend to run the agent on, ignore this option if you are not adding `run_tests` option to agent.	`modal`
`--log-dir`	str	Log directory to store the logs.	`logs/aider`
`--max-parallel-repos`	int	Maximum number of repositories for agent to run in parallel. Running in sequential if set to 1.	`1`
`--display-repo-progress-num`	int	Number of repo progress displayed when running.	`5`