README.md

June 26, 2026 · View on GitHub

AIBuildAI – An AI agent that automatically builds AI models

_{🏆 #1 on OpenAI MLE-Bench}

Downloads

News

[6/17/2026] The AIBuildAI Agent 2.5 version is made available to Max users: a live build dashboard, run replay, cross-run memory, MCP research tools, and DeepSeek support. Release notes and download.
[5/26/2026] The AIBuildAI Agent 2.0 version is made available to Pro users.
[5/1/2026] In the TGS Salt Identification Challenge hosted by Kaggle, the model automatically developed by our AIBuildAI Agent ranked in the top 5.7%. Among 3,219 teams composed of human experts, this performance reaches the level of top-tier human AI experts. Model and code developed by the Agent: tasks/tgs-salt-identification-challenge.
[4/27/2026] Excited to announce AIBuildAI Agent 2.0! It has once again achieved #1 on OpenAI's MLE-Bench, reaching a score of 70.7% and substantially outperforming the agents ranked 2nd through 5th. Compared to version 1.0, Agent 2.0 introduces several technical advancements, which we will detail in an upcoming technical report. The 2.0 version will be available to Pro users soon.
[4/24/2026] In a heart disease prediction competition hosted by Kaggle, the model automatically developed by our AIBuildAI Agent ranked in the top 6.6%. Among 4,370 teams composed of human experts, this performance reaches the level of top-tier human AI experts. Model and code developed by the agent: tasks/playground-series-s6e2.

https://github.com/user-attachments/assets/b6043d39-43df-464a-8e25-d24006ba99c8

AIBuildAI is an AI agent that automatically builds AI models. Given a task, it runs an agent loop that analyzes the problem, designs models, writes code to implement them, trains them, tunes hyperparameters, evaluates model performance, and iteratively improves the models. By automating the model development workflow, AIBuildAI reduces much of the manual effort required to build AI models.

AIBuildAI Architecture

Current Results

On OpenAI MLE-Bench, AIBuildAI ranked #1, demonstrating strong performance on real-world AI model building tasks.

MLE-Bench Results

Quick Start

AIBuildAI requires a Linux x86_64 machine (Ubuntu 20.04 or newer).

There are three versions. V2.5 (Max) is the current, most capable version. V2 (Pro) and V1 (free) remain available.

Version	Plan	To run it
V2.5 (Max) — current	Max subscription	`aibuildai login` (Max plan) + a model API key (Anthropic or DeepSeek)
V2 (Pro)	Pro subscription	`aibuildai login` (Pro plan) + an Anthropic API key
V1	free	an Anthropic API key (no account)

Subscriptions are managed at accounts.aibuildai.io; see the account page for the available plans.

V2.5 (Max) — current

Subscribe. Create an account at accounts.aibuildai.io/sign-up and switch to the Max plan.

Install.

curl -fsSL https://github.com/aibuildai/AI-Build-AI/releases/download/v2.5-latest/aibuildai-linux-x86_64.tar.gz | tar xz && ./aibuildai-linux-x86_64-*/install.sh

Log in (required before running):

aibuildai login      # opens a browser to sign in
aibuildai whoami     # should show an active Max plan

Set your model-provider API key. This is separate from the aibuildai subscription above: the subscription (aibuildai login) lets you run the product; this key pays for the AI model calls. aibuildai uses Anthropic (Claude) by default, so set your Anthropic API key:
```
export AIBUILDAI_API_KEY=your-anthropic-api-key
```
DeepSeek is also supported — set a deepseek-* id in the config's llm.model and put your DeepSeek key in the same AIBUILDAI_API_KEY.

Run. V2.5 is driven by a YAML config:

aibuildai config > task.yaml    # writes a starter config with every field
# edit task.yaml: set run.task_name, run.data_root, run.instruction, run.playground_root
aibuildai run task.yaml

Other commands: aibuildai memorize (summarize past runs into memory), aibuildai replay <run-dir> (replay a finished run), aibuildai --help.

V2 (Pro)

Subscribe to the Pro plan at accounts.aibuildai.io/sign-up.

Install.

curl -fsSL https://github.com/aibuildai/AI-Build-AI/releases/download/v2.0-latest/aibuildai-linux-x86_64.tar.gz | tar xz && ./aibuildai-linux-x86_64-*/install.sh

Log in.

aibuildai login
aibuildai whoami     # should show an active Pro plan

Set credentials.
```
export ANTHROPIC_API_KEY=your-api-key
```

Run. V2 is driven by command-line flags:

aibuildai run --task-name <name> --data-dir <path> \
  --playground-dir <path> --instruction "$(cat task.md)" --no-form

Or run aibuildai with no flags to fill in the parameters in an interactive form.

V1 (free)

No account or subscription required.

Install.

curl -fsSL https://github.com/aibuildai/AI-Build-AI/releases/download/v1.0-latest/aibuildai-linux-x86_64.tar.gz | tar xz && ./aibuildai-linux-x86_64-*/install.sh

Set your Anthropic API key.
```
export ANTHROPIC_API_KEY=your-api-key
```

Run with command-line flags:

aibuildai --task-name <name> --data-dir <path> \
  --playground-dir <path> --instruction "$(cat task.md)" --no-form

Or run aibuildai with no flags to fill in the parameters in an interactive form.

Important: run the command directly in your terminal. Do not wrap it in a .sh/.bash script — running it through a script may cause the TUI (Text User Interface) to crash.

Results

After a run completes, the output directory usually looks like (structure may slightly vary by task):

├── candidate_1/  candidate_2/  candidate_3/  # Auto-generated training scripts and model checkpoints
├── checkpoint.pth       # Best model checkpoint
├── inference.py         # Standalone inference script for the final model
├── submission.csv       # Test predictions (if test inputs are provided)
└── progress.pdf         # Visual progress report

The main outputs of an AIBuildAI run are the model checkpoints and the script inference.py, which runs predictions with the final model on any data. When the task data folder includes unlabeled test inputs, AIBuildAI also writes a predicted-label file submission.csv.

Tasks

We provide ready-to-run task markdowns and datasets in the tasks/ folder of this repository. You can also write your own task description and point the run at your own dataset.

git clone https://github.com/aibuildai/AI-Build-AI.git

License

This project is licensed under the MIT License - see the LICENSE file for details.

Citation

@article{zhang2026aibuildai,
    title={AIBuildAI: An AI Agent for Automatically Building AI Models},
    author={Ruiyi Zhang and Peijia Qin and Qi Cao and Li Zhang and Pengtao Xie},
    year={2026},
    journal={arXiv},
    url={https://arxiv.org/abs/2604.14455}
}