RepoLaunch Agent Tutorial

May 31, 2026

Dependencies

Pre-install: Git, Python>=3.12, Docker

RepoLaunch now supports Linux, Windows, and Android builds. Android images are built from Linux images, so their settings are the same as for Linux. Linux and Android images can run on Docker on Linux and on Docker Desktop (Windows/macOS). For helpers to run RepoLaunch with Windows containers, see Development-Windows.md.

pip install -e .

Run RepoLaunch

We provide an example input file, data/examples/dataset.jsonl, and a run config, data/examples/config.json, to help you quickly walk through the launch process.

Before getting started, set your TAVILY_API_KEY environment variable. We use Tavily to give the LLM search engine support.

export TAVILY_API_KEY=...

We use LiteLLM for maximum compatibility across LLM APIs and to enable custom API deployments for agentic training. Export the API key for your LLM provider, for example OPENAI_API_KEY or ANTHROPIC_API_KEY.

export OPENAI_API_KEY=...

We have made launch/launch/utilities/llm.py compatible with both the traditional completion API and the OpenAI Responses API. If your LLM provider requires user identity login for API usage, or needs unusual settings such as Gemini thinking signatures, modify launch/launch/utilities/llm.py.

Start the repo launch process:

launch data/examples/config.json
# equivalently: python -m launch.run --config-path data/examples/config.json

Input

For the input data used to set up the environment, we require the following fields:

| Field | Description |
| --- | --- |
| instance_id | Unique identifier of the instance |
| repo | Full name of the repository like {user_name}/{project_name} |
| base_commit | Commit to check out |
| language | Main language of the repo |
| created_at | (Optional) Creation time of the instance, used to support time-aware environment setup, useful in Python |
| hints | (Optional) Any hints for setting up the repo you want to give the agent, such as GitHub run checks info |
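
As an illustration of the fields above, here is a minimal sketch that builds one input record and writes it in JSON Lines form (one JSON object per line). Every value is a placeholder invented for this example, the `language` spelling and the `created_at` timestamp format are assumptions, and `missing_fields` and `write_dataset` are hypothetical helpers, not part of RepoLaunch.

```python
import json

# Hypothetical instance: all values are placeholders, not taken from
# data/examples/dataset.jsonl.
record = {
    "instance_id": "example-user__example-project-1",
    "repo": "example-user/example-project",
    "base_commit": "0123456789abcdef0123456789abcdef01234567",
    "language": "python",  # assumed spelling of the language value
    "created_at": "2025-01-01T00:00:00Z",  # optional; timestamp format is an assumption
    "hints": "CI runs the unit tests with pytest",  # optional
}

REQUIRED_FIELDS = ("instance_id", "repo", "base_commit", "language")


def missing_fields(rec: dict) -> list[str]:
    """Return the required fields that are absent or empty in a record."""
    return [name for name in REQUIRED_FIELDS if not rec.get(name)]


def write_dataset(records: list[dict], path: str) -> None:
    """Write records as JSON Lines: one JSON object per line."""
    with open(path, "w", encoding="utf-8") as f:
        for rec in records:
            f.write(json.dumps(rec) + "\n")


print(missing_fields(record))
```

Checking for missing required fields before a run can catch malformed records early, before any container is started.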

Run Config

Step 1 Setup

RepoLaunch is a two-step process. The first step sets up the repo: it installs dependencies, builds the repo, and finds test cases to verify the build. The following configs are required.

| Field | Type | Description |
| --- | --- | --- |
| print_to_console | boolean | Whether to print logs to console |
| model_config | dict | All arguments for the LiteLLM response completion call, e.g. {"model": "openai/gpt-5.4", ...}. The "model" field should follow the format in the LiteLLM documentation, usually {provider_name}/{model_name}. Other arguments, such as base_url, temperature, and top_p, also go here. |
| workspace_root | string | Workspace folder for one run |
| dataset | string | Path to the dataset file |
| instance_id | string | Specific instance ID to run; null runs all instances in the dataset |
| first_N_repos | integer | Limit processing to the first N repos (-1 for all repos) |
| max_workers | integer | Number of parallel workers for processing |
| overwrite | boolean | Whether to overwrite existing results (false skips existing repos) |
| os | string | Operating system of the Docker image to build on. Defaults to linux (Linux containers on a Linux machine or WSL). Other options: windows (Windows containers on a Windows host); android (Android containers, which are built from Linux containers, on a Linux machine or WSL). |
| max_trials | integer | Number of setup-verify rounds the agent can attempt; default 1 |
| max_steps_setup | integer | Number of steps the agent can take to set up the environment; default 20 |
| max_steps_verify | integer | Number of steps the agent can take to verify the setup; default 20 |
| cmd_timeout | integer | Time limit, in minutes, for each shell command issued by the LLM; default 30. Suggested: 80 for Linux and 120 for Windows. |
| image_prefix | string | Prefix of the output_image, in the format {namespace}/{dockerhub_repo}; defaults to repolaunch/dev |
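
To make the table concrete, the sketch below assembles a Step 1 config as a Python dict and serializes it to JSON. It is not a copy of data/examples/config.json: the values simply follow the defaults listed above, while `workspace_root`, `max_workers`, and `temperature` are arbitrary choices made for this example.

```python
import json

# Illustrative Step 1 config. Values follow the defaults documented above
# where one is given; the rest are assumptions made for this example.
config = {
    "print_to_console": True,
    "model_config": {
        "model": "openai/gpt-5.4",  # {provider_name}/{model_name}
        "temperature": 0.2,         # arbitrary example value
    },
    "workspace_root": "data/test1",  # arbitrary example folder
    "dataset": "data/examples/dataset.jsonl",
    "instance_id": None,             # serialized as null: run all instances
    "first_N_repos": -1,             # -1 means all repos
    "max_workers": 4,                # arbitrary example value
    "overwrite": False,              # skip repos that already have results
    "os": "linux",
    "max_trials": 1,
    "max_steps_setup": 20,
    "max_steps_verify": 20,
    "cmd_timeout": 30,               # minutes
    "image_prefix": "repolaunch/dev",
}

config_text = json.dumps(config, indent=2)
print(config_text)
```

Note that Python's `None`, `True`, and `False` become `null`, `true`, and `false` in the resulting JSON file.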

Step 2 Organize

RepoLaunch also provides an optional second step to:

  1. Organize the commands to rebuild the repo after edits to the source code;
  2. Organize the commands to test the repo with verbose testcase-status output, and write a Python script to parse that output into a clean testcase-status mapping in JSON format: { "testcase1": "pass", "testcase2": "fail", "testcase3": "skip" };
  3. Make a best effort to find the command to run a single test case separately.
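
The parser script in item 2 is written by the agent for each repo, so its exact form varies. The following is only a sketch of what such a script might look like, assuming verbose output in the style of `pytest -v`; the regular expression, the sample log, and the choice to report ERROR as "fail" are all assumptions made for this example.

```python
import json
import re

# Matches lines such as "tests/test_math.py::test_add PASSED   [ 33%]".
LINE_RE = re.compile(r"^(\S+::\S+)\s+(PASSED|FAILED|ERROR|SKIPPED)\b")

# Assumption: errors are reported as "fail" in the final mapping.
STATUS_MAP = {"PASSED": "pass", "FAILED": "fail", "ERROR": "fail", "SKIPPED": "skip"}


def parse_log(log: str) -> dict[str, str]:
    """Turn verbose test output into a testcase -> status mapping."""
    status: dict[str, str] = {}
    for line in log.splitlines():
        match = LINE_RE.match(line.strip())
        if match:
            status[match.group(1)] = STATUS_MAP[match.group(2)]
    return status


# Hypothetical log excerpt used only to exercise the parser.
sample_log = """\
collected 3 items
tests/test_math.py::test_add PASSED                  [ 33%]
tests/test_math.py::test_div FAILED                  [ 66%]
tests/test_io.py::test_read SKIPPED (no fixture)     [100%]
"""

print(json.dumps(parse_log(sample_log), indent=2))
```

Lines that do not look like a test result, such as the `collected 3 items` header, are ignored, which keeps the mapping clean even when the log contains build noise.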

The configs required for this step:

| Field | Type | Description |
| --- | --- | --- |
| mode | dict | Defaults to {"setup": true, "organize": false}. Set to {"setup": true, "organize": true} to run the two steps together, or to {"setup": false, "organize": true} to run the second step separately AFTER the first step is DONE. |
| max_steps_organize | integer | Number of steps the agent can take to organize the commands; default 20 |

Output

The per-instance output will be saved in {workspace_root}/playground/{instance_id}/result.json.

LLM API logs (input/output/token_count/cost) will be saved in {workspace_root}/playground/{instance_id}/llm/

Step 1 Setup

| Field | Description |
| --- | --- |
| instance_id | Unique identifier of the instance |
| docker_image_layers | {"base_image": ..., "setup_layer": list[commands]}, can be converted to a Dockerfile |
| docker_image | Committed image |
| setup_commands | Records of shell commands used to set up the environment |
| test_commands | Records of shell commands used to run the tests with verbose output |
| duration | Time taken to run the process (in minutes) |
| completed | Boolean indicating whether the execution completed successfully |
| exception | Error message, or null if no exception occurred |

A summary is saved to {workspace_root}/setup.jsonl
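
A quick way to inspect a run is to count how many instances completed. The sketch below assumes the summary file holds one JSON object per line and that each object carries the `instance_id`, `completed`, `duration`, and `exception` fields from the table above; `summarize` is a hypothetical helper and the sample lines are invented.

```python
import json
from typing import Iterable


def summarize(lines: Iterable[str]) -> dict:
    """Aggregate per-instance records read from a JSON Lines summary."""
    records = [json.loads(line) for line in lines if line.strip()]
    failed = {
        rec["instance_id"]: rec.get("exception")
        for rec in records
        if not rec.get("completed")
    }
    return {
        "total": len(records),
        "completed": len(records) - len(failed),
        "failed": failed,  # instance_id -> error message (or None)
        "total_minutes": sum(rec.get("duration") or 0 for rec in records),
    }


# Invented sample lines standing in for the contents of setup.jsonl.
sample_lines = [
    '{"instance_id": "a-1", "completed": true, "duration": 12.5, "exception": null}',
    '{"instance_id": "b-2", "completed": false, "duration": 30.0, "exception": "timeout"}',
]

print(summarize(sample_lines))
```

With a real run, the same function can be fed an open file handle, for example `summarize(open("data/test1/setup.jsonl"))`, where the path is again a placeholder.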

Step 2 Organize

The setup_commands and test_commands from the first step tend to be noisy, containing failed commands and exploratory commands. This is why we designed the second step. The second step adds these fields to the output:

| Field | Description |
| --- | --- |
| docker_image_layers | {"base_image": ..., "setup_layer": list[commands], "organize_layer": list[commands]}, can be converted to a Dockerfile |
| organize_duration | Time taken to run the process (in minutes) |
| organize_completed | Boolean indicating whether the organization attempt completed successfully |
| rebuild_commands | Minimal commands to rebuild the repo instance |
| test_commands | Clean test commands |
| parse | Python script to parse the test output into a testcase-status mapping |
| test_status | Parsed testcase-status mapping in JSON |
| pertest_command | Command to run a single specified test case; may not exist |

A summary is saved to {workspace_root}/organize.jsonl

Helper scripts

To use the launch result

from launch.api import LaunchedInstance
import json
from typing import Literal

# load an instance from organize.jsonl
with open("..../organize.jsonl") as f:
    instance_list = [json.loads(i) for i in f]
instance_dict = instance_list[0]

# Object Oriented API
instance: LaunchedInstance = LaunchedInstance(instance_dict, "linux") # or "windows" for windows image

###### To use the testlog parser to get current test statuses ######
success, build_log = instance.build(verbose = False)
log: str = instance.test()
status: dict[str, Literal['pass', 'fail', 'skip']] = instance.parse_test_log(log)

# Equivalently:
status: dict[str, Literal['pass', 'fail', 'skip']] = instance.build_test_parse(verbose = True)

print(status)
# {"testcase1": "pass", "testcase2": "fail", "testcase3": "skip"}

del instance # to release docker container

###### To evaluate the effect of a new diff patch ######

instance: LaunchedInstance = LaunchedInstance(instance_dict, "linux")

# load your diff_patch
# for example, for swe bench format instance:
diff_patch = instance_dict["test_patch"]

instance.apply_patch(diff_patch, verbose=True)
after_patch_status: dict[str, Literal['pass', 'fail', 'skip']] = instance.build_test_parse(verbose = True)

###### Other Utilities ######

# if you need to save the changes
# (your_message is a placeholder for your own commit message string)
success, log = instance.git_commit(your_message)
instance.commit_to_image(image_name="experiment", tag="1")

# to run a custom bash command inside the docker container
# (your_command is a placeholder for your own shell command string):
res = instance.container.send_command(your_command)
print(res.metadata.exit_code, res.output, sep = "\n")

del instance # to release docker container
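
When evaluating a patch, the interesting part is usually which test cases changed status. The sketch below compares two testcase-status mappings of the shape returned by build_test_parse; `diff_status` is a hypothetical helper, not part of launch.api, and the "missing" label for test cases present on only one side is this example's own convention.

```python
def diff_status(before: dict[str, str], after: dict[str, str]) -> dict[str, tuple[str, str]]:
    """Return {testcase: (old_status, new_status)} for every changed test case."""
    changes: dict[str, tuple[str, str]] = {}
    for name in sorted(set(before) | set(after)):
        old = before.get(name, "missing")  # test case absent before the patch
        new = after.get(name, "missing")   # test case absent after the patch
        if old != new:
            changes[name] = (old, new)
    return changes


# Invented mappings standing in for `status` and `after_patch_status`.
before = {"testcase1": "pass", "testcase2": "fail", "testcase3": "skip"}
after = {"testcase1": "pass", "testcase2": "pass", "testcase4": "fail"}

for name, (old, new) in diff_status(before, after).items():
    print(f"{name}: {old} -> {new}")
```

Test cases that move from "fail" to "pass" are the ones the patch fixed, while "pass" to "fail" transitions point to regressions.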

If the launch was interrupted, you can collect the summary manually

python -m launch.scripts.collect \
    --workspace data/test1 --step setup  # or organize

To upload the result to Docker Hub

docker login

python -m launch.scripts.upload_docker \
    --dataset data/test1/organize.jsonl \
    --clear_after_push 0  # 0 for false and 1 for true

Re-assemble Dockerfile

Reconstruct the Dockerfile of a committed image from instance["docker_image_layers"].

The Dockerfile behavior strictly aligns with that of RepoLaunch-created images. It produces two layers (the setup layer and the organize layer), with failing commands silently bypassed instead of interrupting the build.

# --platform accepts linux or windows
python -m launch.scripts.gen_dockerfile \
    --dataset data/..../organize.jsonl \
    --platform linux \
    --output_dir data/dockerfiles

Android images are also Linux-based, so use --platform=linux for them.
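
To show how a docker_image_layers dict could map onto a Dockerfile, here is a minimal sketch for Linux images. It is not the logic of launch.scripts.gen_dockerfile: `render_dockerfile` is a hypothetical function, the base image and commands are invented, and chaining commands with `;` plus a trailing `true` is just one assumed way to keep a failing command from stopping the build.

```python
def render_dockerfile(layers: dict) -> str:
    """Render a two-layer Dockerfile from a docker_image_layers-style dict."""
    lines = [f"FROM {layers['base_image']}"]
    for key in ("setup_layer", "organize_layer"):
        commands = layers.get(key) or []
        if not commands:
            continue  # e.g. no organize_layer when only Step 1 was run
        # One RUN per layer. `;` continues after a failing command, and the
        # trailing `true` makes the whole RUN exit with status 0.
        script = " ; ".join(list(commands) + ["true"])
        lines.append(f"RUN {script}")
    return "\n".join(lines) + "\n"


# Invented example layers, for illustration only.
example_layers = {
    "base_image": "ubuntu:22.04",
    "setup_layer": ["apt-get update", "pip install -e ."],
    "organize_layer": ["make build"],
}

print(render_dockerfile(example_layers))
```

Emitting one RUN instruction per layer mirrors the two-layer structure described above, at the cost of losing Docker's per-command build cache.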