RepoLaunch Agent Tutorial

May 31, 2026 · View on GitHub

Dependencies

Pre-install: Git, Python>=3.12, Docker

Now RepoLaunch supports Linux, Windows and Android build. Android images are built from Linux images, so the settings are the same as Linux. Linux images and Android images can run on linux docker and Docker Desktop (windows/macos). For helpers to run RepoLaunch on Windows container, see Development-Windows.md

pip install -e .

Run RepoLaunch

We provide an example input file data/examples/dataset.jsonl and a run config data/examples/config.json in examples to help you quickly go through the launch process.

Before getting started, please set your TAVILY_API_KEY environment variable. We use tavily for LLM search engine support.

export TAVILY_API_KEY=...

We use LiteLLM for max compatibility of LLM API, AND to enable custom API deployment for agentic training. Export your LLM API KEY, say OPENAI_API_KEY, ANTHROPIC_API_KEY...

export OPENAI_API_KEY=...

We have made launch/launch/utilities/llm.py compatible to both traditional completion API and OpenAI responses API. If your llm provider requires user identity login for API usage or requires some weird settings like Gemini thinking signaturue, go to modify launch/launch/utilities/llm.py.

Start repo launch process:

launch data/examples/config.json
# equivalently: python -m launch.run --config-path data/examples/config.json

Input

For the input data used to set up the environment, we require the following fields:

Field	Description
`instance_id`	Unique identifier of the instance
`repo`	Full name of the repository like {user_name}/{project_name}
`base_commit`	Commit to check out
`language`	Main language of the repo
`created_at`	(Optional) Creation time of the instance, used to support time-aware environment setup, useful in Python
`hints`	(Optional) Any hints for setting up the repo you want to give the agent, such as GitHub run checks info

Run Config

Step 1 Setup

RepoLaunch is a two step process, the first step is to setup the repo, installing dependencies, build the repo and find test cases to test the build of the repo. The following configs are required.

Field	Type	Description
`print_to_console`	boolean	Whether to print logs to console
`model_config`	dict	Put all arguments for litellm response completion in this dict {"model": "openai/gpt-5.4", ...}. The "model" field should follow formats in litellm document, usually {provider_name}/{model_name}. Put other arguments for litellm response completion here, such as base_url, temperature, top_p.
`workspace_root`	string	Workspace folder for one run
`dataset`	string	Path to the dataset file
`instance_id`	string	Specific instance ID to run, null to run all instances in the dataset
`first_N_repos`	integer	Limit processing to first N repos (-1 for all repos)
`max_workers`	integer	Number of parallel workers for processing
`overwrite`	boolean	Whether to overwrite existing results (false will skip existing repos)
`os`	str	Which docker image os architecture to build on. Default to `linux` -- use linux containers on linux machines or wsl. Can also choose: `windows` -- use windows containers on windows host; `android` -- use android containers which are built from linux containers on linux machines or wsl.
`max_trials`	integer	how many rounds of setup-verify loop agent can attempt, default 1
`max_steps_setup`	integer	how many steps agent can attemp to setup the environment, default 20
`max_steps_verify`	integer	how many steps agent can attemp to verify the setup, default 20
`cmd_timeout`	integer	time limit in minute of llm's each shell command, default 30 min. Suggested: 80 for Linux and 120 for Windows.
`image_prefix`	string	prefix of the output_image in the format {namespace}/{dockerhub_repo}, defaults to repolaunch/dev

Step 2 Organize

RepoLaunch also provides a second optional step to

Organize the commands to rebuild to repo after edits of the source code;
Organize the commands to test the repo with verbose testcase-status output, write a python script to parse the output into clean testcase-status mapping in JSON format: { "testcase1": "pass", "testcase2": "fail", "testcase3": "skip", };
Make best effort to find the command to run a single testcase separately.

The configs required for this step:

Field	Type	Description
`mode`	dict	default to {"setup": true, "organize": false}, set to {"setup": true, "organize": true} to do the two steps together, or set to {"setup": false, "organize": true} to do the second step separately AFTER the first step is DONE
`max_steps_organize`	integer	how many steps agent can attemp to organize the commands, default 20

Output

The per-instance output will be saved in {workspace_root}/playground/{instance_id}/result.json.

LLM API logs (input/output/token_count/cost) will be saved in {workspace_root}/playground/{instance_id}/llm/

Step 1 Setup

Field	Description
`instance_id`	Unique identifier of the instance
`docker_image_layers`	{"base_image": ..., "setup_layer": list[commands]}, can convert to Dockerfile
`docker_image`	Commited Image
`setup_commands`	Records of shell commands used to set up the environment
`test_commands`	Records of shell commands used to run the tests with verbose output
`duration`	Time taken to run the process (in minutes)
`completed`	Boolean indicating whether the execution completed successfully
`exception`	Error message or `null` if no exception occurred

Summary would be saved to {workspace_root}/setup.jsonl

Step 2 Organize

The setup_commands and test_commands of the first step would be noisy, with useless error commands and exploration commands. This is why we design the second step. The second step output would add these fields:

Field	Description
`docker_image_layers`	{"base_image": ..., "setup_layer": list[commands], "organize_layer": list[commands]}, can convert to Dockerfile
`organize_duration`	Time taken to run the process (in minutes)
`organize_completed`	Boolean indicating whether the organization attempt completed successfully
`rebuild_commands`	Minimal commands to rebuild the repo instance
`test_commands`	Clean test commands
`parse`	python script to parse the test output intp testcase-status mapping
`test_status`	Parsed testcase-status mapping in JSON
`pertest_command`	Command to specify a testcase to run, might do not exists

Summary would be saved to {workspace_root}/organize.jsonl

Helper scripts

To use launch result

from launch.api import LaunchedInstance
import json
from typing import Literal

# load an instance from organize.jsonl
with open("..../organize.jsonl") as f:
    instance_list = [json.loads(i) for i in f]
instance_dict = instance_list[0]

# Object Oriented API
instance: LaunchedInstance = LaunchedInstance(instance_dict, "linux") # or "windows" for windows image

###### To use the testlog parser to get current test statuses ######
success, build_log = instance.build(verbose = False)
log: str = instance.test()
status: dict[str, Literal['pass', 'fail', 'skip']] = instance.parse_test_log(log)

# Equivalently:
status: dict[str, Literal['pass', 'fail', 'skip']] = instance.build_test_parse(verbose = True)

print(status)
# {"testcase1": "pass", "testcase2": "fail", "testcase3": "skip"}

del instance # to release docker container

###### To evaluate the effect of a new diff patch ######

instance: LaunchedInstance = LaunchedInstance(instance_dict, "linux")

# load your diff_patch
# for example, for swe bench format instance:
diff_patch = instance_dict["test_patch"]

instance.apply_patch(diff_patch, verbose=True)
after_patch_status: dict[str, Literal['pass', 'fail', 'skip']] = instance.build_test_parse(verbose = True)

###### Other Utilities ######

# if you need to save the changes
success, log = instance.git_commit(your_message)
instance.commit_to_image(image_name="experiment", tag="1")

# for custom bash command into the docker container:
res = instance.container.send_command(your_command)
print(res.metadata.exit_code, res.output, sep = "\n")

del instance # to release docker container

If launch was interrupted, you can collect summary manually

python -m launch.scripts.collect\
    --workspace  data/test1  --step setup  # or organize

To upload the result to dockerhub

docker login

python -m launch.scripts.upload_docker\
    --dataset  data/test1/organize.jsonl\
    --clear_after_push 0 # 0 for false and 1 for true

Re-assemble Dockerfile

Reconstruct Dockerfile of a commited image from instance["docker_image_layers"].

The Dockerfile behavior strictly aligns with that of RepoLaunch-created images. It produces two layers (the setup layer and the organize layer) with error commands silenty bypassed instead of interuptting the build.

python -m launch.scripts.gen_dockerfile  \
    --dataset data/..../organize.jsonl  \
    --platform linux \
    # or windows 
    --output_dir data/dockerfiles

Android images are also linux-arch, so use --platform=linux