Harbor Cookbook

April 23, 2026

Realistic examples of building evals and optimizing agents using Harbor.

Getting Started

Install Harbor:

uv tool install harbor

Run any task recipe:

harbor run -p harbor_cookbook/recipes/<name> -a claude-code -m anthropic/claude-opus-4-6
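To try several recipes with the same agent and model, the command above can be wrapped in a small loop. This is a minimal sketch, not part of Harbor: `RECIPES_DIR`, `AGENT`, `MODEL`, `DRY_RUN`, and `run_recipe` are hypothetical names introduced here, and the recipe names come from the Task Recipes table. By default the script only prints the commands it would run.

```shell
# Sketch: run a few cookbook recipes in sequence.
# All variables and the run_recipe helper are illustrative assumptions,
# not Harbor settings; only the `harbor run -p ... -a ... -m ...` form
# is taken from this README.
set -eu

RECIPES_DIR="${RECIPES_DIR:-harbor_cookbook/recipes}"
AGENT="${AGENT:-claude-code}"
MODEL="${MODEL:-anthropic/claude-opus-4-6}"
DRY_RUN="${DRY_RUN:-1}"   # 1 = print the commands only; 0 = execute them

run_recipe() {
  # $1 is a recipe directory name, e.g. simple-task
  if [ "$DRY_RUN" = "1" ]; then
    echo "harbor run -p $RECIPES_DIR/$1 -a $AGENT -m $MODEL"
  else
    harbor run -p "$RECIPES_DIR/$1" -a "$AGENT" -m "$MODEL"
  fi
}

for name in simple-task multi-container mcp-tools; do
  run_recipe "$name"
done
```

Setting `DRY_RUN=0` in the environment makes the loop invoke `harbor` for each recipe, stopping at the first failure because of `set -e`.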

Task Recipes

| Name | Description |
| --- | --- |
| simple-task | Minimal single-container task. |
| multi-container | Docker Compose task where the agent interacts with a locally hosted REST API. |
| mcp-tools | Giving the agent custom tools via a locally hosted FastMCP server. |
| multi-reward | Multiple independent verifiers each producing their own score. |
| simulated-user | Agent discovers requirements by talking to a simulated user. |
| computer-use-ubuntu | Computer use reference implementation on an Ubuntu virtual desktop. |
| computer-use-windows | Computer use reference implementation on a remote Windows desktop (Daytona). |
| dns-blacklisting | Network-level hostname blacklisting with exact, wildcard, and regex rules. |
| skills | Giving agents access to custom skills. |
| multi-step | Ordered multi-step task with per-step instructions, tests, workdir uploads, healthcheck, early stopping, and per-step artifacts. |

Optimization Examples

| Name | Description |
| --- | --- |
| harbor-rl | RL training on Harbor tasks using harbor.rl + Tinker. |
| gepa | Agent harness optimization for MedAgentBench using Harbor+GEPA. |
| tinker-rl | RL training on Harbor tasks using the Tinker Cookbook integration. |
| prime-rl | RL training on Harbor tasks using Prime RL and Verifiers. |
| sky-rl | RL training on Harbor tasks using SkyRL. |