Loop Factory

June 8, 2026

A simple system for handing coding tasks to AI agents (Claude Code or Codex) and getting reviewable work back, without losing track of what's happening.

If you've ever told an AI "build me this feature," watched it churn out code, and then had no idea whether it actually did what you asked... Loop Factory fixes that. It turns the messy back-and-forth of working with coding agents into a tidy assembly line you can see and control.


The big idea in one picture

Every task you want an agent to do is written down as a spec (a plain markdown file). That spec moves through three folders as work progresses:

   📥 inbox/          🔧 active/           📦 archive/
   ----------         ----------          -----------
   ideas you          stuff an agent      finished work,
   haven't       ──►  is working on   ──► reviewed and
   started yet        right now           accepted

That's it. The folder a spec lives in tells you exactly what state it's in. No dashboard, no database: just files you can read.

Think of it like a kitchen:

  • inbox = order tickets waiting to be cooked
  • active = dishes currently on the stove
  • archive = plates that passed inspection and went out the door

A task only moves forward when you (or an explicit command) say so. The agent cooks; it doesn't decide the menu.
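To make the "folder is the state" idea concrete, here is a minimal sketch of reading state straight off the filesystem. It is not Loop Factory's actual code: the function name `spec_states` and the `*.md` glob are assumptions for illustration.

```python
from pathlib import Path

# The three state folders described above.
STATES = ("inbox", "active", "archive")


def spec_states(specs_root: Path) -> dict[str, str]:
    """Map each spec file name to the state implied by the folder it sits in."""
    states: dict[str, str] = {}
    for state in STATES:
        folder = specs_root / state
        if not folder.is_dir():
            continue
        # rglob also finds specs nested in subfolders of a state folder.
        for path in sorted(folder.rglob("*.md")):
            states[path.name] = state
    return states
```

Nothing is stored anywhere else: delete the function and the state is still there, because it was only ever the directory listing.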


Why this exists (the honest version)

Asking an agent to do one thing is easy. Asking it to do the same kind of thing 50 times, reliably and with every result reviewable, falls apart fast. You lose track of what was asked, what got built, and whether it was any good.

Loop Factory keeps four things straight:

  1. What you asked for lives in a spec file, written before any code is generated.
  2. What the agent did gets reviewed before it counts as done.
  3. The state of everything is just which folder a file is in, so it's always visible.
  4. The AI never decides product direction. It implements and verifies. You decide what to build.

That last point is the whole philosophy: automate the typing and the checking, not the thinking about what's worth building.


The loop, step by step

Here's the full cycle a task goes through. You can do this by hand or let the CLI help.

  1. Write a spec → drop a markdown file in factory/specs/inbox/. It describes the task and how you'll know it's done.
  2. Dispatch it → Loop Factory writes a clean, complete prompt for your agent and moves the spec to active/.
  3. The agent builds it → working against the "acceptance criteria" you wrote (your definition of "done").
  4. Verify → run the checks listed in the spec (usually tests).
  5. Review → a reviewer (you or a review agent) looks at the diff and the test results. Pass or fail.
  6. Archive → if it passed, the spec moves to archive/. If not, it stays in active/ and gets another pass.
  7. Backpropagate → if building the feature taught you something new about how the system works, that learning gets written back into the specs/docs so they stay true.

Steps 2, 5, and 7 are the ones the CLI automates for you.
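The two folder moves in the loop (steps 2 and 6) can be sketched as plain file moves guarded by the spec's current folder. This is an illustration under assumptions, not the CLI's implementation: the function names and the flat archive/ target are made up for the example.

```python
import shutil
from pathlib import Path


def dispatch(spec: Path, specs_root: Path) -> Path:
    """Step 2: move a spec from inbox/ to active/."""
    if spec.parent.name != "inbox":
        raise ValueError(f"{spec.name} is not in inbox/")
    target = specs_root / "active" / spec.name
    target.parent.mkdir(parents=True, exist_ok=True)
    shutil.move(str(spec), str(target))
    return target


def archive(spec: Path, specs_root: Path, accepted: bool) -> Path:
    """Step 6: move a spec from active/ to archive/, but only if review passed."""
    if spec.parent.name != "active":
        raise ValueError(f"{spec.name} is not in active/")
    if not accepted:
        return spec  # failed review: the spec stays in active/ for another pass
    target = specs_root / "archive" / spec.name
    target.parent.mkdir(parents=True, exist_ok=True)
    shutil.move(str(spec), str(target))
    return target
```

Note that neither function lets a spec skip a stage: a file in inbox/ cannot be archived, and a failed review leaves the file exactly where it was.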


Quick start

Install it (Python only, no extra dependencies):

python3 -m pip install -e .

Check your setup and see what's in the queue:

loop-factory doctor      # is everything wired up correctly?
loop-factory scan        # what specs do I have, and what state are they in?

Hand the first inbox spec to an agent:

# Using Claude Code:
loop-factory dispatch --agent claude --stage     # writes the prompt + moves spec to active/
loop-factory review lf-0001 --agent claude        # writes a review prompt
loop-factory backprop --agent claude              # writes an "update the docs" prompt from your changes

# Or using Codex (same commands, just swap the agent):
loop-factory dispatch --agent codex --stage
loop-factory review lf-0001 --agent codex
loop-factory backprop --agent codex

Run built-in loops:

loop-factory loops list
loop-factory loops show spec-grill-gate
loop-factory loops prompt active-spec-verify --agent codex

Note on --execute: By default, these commands just write a prompt file for you to use; they don't run the AI. Add --execute to actually run your local codex or claude CLI (you'll need it installed and logged in). Start without --execute so you can see what's being generated.
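The "write by default, run only on request" behavior is a common dry-run pattern. Here is a rough sketch of it, assuming a hypothetical helper `write_or_run` and a caller-supplied `agent_cmd`; the real command line Loop Factory uses for claude or codex is not shown in this README, so none is hard-coded here.

```python
import subprocess
from pathlib import Path
from typing import Optional, Sequence


def write_or_run(
    prompt_text: str,
    prompt_path: Path,
    agent_cmd: Sequence[str],
    execute: bool = False,
) -> Optional[int]:
    """Always write the prompt file; run the agent only when execute=True."""
    prompt_path.parent.mkdir(parents=True, exist_ok=True)
    prompt_path.write_text(prompt_text, encoding="utf-8")
    if not execute:
        return None  # default: nothing runs, you just get a file to inspect
    # agent_cmd is a placeholder for however your agent CLI accepts a prompt.
    result = subprocess.run([*agent_cmd, prompt_text], check=False)
    return result.returncode
```

Because the file is written in both modes, you can always read exactly what the agent was (or would have been) given.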


Writing a spec

A spec is a markdown file with a small header (frontmatter) and a body. Here's the header:

---
id: lf-0001                      # unique id for this task
title: Add user login           # short description
agent: any                       # codex, claude, or any
risk: medium                     # low / medium / high: how careful to be
verification:                    # commands that prove it works
  - python3 -m unittest discover
---

The body should cover four things, in plain language:

  • Context: what problem this solves and why now.
  • Acceptance Criteria: the checklist that means "done." Be specific; this is what the agent builds toward and what the reviewer checks against.
  • Constraints: what not to do (e.g. "don't add new dependencies").
  • Review Notes: what the reviewer should look at, and any known risks.

There's a ready-made template at factory/templates/spec.md; copy it.

Before you dispatch a spec, it helps to interrogate it: Who owns this decision? What's explicitly out of scope? What's the riskiest assumption? What's the smallest version that would be acceptable? Answer those, write them down under a # Grill Gate section, and you'll get far better results, because a vague spec produces vague code. The template includes these prompts, and loop-factory doctor will flag specs that skipped the gate.
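A check like the one doctor performs could look roughly like this. It is a guess at the idea, not the real check: it assumes each body section (including Grill Gate) appears as a markdown heading, and `missing_sections` is a hypothetical name.

```python
import re

# Section titles taken from the list above, plus the Grill Gate.
REQUIRED_SECTIONS = (
    "Context",
    "Acceptance Criteria",
    "Constraints",
    "Review Notes",
    "Grill Gate",
)

# Matches markdown headings such as "# Grill Gate" or "## Context".
HEADING = re.compile(r"^#{1,6}[ \t]+(.+?)[ \t]*$", flags=re.MULTILINE)


def missing_sections(body: str) -> list[str]:
    """Return the required section titles that have no heading in the spec body."""
    found = {match.group(1).lower() for match in HEADING.finditer(body)}
    return [name for name in REQUIRED_SECTIONS if name.lower() not in found]
```

An empty result means every section is at least present; whether the answers under them are any good is still a human call.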


The full command list

loop-factory init                            # create the factory/ folders
loop-factory scan                            # list inbox + active specs
loop-factory scan --json                     # same, machine-readable
loop-factory prompt <spec> --agent claude    # generate one implementation prompt
loop-factory dispatch --agent claude --stage # prompt + move to active
loop-factory review <id> --agent claude      # generate a review prompt
loop-factory archive <id> --accepted         # move a passed spec to archive
loop-factory backprop --agent claude         # generate a "sync docs with code" prompt
loop-factory loops list                      # list built-in loops
loop-factory loops show <loop-id>            # inspect one built-in loop
loop-factory loops prompt <loop-id>          # generate a loop prompt
loop-factory doctor                          # health-check your setup

dispatch takes --limit N to process more than one inbox spec at a time. archive requires --accepted: a deliberate speed bump so nothing gets archived without you confirming it passed review.
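The "speed bump" is easy to picture as a required flag. Below is a sketch of how two of these subcommands might be declared with argparse; it mirrors the flags listed above but is not the project's actual parser, and the `--limit` default of 1 and the required `--agent` are assumptions.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(prog="loop-factory")
    sub = parser.add_subparsers(dest="command", required=True)

    dispatch = sub.add_parser("dispatch", help="prompt + move to active")
    dispatch.add_argument("--agent", choices=("claude", "codex"), required=True)
    dispatch.add_argument("--stage", action="store_true")
    # Assumed default: one spec per dispatch unless --limit says otherwise.
    dispatch.add_argument("--limit", type=int, default=1)

    archive = sub.add_parser("archive", help="move a passed spec to archive")
    archive.add_argument("spec_id")
    # The speed bump: argparse refuses to continue if --accepted is missing.
    archive.add_argument("--accepted", action="store_true", required=True)
    return parser
```

With this shape, `archive lf-0001` on its own exits with a usage error, while `archive lf-0001 --accepted` parses cleanly.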

You can also run it without installing, straight from the repo:

python3 bin/loop-factory scan

What's in the box

factory/
  specs/
    inbox/      # tasks waiting to start
    active/     # tasks in progress
    archive/    # finished, accepted tasks (organized by year)
  prompts/      # the prompts Loop Factory generates for agents
  runs/         # a record of each dispatch (which spec, which agent, when)
  reviews/      # records of accepted reviews
  templates/    # starter templates for specs and reviews
  logs/         # run logs

The prompts/, runs/, and reviews/ folders are generated artifacts: Loop Factory's paper trail. You read them; you don't usually edit them. The source of truth is always the spec.
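For a feel of what init sets up, here is a minimal sketch that creates the tree shown above and computes a by-year archive location. The helper names are hypothetical, and the exact shape of the year folder (archive/YYYY/) is an assumption based on the "organized by year" note.

```python
from datetime import date
from pathlib import Path
from typing import Optional

# Folder names copied from the layout above.
FOLDERS = (
    "specs/inbox",
    "specs/active",
    "specs/archive",
    "prompts",
    "runs",
    "reviews",
    "templates",
    "logs",
)


def init_factory(root: Path) -> Path:
    """Create the factory/ tree under root; safe to run more than once."""
    factory = root / "factory"
    for rel in FOLDERS:
        (factory / rel).mkdir(parents=True, exist_ok=True)
    return factory


def archive_path(factory: Path, spec_name: str, today: Optional[date] = None) -> Path:
    """Where an accepted spec would land, assuming one subfolder per year."""
    year = (today or date.today()).year
    return factory / "specs" / "archive" / str(year) / spec_name
```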


Works with both Claude Code and Codex

Loop Factory ships native integrations for both agents so they feel at home in this repo:

Claude Code: skills in .claude/skills/, subagents in .claude/agents/, a loop prompt at .claude/loop.md, and hooks in .claude/settings.json.

Codex: skills in .agents/skills/, agents in .codex/agents/, hooks in .codex/hooks.json, and a GitHub Action at .github/workflows/codex-review.yml for automated PR reviews.

You don't need to understand these to use Loop Factory; they're there so the agents already know the rules of this repo when they start working.


Don't have the agent CLIs yet?

Codex CLI:

curl -fsSL https://chatgpt.com/codex/install.sh | sh
# or
npm install -g @openai/codex

Claude Code:

curl -fsSL https://claude.ai/install.sh | bash
# or
brew install --cask claude-code

Full setup notes: docs/install.md.


Going deeper


The one rule to remember

Agents implement and verify. You decide what's worth building.

Everything in Loop Factory (the folders, the specs, the separate review gate) exists to keep that line clear. It's software-factory discipline for the AI era, without pretending the AI should run the company.