NativeAppTemplate Agent

May 9, 2026

A Claude Code agent that turns a natural-language spec — something as informal as "a walk-in queue for a barbershop" — into a working three-platform implementation:

  • Rails 8.1 API
  • native SwiftUI iOS
  • native Jetpack Compose Android

Coherent across all three, in under an hour.


npx nativeapptemplate-agent "a walk-in clinic queue for small veterinary practices"

Status: v0.1.2 stable. First built during the "Built with Opus 4.7: a Claude Code Hackathon" event (April 21–27, 2026); shipped to npm post-hackathon. Verified end-to-end on the three demo specs below from a fresh /tmp/ cwd. Stage 2 (scripted-CRUD walk-through via mobile-mcp against a live Rails server) and paid-edition parity validation landed post-launch. Active development continues; see the roadmap for what's next.


Why this exists

Most "AI builds an app" tools stop at a single web frontend. The real pain for anyone shipping a mobile product is that the same domain has to be implemented three times — multi-tenant Rails API, native iOS, native Android — each with its own idioms, and keeping them consistent is where weeks disappear.

Classic mobile boilerplates sell "save 12–16 weeks of setup." AI coding tools have compressed that value to 2–3 weeks of AI-assisted work. The durable problem that remains — even with AI — is cross-platform coherence: keeping a Rails API, a native iOS client, and a native Android client all consistent under iteration, with no contract drift, no forgotten rename, no divergent localized copy.

This agent is an answer to that: turn a boilerplate into a generator that produces coherent three-platform implementations on demand, with structural and semantic validation built in.

What it does

Point the agent at a natural-language spec:

a walk-in clinic queue for small veterinary practices

It will:

  1. Parse the spec into a structured domain (entities, fields, relationships, state machines).
  2. Copy the free-edition substrate (three MIT-licensed repos covering Rails + iOS + Android) into ./out/<spec-slug>/{rails,ios,android}/.
  3. Rename the skeleton: Shop → Clinic, Shopkeeper → Vet, etc., consistently across Ruby migrations, Swift models, Kotlin data classes, policies, tests, and localized copy.
  4. Adapt or replace the domain module — keep ItemTag for walk-in queue variants; strip and insert a new resource for non-queue SaaS.
  5. Drive the build green: bin/rails test, xcodebuild test, and ./gradlew test must all pass before the agent exits.
  6. Validate the output across three layers (structural, runtime, semantic). Details in docs/SPEC.md section 6.
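
The rename step is where ordering bugs tend to creep in, since Shopkeeper contains Shop as a prefix. The following is a minimal sketch of one way to apply a rename map across casing variants in a single pass. All helper names here (renameTokens, buildVariants) are hypothetical and are not taken from the agent's source; snake_case variants such as item_tag are deliberately left out.

```typescript
// Hypothetical helpers for illustration; not the agent's actual rename pipeline.
type RenameMap = Record<string, string>;

function lowerFirst(s: string): string {
  return s.charAt(0).toLowerCase() + s.slice(1);
}

function escapeRegExp(s: string): string {
  return s.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Expand each PascalCase pair into the casing variants that commonly show up
// in Ruby, Swift, and Kotlin sources (Shop, shop, SHOP).
function buildVariants(map: RenameMap): Record<string, string> {
  const variants: Record<string, string> = {};
  for (const from of Object.keys(map)) {
    const to = map[from];
    variants[from] = to;
    variants[lowerFirst(from)] = lowerFirst(to);
    variants[from.toUpperCase()] = to.toUpperCase();
  }
  return variants;
}

function renameTokens(source: string, map: RenameMap): string {
  const variants = buildVariants(map);
  // Longest keys first, so "Shopkeeper" is tried before its prefix "Shop".
  const keys = Object.keys(variants).sort((a, b) => b.length - a.length);
  const pattern = new RegExp(keys.map(escapeRegExp).join("|"), "g");
  // One pass over the text: already-renamed output is never matched again.
  return source.replace(pattern, (match) => variants[match]);
}

const renamed = renameTokens(
  "class Shopkeeper; has_many :shops; SHOP_LIMIT = 5",
  { Shop: "Clinic", Shopkeeper: "Vet" },
);
// renamed === "class Vet; has_many :clinics; CLINIC_LIMIT = 5"
```

This sketch matches substrings, so an unrelated word such as Workshop would also be rewritten; a real pipeline would likely need identifier-boundary awareness on top of it.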

Demo

https://github.com/user-attachments/assets/bd1ed091-93d8-45d7-b502-c21720218484

90-second end-to-end run: spec → renamed Rails API + iOS app + Android app, all three platforms validated. Also on YouTube for full-screen viewing.

Three demo specs, covering both the adapt and replace paths, with all three validation layers and the Reviewer green end-to-end:

| Spec | Domain entity (post-rename) | Path | Result |
| --- | --- | --- | --- |
| "a walk-in clinic queue for small veterinary practices" | ItemTag → Patient, Shop → Clinic, Shopkeeper → Vet | adapt | Layer 1 3/3 · Layer 2 3/3 · Layer 3 2/2 · Reviewer PASS |
| "a restaurant waitlist for casual dining" | Shop → Restaurant, Shopkeeper → Host | adapt | Layer 1 3/3 · Layer 2 3/3 · Layer 3 2/2 · Reviewer PASS |
| "a personal task tracker with due dates" | ItemTag → Todo (replaces queue entry entirely) | replace | Layer 1 3/3 · Layer 2 3/3 · Layer 3 2/2 · Reviewer PASS |

Layer 2 ran in build mode: real xcodebuild build and ./gradlew assembleDebug, with full app builds installed on the iPhone 17 simulator and Android emulator. Layer 3 captured the home screen via xcrun simctl io booted screenshot / adb exec-out screencap and judged it against the rubric using Opus 4.7 vision (median of 3 samples per criterion). With NATIVEAPPTEMPLATE_VISUAL=2, Layer 2 additionally drives a scripted-CRUD walk-through via mobile-mcp against a live Rails server (Welcome → Sign Up → email-confirm via bin/rails runner → Sign In → drill into the auto-seeded sample), and Layer 3 then judges the post-walk screenshot against a domain-content rubric.
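
Taking the median of 3 samples per criterion damps a single outlier judgment. A minimal sketch of that aggregation follows. It assumes each sample is a numeric score in [0, 1] and that a criterion passes at or above a threshold; the README states neither the scale nor the threshold, and the criterion names below are made up for illustration.

```typescript
// Assumption: scores are numbers in [0, 1]; the real judge's scale is not documented here.
function median(values: number[]): number {
  if (values.length === 0) {
    throw new Error("median of an empty list is undefined");
  }
  const sorted = values.slice().sort((a, b) => a - b);
  const mid = Math.floor(sorted.length / 2);
  return sorted.length % 2 === 1
    ? sorted[mid]
    : (sorted[mid - 1] + sorted[mid]) / 2;
}

type CriterionSamples = Record<string, number[]>;

// A criterion passes when the median of its samples reaches the threshold.
function judgeRubric(
  samples: CriterionSamples,
  threshold: number,
): Record<string, boolean> {
  const verdicts: Record<string, boolean> = {};
  for (const criterion of Object.keys(samples)) {
    verdicts[criterion] = median(samples[criterion]) >= threshold;
  }
  return verdicts;
}

// Hypothetical criteria and scores.
const verdicts = judgeRubric(
  {
    "shows domain nouns": [0.9, 0.2, 0.8], // median 0.8, one outlier ignored
    "no substrate tokens": [0.4, 0.3, 0.9], // median 0.4
  },
  0.7,
);
// verdicts["shows domain nouns"] === true
// verdicts["no substrate tokens"] === false
```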

The agent works on either the free (MIT) edition or the paid edition without code changes — the same pipeline handles both substrates; multi-tenant features (org switching, invitations, role permissions) survive the rename pipeline when targeting paid. The agent tests paid first because free is a strict subset of paid; running paid first catches regressions that wouldn't surface against free alone.

Both screenshots are real captures from the booted iOS Simulator and Android emulator post-./gradlew assembleDebug / xcodebuild build, after the agent installed and launched the generated app.

Architecture

flowchart LR
    Spec["Natural-language spec<br/>(e.g. walk-in clinic queue)"] --> Agent
    subgraph Agent["Claude Code Agent · Opus 4.7"]
      Planner --> Workers
      Workers --> Reviewer
      Reviewer --> Judge
    end
    Substrate[("Free-edition substrate<br/>Rails · iOS · Android<br/>READ-ONLY")] -. copy .-> Out
    Agent --> Out["./out/[slug]/<br/>rails / ios / android"]
    Out --> L1["Layer 1 — Structural"]
    L1 --> L2["Layer 2 — Runtime + mobile-mcp"]
    L2 --> L3["Layer 3 — Vision judge"]
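
The layer chain in the diagram can be read as a fail-fast sequence. The sketch below shows that reading with stubbed layers. The stop-on-first-failure behaviour across all three layers is an assumption made for illustration, and the Layer and runLayers names are hypothetical.

```typescript
// Hypothetical types; the agent's real orchestration is described in docs/SPEC.md.
interface LayerOutcome {
  passed: boolean;
  detail: string;
}

interface Layer {
  name: string;
  run: () => LayerOutcome;
}

interface LayerResult extends LayerOutcome {
  layer: string;
}

// Run layers in order and stop at the first failure, so that a cheap
// structural check can veto the expensive runtime and vision layers.
function runLayers(layers: Layer[]): LayerResult[] {
  const results: LayerResult[] = [];
  for (const layer of layers) {
    const outcome = layer.run();
    results.push({ layer: layer.name, passed: outcome.passed, detail: outcome.detail });
    if (!outcome.passed) {
      break; // later layers are skipped
    }
  }
  return results;
}

// Stubbed example: Layer 2 fails, so Layer 3 never runs.
const executed: string[] = [];
const pipelineResults = runLayers([
  { name: "structural", run: () => { executed.push("L1"); return { passed: true, detail: "no leftover tokens" }; } },
  { name: "runtime", run: () => { executed.push("L2"); return { passed: false, detail: "build failed" }; } },
  { name: "semantic", run: () => { executed.push("L3"); return { passed: true, detail: "not reached" }; } },
]);
// pipelineResults.length === 2, executed is ["L1", "L2"]
```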

Substrate

The agent operates on the free, MIT-licensed edition of NativeAppTemplate — three public repos:

| Repo | Stack | LOC |
| --- | --- | --- |
| nativeapptemplateapi | Rails 8.1, PostgreSQL, devise_token_auth, pundit, acts_as_tenant | 7,687 (Ruby) |
| NativeAppTemplate-Free-iOS | 100% SwiftUI, @Observable, MVVM, Liquid Glass design, iOS 26.2+ | 15,311 (Swift) |
| NativeAppTemplate-Free-Android | 100% Kotlin, 100% Jetpack Compose, Hilt, Retrofit2, API 26+ | 19,521 (Kotlin) |

Combined ~42.5k LOC. Extracted from MyTurnTag Creator, a walk-in queue-management SaaS live on both app stores since 2024.

Usage

# Standalone CLI
npx nativeapptemplate-agent "a walk-in clinic queue for small veterinary practices"

# Stretch specs the agent is also designed to handle
npx nativeapptemplate-agent "a restaurant waitlist for casual dining"
npx nativeapptemplate-agent "a personal task tracker with due dates"

# Generated output appears under ./out/<slug>/
tree ./out/clinic-queue/
# ├── rails/      ← Rails 8.1 API, git-initialized, buildable
# ├── ios/        ← SwiftUI iOS project, buildable
# └── android/    ← Jetpack Compose Android project, buildable

The agent will also be available as a Claude Code plugin.

Requirements

  • Node.js 22+
  • Claude Agent SDK v0.2.111 or later (needed for Opus 4.7)
  • An Anthropic API key with access to claude-opus-4-7, exported as ANTHROPIC_API_KEY:
    export ANTHROPIC_API_KEY="sk-ant-..."
    
    The Anthropic SDK reads this env var automatically; no other config is required. See Security below for storage recommendations.
  • Local checkouts of the three substrate repos, referenced via environment variables:
    export NATIVEAPPTEMPLATE_API="/path/to/nativeapptemplateapi"
    export NATIVEAPPTEMPLATE_IOS="/path/to/NativeAppTemplate-Free-iOS"
    export NATIVEAPPTEMPLATE_ANDROID="/path/to/NativeAppTemplate-Free-Android"
    
    A starter .env.example lists all the variables in one place.
  • For runtime validation (Layer 2 onwards): Xcode 26.3+ with iOS 26.2+ simulator, Android SDK with API 26+ emulator
  • For UI automation: mobile-next/mobile-mcp (installed automatically as a Claude Code MCP server)
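
Before a first run it can help to confirm that the variables listed above are actually set. The following is a minimal preflight sketch, not the agent's own startup check. It only tests for presence (not that the paths exist), and it takes the environment as a parameter so it can be exercised without touching the real process environment.

```typescript
type Env = Record<string, string | undefined>;

// Variable names are taken from the requirements list above.
const SUBSTRATE_VARS = [
  "NATIVEAPPTEMPLATE_API",
  "NATIVEAPPTEMPLATE_IOS",
  "NATIVEAPPTEMPLATE_ANDROID",
];

// Return the names of required variables that are unset or empty.
function missingRequirements(env: Env): string[] {
  const missing: string[] = [];
  // Either key variable is accepted; the dedicated one is optional.
  if (!env["ANTHROPIC_API_KEY"] && !env["NATIVEAPPTEMPLATE_AGENT_ANTHROPIC_KEY"]) {
    missing.push("ANTHROPIC_API_KEY");
  }
  for (const name of SUBSTRATE_VARS) {
    if (!env[name]) {
      missing.push(name);
    }
  }
  return missing;
}

const stillMissing = missingRequirements({
  ANTHROPIC_API_KEY: "sk-ant-...",
  NATIVEAPPTEMPLATE_API: "/path/to/nativeapptemplateapi",
});
// stillMissing is ["NATIVEAPPTEMPLATE_IOS", "NATIVEAPPTEMPLATE_ANDROID"]
```

In a Node script this could be called as missingRequirements(process.env).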

Optional flags

  • NATIVEAPPTEMPLATE_VISUAL=1 — opts the run into Stage 1 visual judging (Layer 3). When set, Layer 2 runs in build mode instead of fast mode (full xcodebuild build + ./gradlew assembleDebug), then for each platform the agent installs the app on the booted sim/emulator, captures the home screen, and judges it with Opus 4.7 vision against DEFAULT_STAGE1_RUBRIC. Adds 60-180s per platform depending on cold-build time. Requires a sim/emulator booted for each platform you want judged. Off by default — npm run dev keeps the existing fast path.
  • NATIVEAPPTEMPLATE_VISUAL=2 — implies =1 and additionally runs Stage 2: the agent boots the generated Rails app under mise exec -- bin/dev (after bundle install + bin/rails db:prepare + bin/rails db:seed_fu), waits for it to listen, then drives the iOS sim and Android emulator through the parameterized queue scenario (Sign Up → email-confirm via bin/rails runner → Sign In → drill into auto-seeded sample). Layer 3 then judges the last captured screenshot against DEFAULT_STAGE2_RUBRIC (domain content + no substrate-token leak). Adds 2–4 minutes per platform on top of =1. Requires both sims/emulators booted and the substrate's mise toolchain installed for bin/dev.
  • NATIVEAPPTEMPLATE_BRIDGE=off — skip writing to ~/.gradle/gradle.properties. The agent normally mirrors NATIVEAPPTEMPLATE_API_* (HOST/PORT/SCHEME) into renamed-product variants (<PRODUCT>_API_*) at run time so the generated Android app picks them up via gradle.properties and the iOS sim launch picks them up via SIMCTL_CHILD_*. Set this to disable the file write (process.env injection still runs for child-spawn paths).
  • NATIVEAPPTEMPLATE_BRIDGE_DRY_RUN=1 — log what would be written to ~/.gradle/gradle.properties instead of writing. Useful before granting the bridge write access to your user-global gradle.
  • NATIVEAPPTEMPLATE_AGENT_ANTHROPIC_KEY — dedicated workspace key, see Security.
  • ANDROID_SERIAL: when more than one Android device/emulator is attached (e.g. a physical device plus a running emulator), standard adb practice is to set ANDROID_SERIAL=<serial> to disambiguate. The agent honors this transparently because it runs adb directly. Run adb devices to list serials. If this isn't set, visual-judge runs with multiple Android targets attached fail with adb's "more than one device/emulator" error.
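
The relationship between the two NATIVEAPPTEMPLATE_VISUAL values (=2 implies =1, and either one switches Layer 2 to build mode) can be sketched as a small decoding function. The function and type names are hypothetical, and treating any other value as "off" is an assumption; the README does not say how unexpected values are handled.

```typescript
interface VisualPlan {
  layer2Mode: "fast" | "build";
  stage1: boolean; // home-screen capture and vision judging
  stage2: boolean; // live Rails server plus scripted walk-through
}

// Decode the raw NATIVEAPPTEMPLATE_VISUAL value into the stages it enables.
function planVisual(value: string | undefined): VisualPlan {
  const stage2 = value === "2";
  const stage1 = stage2 || value === "1"; // =2 implies =1
  return {
    layer2Mode: stage1 ? "build" : "fast",
    stage1,
    stage2,
  };
}

const defaultPlan = planVisual(undefined); // fast path, no visual judging
const fullPlan = planVisual("2"); // build mode, Stage 1 and Stage 2
```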

The agent resolves adb to a known-good binary in this priority order: $ANDROID_HOME/platform-tools/adb, $ANDROID_SDK_ROOT/platform-tools/adb, ~/Library/Android/sdk/platform-tools/adb (Android Studio default), /Applications/android-sdk-macosx/platform-tools/adb, /opt/homebrew/bin/adb, /usr/local/bin/adb, then PATH. This avoids surprises like a stale ~/.apportable/SDK/bin/adb (i386, won't exec on Apple Silicon) shadowing a working adb on PATH.
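
The priority order above amounts to "first existing candidate wins, else fall back to PATH". A minimal sketch under that reading follows. The function names are hypothetical, and the existence check is injected as a parameter so the logic can be tested without touching the filesystem.

```typescript
type Env = Record<string, string | undefined>;

// Build the candidate list in the documented priority order.
function adbCandidates(env: Env, home: string): string[] {
  const candidates: string[] = [];
  const androidHome = env["ANDROID_HOME"];
  if (androidHome) {
    candidates.push(`${androidHome}/platform-tools/adb`);
  }
  const sdkRoot = env["ANDROID_SDK_ROOT"];
  if (sdkRoot) {
    candidates.push(`${sdkRoot}/platform-tools/adb`);
  }
  candidates.push(
    `${home}/Library/Android/sdk/platform-tools/adb`, // Android Studio default
    "/Applications/android-sdk-macosx/platform-tools/adb",
    "/opt/homebrew/bin/adb",
    "/usr/local/bin/adb",
  );
  return candidates;
}

// Return the first candidate that exists, or plain "adb" to defer to PATH.
function resolveAdb(
  env: Env,
  home: string,
  exists: (path: string) => boolean,
): string {
  for (const candidate of adbCandidates(env, home)) {
    if (exists(candidate)) {
      return candidate;
    }
  }
  return "adb";
}

// Example with a fake filesystem where only the Homebrew binary exists.
const chosenAdb = resolveAdb(
  { ANDROID_HOME: "/sdk" },
  "/Users/me",
  (path) => path === "/opt/homebrew/bin/adb",
);
// chosenAdb === "/opt/homebrew/bin/adb"
```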

Validation (three layers)

The agent doesn't just generate code and exit — it validates the output.

  1. Structural. ripgrep for leftover domain tokens; OpenAPI contract parity check between Rails, iOS networking, and Android repository layers. A silent rename inconsistency fails the run before any tests execute.
  2. Runtime. Verify the generated Rails app boots (bin/rails runner 'puts "OK"'); type-check or build the iOS and Android apps. With NATIVEAPPTEMPLATE_VISUAL=1, escalate to a full build (xcodebuild build + ./gradlew assembleDebug) and install on the booted sim/emulator. With =2, additionally boot the live Rails server and drive a scripted CRUD walk-through via mobile-next/mobile-mcp. Any 4xx/5xx or unhandled client error fails the run.
  3. Semantic. Opus 4.7 as judge — scores whether the generated code and rendered UI actually express the intended domain. Vision judges read simulator/emulator screenshots directly.
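
To make the structural layer concrete, here is a minimal sketch of a leftover-token scan. It swaps ripgrep for a plain in-memory substring search over a map of file contents, purely so the example is self-contained; the Leftover type and findLeftoverTokens name are hypothetical.

```typescript
interface Leftover {
  file: string;
  line: number; // 1-based line number
  token: string;
}

// Report every line that still contains a pre-rename token (case-sensitive).
function findLeftoverTokens(
  files: Record<string, string>,
  tokens: string[],
): Leftover[] {
  const hits: Leftover[] = [];
  for (const file of Object.keys(files)) {
    const lines = files[file].split("\n");
    lines.forEach((text, index) => {
      for (const token of tokens) {
        if (text.indexOf(token) !== -1) {
          hits.push({ file, line: index + 1, token });
        }
      }
    });
  }
  return hits;
}

// Hypothetical generated file with one missed rename on line 2.
const leftovers = findLeftoverTokens(
  { "app/models/clinic.rb": "class Clinic\n  has_many :shopkeepers\nend" },
  ["Shop", "shopkeeper"],
);
// leftovers has one entry: line 2, token "shopkeeper"
```

A non-empty result would fail the run before any build or test step starts.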

See docs/SPEC.md for the full design.

Security

ANTHROPIC_API_KEY is the only sensitive secret the agent needs.

Workspace isolation (optional but recommended). If you also use Claude Code or other Anthropic SDK apps, create a dedicated workspace with its own key + spend cap and export it as NATIVEAPPTEMPLATE_AGENT_ANTHROPIC_KEY. The agent prefers that var over ANTHROPIC_API_KEY when set, so a runaway loop hits the workspace cap instead of your overall tier limit, and revoking it doesn't break Claude Code login.
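
The preference order described above is simple enough to sketch in a few lines. The resolveApiKey name is hypothetical; only the two variable names come from this README.

```typescript
type Env = Record<string, string | undefined>;

// Prefer the dedicated workspace key; fall back to the general one.
function resolveApiKey(env: Env): string | undefined {
  return env["NATIVEAPPTEMPLATE_AGENT_ANTHROPIC_KEY"] || env["ANTHROPIC_API_KEY"];
}

const pickedKey = resolveApiKey({
  ANTHROPIC_API_KEY: "general-key",
  NATIVEAPPTEMPLATE_AGENT_ANTHROPIC_KEY: "workspace-key",
});
// pickedKey === "workspace-key"
```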

Recommended storage (best to most convenient):

  • macOS Keychain via 1Password CLI: op read "op://Personal/Anthropic/key" resolved at session start; no key on disk in plaintext.
  • direnv — per-project .envrc, loaded only when you cd in. Keep .envrc outside any git-tracked dotfiles repo, or .gitignore it.
  • A gitignored secrets file sourced from your shell rc — e.g. [ -r ~/.config/zsh/secrets.zsh ] && source ~/.config/zsh/secrets.zsh. Set chmod 600 on the file.
  • .env next to the project — copy .env.example to .env (already gitignored along with .env*.local) and the agent loads it on startup. Shell exports take precedence over .env, so you can override per-run with FOO=x npm run dev. Lowest friction; easiest to leak — avoid in shared repos. chmod 600 .env on shared machines.

Don't paste a real key into shell history (HISTFILE captures it), commit a .env, or echo the key into a non-private channel.

The agent strips ANTHROPIC_API_KEY, ANTHROPIC_AUTH_TOKEN, and NATIVEAPPTEMPLATE_AGENT_ANTHROPIC_KEY from the environment of every subprocess it spawns: Ruby scripts, git, psql, xcodebuild, gradlew, the future mobile-mcp client. Keys are seen only by the Anthropic SDK in the Node process. Set spend limits on your API workspace as a backstop, and rotate the key if you suspect a leak.
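
A minimal sketch of the scrubbing idea, assuming the environment is filtered into a fresh object before it is handed to a child process. The scrubEnv name is hypothetical; the three variable names are the ones listed above.

```typescript
type Env = Record<string, string | undefined>;

const SECRET_VARS = [
  "ANTHROPIC_API_KEY",
  "ANTHROPIC_AUTH_TOKEN",
  "NATIVEAPPTEMPLATE_AGENT_ANTHROPIC_KEY",
];

// Return a copy of the environment without the secret variables.
// The input object is left untouched.
function scrubEnv(env: Env): Env {
  const clean: Env = {};
  for (const name of Object.keys(env)) {
    if (SECRET_VARS.indexOf(name) === -1) {
      clean[name] = env[name];
    }
  }
  return clean;
}

const childEnv = scrubEnv({
  PATH: "/usr/bin",
  ANTHROPIC_API_KEY: "secret",
  ANDROID_SERIAL: "emulator-5554",
});
// childEnv keeps PATH and ANDROID_SERIAL and drops ANTHROPIC_API_KEY
```

In Node, the result would typically be passed as the env option of child_process.spawn, so the child never inherits the keys.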

Project docs

  • docs/SPEC.md — full technical specification
  • ROADMAP.md — where this project is headed, OSS vs hosted, what stays out of scope
  • CLAUDE.md — Claude Code project instructions (read if you're running Claude Code against this repo)

Contributing

Issues and PRs welcome. The repository is stable now (v0.1.x) — no more hackathon-pace rewrites. A CONTRIBUTING.md with detailed guidelines will land alongside v0.2.

For now, the simplest path is: open an issue describing what you're trying to do, and we'll figure out the right shape together before code lands. Bug reports with reproducible commands (and the /tmp/<dir>/tmp/trace/ log) are especially welcome.

License

MIT. See LICENSE.

Acknowledgments


Built solo in Tokyo.