Gemini CLI: Prompt-Driven Sub-Agent Orchestrator

July 26, 2025

This project is a proof-of-concept demonstrating a sub-agent orchestration system built entirely within the Gemini CLI using its native features. It uses a filesystem-as-state architecture, managed by a suite of prompt-driven custom commands, to orchestrate complex, asynchronous tasks performed by specialized AI agents.

Core Concepts

  1. Filesystem-as-State: The entire state of the system (task queue, plans, logs) is stored in structured directories on the filesystem, making it transparent and easily debuggable. There are no external databases or process managers.

  2. Prompt-Driven Commands: The logic for the orchestrator is not written in a traditional programming language. Instead, it's defined in a series of prompts within .toml files, which create new, project-specific commands in the Gemini CLI (e.g., /agents:start).

  3. Asynchronous Agents: Sub-agents are launched as background processes. The orchestrator tracks them via their Process ID (PID) and reconciles their status by checking for a sentinel .done file upon their completion.
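
The asynchronous launch described in item 3 can be sketched in Python. This is a minimal illustration under stated assumptions, not the project's actual prompt logic: the JSON field names (`status`, `pid`), the helper names, and the stand-in child process (which only creates the sentinel file) are hypothetical.

```python
import json
import subprocess
import sys
import tempfile
from pathlib import Path


def stand_in_agent_command(done_file: Path) -> list:
    # Hypothetical stand-in for the real agent invocation: a child Python
    # process whose only job is to create the .done sentinel and exit.
    code = "import pathlib, sys; pathlib.Path(sys.argv[1]).touch()"
    return [sys.executable, "-c", code, str(done_file)]


def launch_in_background(task_file: Path, command: list, log_file: Path) -> subprocess.Popen:
    """Start `command` without waiting for it and record its PID in the task file."""
    task = json.loads(task_file.read_text())
    with log_file.open("w") as log:
        # Popen returns immediately; the child keeps its own copy of the log handle.
        proc = subprocess.Popen(command, stdout=log, stderr=subprocess.STDOUT)
    task["status"] = "running"
    task["pid"] = proc.pid
    task_file.write_text(json.dumps(task, indent=2))
    return proc
```

The caller never blocks on the child. Completion is detected later by looking for the sentinel file rather than by polling the process, which is why the state survives across separate CLI invocations.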

Architecture

  • Orchestrator: A set of custom Gemini CLI commands (/agents:*) that manage the entire lifecycle of agent tasks, from creation to completion.
  • Sub-Agents: Specialized Gemini CLI extensions, each with a unique persona and a constrained set of capabilities (e.g., coder-agent, reviewer-agent).

Directory Structure

The entire system is contained within the .gemini/ directory. This image shows the structure of the agents and commands directories that power the system.

[Image: Project Folder Structure]
  • agents/: Contains the definitions for the sub-agents and the workspace where they operate.
    • tasks/: Contains the JSON state files for each task and .done sentinel files.
    • plans/: Holds Markdown files for agents' long-term planning.
    • logs/: Stores the output logs from each agent's background process.
    • workspace/: A dedicated directory where agents can create and modify files.
  • commands/: Contains the .toml files that define the custom /agents commands.
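
The layout above can be reproduced with a short Python sketch. The function names are hypothetical, and the path-to-command mapping assumes the Gemini CLI convention that subdirectories under commands/ become colon-separated namespaces; the exact file names used by this project are not stated here, so `agents/start.toml` is only an illustrative guess.

```python
import tempfile
from pathlib import Path

# Subdirectories of .gemini/agents/ as listed above.
AGENT_SUBDIRS = ("tasks", "plans", "logs", "workspace")


def scaffold(project_root: Path) -> Path:
    """Create the .gemini/ directory tree and return its path."""
    gemini_dir = project_root / ".gemini"
    for name in AGENT_SUBDIRS:
        (gemini_dir / "agents" / name).mkdir(parents=True, exist_ok=True)
    (gemini_dir / "commands").mkdir(parents=True, exist_ok=True)
    return gemini_dir


def command_name(toml_file: Path, commands_dir: Path) -> str:
    """Map a command file path to its slash command (assumed namespacing rule)."""
    relative = toml_file.relative_to(commands_dir).with_suffix("")
    return "/" + ":".join(relative.parts)
```

Under that assumption, `command_name(commands / "agents" / "start.toml", commands)` yields `/agents:start`.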

Commands

  • /agents:start <agent_name> "<prompt>": Queues a new task by creating a JSON file in the tasks directory.
  • /agents:run: Executes the oldest pending task by launching the corresponding agent as a background process.
  • /agents:status: Reports the status of all tasks. It first reconciles any completed tasks by checking for .done files.
  • /agents:type: Lists the available agent extensions.
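
In the project these commands are implemented as prompts, not code, but their state transitions can be approximated in Python. The sketch below is illustrative only: the JSON field names mirror the columns shown by /agents:status, while the `<task_id>.json` and `<task_id>.done` file naming and all function names are assumptions. The ID format follows the task ID shown in the example workflow.

```python
import json
import tempfile
from datetime import datetime, timezone
from pathlib import Path
from typing import List, Optional


def queue_task(tasks_dir: Path, agent: str, prompt: str, now: datetime) -> dict:
    """/agents:start: write a new pending task as <task_id>.json."""
    utc = now.astimezone(timezone.utc)
    task = {
        "id": utc.strftime("task_%Y%m%dT%H%M%SZ"),
        "agent": agent,
        "status": "pending",
        "created_at": utc.strftime("%Y-%m-%dT%H:%M:%SZ"),
        "pid": None,
        "prompt": prompt,
    }
    (tasks_dir / f"{task['id']}.json").write_text(json.dumps(task, indent=2))
    return task


def load_tasks(tasks_dir: Path) -> List[dict]:
    return [json.loads(p.read_text()) for p in sorted(tasks_dir.glob("*.json"))]


def oldest_pending(tasks_dir: Path) -> Optional[dict]:
    """/agents:run: pick the pending task with the earliest creation time."""
    pending = [t for t in load_tasks(tasks_dir) if t["status"] == "pending"]
    # Fixed-format UTC timestamps sort chronologically as plain strings.
    return min(pending, key=lambda t: t["created_at"], default=None)


def reconcile(tasks_dir: Path) -> List[str]:
    """/agents:status: mark running tasks complete once their sentinel exists."""
    completed = []
    for task in load_tasks(tasks_dir):
        sentinel = tasks_dir / f"{task['id']}.done"
        if task["status"] == "running" and sentinel.exists():
            task["status"] = "complete"
            (tasks_dir / f"{task['id']}.json").write_text(json.dumps(task, indent=2))
            completed.append(task["id"])
    return completed
```

Because every transition is a file write, the queue can be inspected or repaired by hand with ordinary tools, which is the main appeal of the filesystem-as-state design.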

Example Workflow

  1. Queue a Task:

    gemini /agents:start coder-agent "in a folder, use html/css/js (nicely designed) to build an app that looks at github.com/pauldatta and is a one-stop view of the repos and what they have been built for (public repos)"
    

    Output: Task task_20250726T183100Z created for agent 'coder-agent' and is now pending.

  2. Run the Orchestrator:

    gemini /agents:run
    

    Output: Orchestrator started task task_20250726T183100Z (PID: 13539) in the background.

  3. Check the Status (While Running):

    gemini /agents:status
    

    Output:

    Task ID                Agent        Status    Created At            PID    Prompt
    task_20250726T183100Z  coder-agent  running   2025-07-26T18:31:00Z  13539  in a folder, use html/css/js...

  4. Check the Status (After Completion): After the agent is finished, the next run of /agents:status will first reconcile the task and then display the final state.

    gemini /agents:status
    

    Output: Task task_20250726T183100Z has been marked as complete.

    Task ID                Agent        Status    Created At            PID    Prompt
    task_20250726T183100Z  coder-agent  complete  2025-07-26T18:31:00Z  13539  in a folder, use html/css/js...
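
For readers who want to reproduce the status listing from the task files, the following Python sketch renders task records as an aligned table with a shortened prompt. It is an assumption about presentation only: the field names, the function name, and the use of `textwrap.shorten` with a 31-character limit are hypothetical choices, not the orchestrator's documented behavior.

```python
import textwrap

COLUMNS = ("Task ID", "Agent", "Status", "Created At", "PID", "Prompt")
FIELDS = ("id", "agent", "status", "created_at", "pid", "prompt")  # assumed JSON keys


def render_status(tasks, prompt_width=31):
    """Return the tasks as a space-aligned text table, one task per row."""
    table = [list(COLUMNS)]
    for task in tasks:
        cells = [str(task[field]) for field in FIELDS]
        # Cut long prompts at a word boundary and mark the cut with "...".
        cells[-1] = textwrap.shorten(cells[-1], width=prompt_width, placeholder="...")
        table.append(cells)
    # Each column is as wide as its longest cell, header included.
    widths = [max(len(row[i]) for row in table) for i in range(len(COLUMNS))]
    return "\n".join(
        "  ".join(cell.ljust(width) for cell, width in zip(row, widths)).rstrip()
        for row in table
    )
```

Calling `print(render_status(tasks))` on a list of task dictionaries prints one header line followed by one line per task.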

Final Output

The coder-agent successfully creates a web application in the .gemini/agents/workspace/github-repo-viewer directory. Here is a screenshot of the final running application:

[Image: GitHub Repo Viewer Screenshot]


Disclaimer

This project is a proof-of-concept experiment.

  • Inspiration: The core architecture is inspired by Anthropic's documentation on Building a Sub-Agent with Claude.
  • Roadmap: A more robust and official agentic feature is on the Gemini CLI roadmap.
  • Security: This implementation is not secure for production use. It relies on the -y (--yolo) flag, which bypasses important security checks. For any real-world application, you should enable features like checkpointing and sandboxing. For more information, please refer to the official Gemini CLI documentation.