Chapter 3: Planning vs Drafting Execution Modes

April 13, 2026

This chapter of Shotgun Tutorial: Spec-Driven Development for Coding Agents compares Shotgun's two execution modes, gives guidance on when to pick each one, and then walks through related source code.

Shotgun exposes two user-facing execution modes with different tradeoffs.

Mode Comparison

Mode     | Behavior                                      | Best For
Planning | step-by-step confirmations and checkpoints    | high-risk or high-complexity work
Drafting | continuous execution with fewer interruptions | well-scoped work where speed matters

Practical Guidance

Use Planning when:

  • requirements are still evolving
  • cross-cutting changes affect many files
  • you need signoff checkpoints for team review

Use Drafting when:

  • the plan is already validated
  • the workflow is repetitive
  • you are optimizing for cycle time
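
The guidance above can be read as a simple decision rule. The sketch below is illustrative only: `TaskProfile`, `choose_mode`, and the `many_files_threshold` default are hypothetical names and values invented for this example, not part of Shotgun. It assumes any single risk signal is enough to prefer Planning.

```python
from dataclasses import dataclass
from enum import Enum


class ExecutionMode(Enum):
    PLANNING = "planning"
    DRAFTING = "drafting"


@dataclass
class TaskProfile:
    """Hypothetical description of a unit of work; not a Shotgun type."""

    requirements_stable: bool  # False while requirements are still evolving
    files_touched: int         # rough size of the change
    needs_signoff: bool        # team review checkpoints are required
    plan_validated: bool       # an approved plan already exists


def choose_mode(task: TaskProfile, many_files_threshold: int = 10) -> ExecutionMode:
    """Prefer Planning if any risk signal is present, otherwise Drafting."""
    risky = (
        not task.requirements_stable
        or task.files_touched >= many_files_threshold
        or task.needs_signoff
        or not task.plan_validated
    )
    return ExecutionMode.PLANNING if risky else ExecutionMode.DRAFTING


# A cross-cutting refactor with open requirements versus a small, approved rename.
refactor = TaskProfile(requirements_stable=False, files_touched=40,
                       needs_signoff=True, plan_validated=False)
rename = TaskProfile(requirements_stable=True, files_touched=2,
                     needs_signoff=False, plan_validated=True)
print(choose_mode(refactor).value)  # planning
print(choose_mode(rename).value)    # drafting
```

Treating the signals as an "any" condition is a deliberately conservative choice; a team that values throughput more could require two or more signals before falling back to Planning.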

Operator Controls

  • mode switching is available in the TUI
  • planner checkpoints help catch drift early
  • drafting reduces manual overhead for mature flows
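
To make the difference in operator involvement concrete, here is a minimal sketch of a step runner. It is not Shotgun's implementation: `run_steps`, the `confirm` callback, and the string mode names are assumptions made for this example. Planning is modeled as a confirmation before each step, and Drafting as an uninterrupted run.

```python
from typing import Callable, Iterable


def run_steps(
    steps: Iterable[str],
    mode: str,
    confirm: Callable[[str], bool] = lambda step: True,
) -> list[str]:
    """Run steps in order; in "planning" mode, ask before each one.

    Returns the steps that were executed. Execution stops at the first
    rejected checkpoint so later steps do not build on a drifted plan.
    """
    if mode not in ("planning", "drafting"):
        raise ValueError(f"unknown mode: {mode!r}")
    executed: list[str] = []
    for step in steps:
        if mode == "planning" and not confirm(step):
            break  # the operator rejected this checkpoint
        executed.append(step)  # stand-in for doing the real work
    return executed


plan = ["write spec", "update schema", "migrate data"]


def approve_until_migration(step: str) -> bool:
    """Example operator who stops the run before the data migration."""
    return step != "migrate data"


print(run_steps(plan, "planning", approve_until_migration))
# ['write spec', 'update schema']
print(run_steps(plan, "drafting", approve_until_migration))
# ['write spec', 'update schema', 'migrate data']
```

In this sketch the same operator callback is ignored in drafting mode, which is the trade described above: fewer interruptions, but no chance to stop before the risky step.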

Summary

You can now choose an execution mode based on risk, ambiguity, and throughput needs.

Next: Chapter 4: Codebase Indexing and Context Retrieval

Source Code Walkthrough

evals/models.py

The TraceRef class in evals/models.py stores the identifiers that point an evaluation result back to its Logfire trace:



class TraceRef(BaseModel):
    """Reference to a Logfire trace for debugging."""

    trace_id: str = Field(..., description="OpenTelemetry trace ID (32 hex chars)")
    span_id: str = Field(..., description="OpenTelemetry span ID (16 hex chars)")
    url: str | None = Field(default=None, description="Logfire UI URL for this trace")


TraceRef matters when a run needs debugging: trace_id and span_id identify the OpenTelemetry trace and span, and the optional url links to the Logfire UI for that trace.
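
The field descriptions on TraceRef state the expected ID shapes (32 and 16 hex characters), but the excerpt does not show any validation of them. The sketch below is one way to check the format with the standard library alone. `is_valid_trace_ref` is a hypothetical helper, and the lowercase-only and non-zero rules are assumptions borrowed from the W3C Trace Context convention rather than from Shotgun.

```python
import re

# 16-byte trace ID and 8-byte span ID, written as lowercase hex.
TRACE_ID_RE = re.compile(r"[0-9a-f]{32}")
SPAN_ID_RE = re.compile(r"[0-9a-f]{16}")


def is_valid_trace_ref(trace_id: str, span_id: str) -> bool:
    """Check length and hex alphabet, and reject all-zero IDs."""
    if not TRACE_ID_RE.fullmatch(trace_id) or not SPAN_ID_RE.fullmatch(span_id):
        return False
    # All-zero identifiers are treated as invalid (assumption, see lead-in).
    return set(trace_id) != {"0"} and set(span_id) != {"0"}


print(is_valid_trace_ref("4bf92f3577b34da6a3ce929d0e0e4736", "00f067aa0ba902b7"))  # True
print(is_valid_trace_ref("0" * 32, "00f067aa0ba902b7"))                            # False
print(is_valid_trace_ref("4bf92f35", "00f067aa0ba902b7"))                          # False
```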

evals/models.py

The EvaluatorResult class in evals/models.py records the outcome of a single deterministic evaluator:



class EvaluatorResult(BaseModel):
    """Result from a deterministic evaluator."""

    evaluator_name: str = Field(..., description="Name of the evaluator")
    passed: bool = Field(..., description="Whether the check passed")
    severity: EvaluatorSeverity = Field(
        ..., description="Severity of failure if failed"
    )
    reasoning: str = Field(..., description="Explanation of the result")
    details: dict[str, list[str]] = Field(
        default_factory=dict,
        description="Additional details (e.g., lists of violations)",
    )


Each EvaluatorResult pairs a pass/fail flag with a severity and a written explanation; details can carry extra information such as lists of violations, keyed by string.

evals/models.py

The RouterDimensionRubric class in evals/models.py, the first of the LLM judge models, defines the rubric for one Router evaluation dimension:



class RouterDimensionRubric(BaseModel):
    """Rubric definition for a single Router evaluation dimension."""

    dimension: RouterDimension = Field(..., description="The dimension being evaluated")
    description: str = Field(..., description="What this dimension measures")
    rubric_text: str = Field(..., description="Full rubric text for the LLM judge")
    weight: float = Field(default=1.0, ge=0.0, le=2.0, description="Weight for scoring")


A rubric names its dimension, describes what that dimension measures, supplies the full rubric text for the LLM judge, and sets a scoring weight between 0.0 and 2.0 (default 1.0).

evals/models.py

The DimensionScoreOutput class in evals/models.py holds the LLM judge's structured output for a single dimension:



class DimensionScoreOutput(BaseModel):
    """Structured output from LLM judge for a single dimension."""

    score: int = Field(..., ge=1, le=5, description="Score on 1-5 Likert scale")
    reasoning: str = Field(..., description="Explanation for the score")
    passed: bool = Field(
        ..., description="Whether the minimum threshold was met (score >= 3)"
    )


class AllDimensionsScoreOutput(BaseModel):
    """Structured output from LLM judge for all dimensions in one call."""

    delegation_rationale: DimensionScoreOutput = Field(
        ..., description="Score for delegation rationale quality"
    )
    context_handling: DimensionScoreOutput = Field(
        ..., description="Score for context handling"
    )
    clarity: DimensionScoreOutput = Field(..., description="Score for clarity")
    relevance: DimensionScoreOutput = Field(..., description="Score for relevance")


class RouterJudgeResult(BaseModel):
    """Complete result from Router quality judge evaluation."""

    dimension_scores: dict[str, DimensionScoreOutput] = Field(
        ..., description="Scores for each evaluated dimension"
    )
    overall_score: float = Field(

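DimensionScoreOutput carries one dimension's score on a 1-5 scale, the judge's reasoning, and a flag for whether the minimum threshold (score >= 3) was met. AllDimensionsScoreOutput groups four of these so that a single judge call can return every dimension.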
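
The excerpt stops before it shows how overall_score is computed, so the following is only a sketch of one plausible approach: a weighted mean that uses the rubric weights. `DimensionScore` and `weighted_overall` are hypothetical stand-ins written with the standard library. Unlike the Pydantic model, where passed is a field returned by the judge, this sketch derives it from the score >= 3 threshold given in the field description.

```python
from dataclasses import dataclass


@dataclass
class DimensionScore:
    """Stdlib stand-in for DimensionScoreOutput (no Pydantic validation)."""

    score: int      # 1-5 Likert scale
    reasoning: str

    @property
    def passed(self) -> bool:
        # Threshold taken from the field description: score >= 3.
        return self.score >= 3


def weighted_overall(
    scores: dict[str, DimensionScore], weights: dict[str, float]
) -> float:
    """Weighted mean of scores; dimensions without a weight default to 1.0."""
    total_weight = sum(weights.get(name, 1.0) for name in scores)
    if total_weight == 0:
        raise ValueError("at least one dimension needs a non-zero weight")
    weighted_sum = sum(
        s.score * weights.get(name, 1.0) for name, s in scores.items()
    )
    return weighted_sum / total_weight


scores = {
    "clarity": DimensionScore(4, "instructions were easy to follow"),
    "relevance": DimensionScore(2, "response drifted from the request"),
}
weights = {"clarity": 2.0, "relevance": 1.0}

print(round(weighted_overall(scores, weights), 2))  # 3.33
print(scores["relevance"].passed)                   # False
```

With these weights, clarity counts twice as much as relevance, so the overall value sits above 3 even though one dimension fails its own threshold. Whether an overall pass should also require every dimension to pass is a separate policy decision that the excerpt does not settle.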

How These Components Connect

flowchart TD
    A[TraceRef]
    B[EvaluatorResult]
    C[RouterDimensionRubric]
    D[DimensionScoreOutput]
    E[AllDimensionsScoreOutput]
    A --> B
    B --> C
    C --> D
    D --> E