Learning System

November 10, 2025

Automated knowledge capture and curation for Claude Code sessions.

Overview

The learning system automatically captures, categorizes, and organizes insights discovered during development. It helps build an organic knowledge base that grows with your project.

Key Features:

Conversational learning capture during sessions
AI-powered categorization into 6 types
Automatic duplicate detection and merging
Multi-source extraction (session summaries, git commits, inline comments)
Advanced filtering and search
Statistics dashboard and timeline views

Learning Categories

Learnings are automatically categorized into 6 types:

1. Architecture Patterns

Design decisions, patterns used, architectural approaches.

Examples:

"Using Repository Pattern for all database access ensures testability"
"Event-driven architecture chosen for microservice communication"
"CQRS pattern separates read and write models for better scalability"

2. Gotchas

Edge cases, pitfalls, bugs discovered, things that went wrong.

Examples:

"FastAPI middleware order matters for CORS - add_middleware calls must be in reverse order"
"SQLAlchemy lazy loading causes N+1 queries - use joinedload()"
"React useEffect with missing dependencies causes stale closures"

3. Best Practices

Effective approaches identified, recommended patterns.

Examples:

"Always use environment variables for secrets, never hardcode"
"Write integration tests for API endpoints, not just unit tests"
"Use type hints in Python for better IDE support and fewer bugs"

4. Technical Debt

Areas needing improvement, refactoring needed, temporary solutions.

Examples:

"Authentication module needs refactoring - currently too tightly coupled"
"Database migrations not versioned properly, should use Alembic"
"Error handling is inconsistent across API endpoints"

5. Performance Insights

Optimization learnings, performance improvements.

Examples:

"Database query optimization reduced API response time from 2s to 200ms"
"Redis caching for frequently accessed data cut database load by 60%"
"Debouncing search input prevents excessive API calls"

6. Security

Security-related discoveries, vulnerabilities fixed.

Examples:

"SQL injection vulnerability fixed by using parameterized queries"
"JWT tokens should expire after 1 hour to reduce attack window"
"Input validation on API endpoints prevents XSS attacks"

Commands

`/sk:learn` - Capture a Learning

Capture a learning discovered during development.

Usage:

/sk:learn

Claude will ask:

What did you learn?
Which category? (architecture_patterns, gotchas, best_practices, technical_debt, performance_insights, security)
Any tags? (optional, comma-separated)
Any additional context? (optional)

Example:

You: /sk:learn
Claude: What did you learn?
You: FastAPI middleware order matters for CORS - add_middleware calls must be in reverse order
Claude: Which category best fits this learning?
You: gotchas
Claude: Any tags to help find this later?
You: fastapi,cors,middleware
Claude: Any additional context?
You: Discovered while debugging CORS issues in session 5

✓ Learning captured!
  ID: 670b4de7
  Category: gotchas
  Tags: fastapi, cors, middleware

The learning is saved and will be auto-curated.

`/sk:learn-show` - Browse Learnings

View captured learnings with optional filtering.

Usage:

/sk:learn-show [--category CATEGORY] [--tag TAG] [--session SESSION]

Examples:

Show all learnings:

/sk:learn-show

Show only gotchas:

/sk:learn-show --category gotchas

Show learnings tagged with "fastapi":

/sk:learn-show --tag fastapi

Show learnings from session 5:

/sk:learn-show --session 5

Combine filters:

/sk:learn-show --category gotchas --tag fastapi

`/sk:learn-search` - Search Learnings

Full-text search across all learning content, tags, and context.

Usage:

/sk:learn-search <query>

Examples:

/sk:learn-search CORS
/sk:learn-search "middleware order"
/sk:learn-search authentication

`/sk:learn-curate` - Manual Curation

Run the curation process manually to categorize, detect duplicates, and merge similar learnings.

Usage:

/sk:learn-curate [--dry-run]

Dry-run mode (preview only, no changes saved):

/sk:learn-curate --dry-run

Normal mode (save changes):

/sk:learn-curate

What curation does:

Categorizes uncategorized learnings using AI-powered keyword analysis
Detects duplicates using Jaccard and containment similarity algorithms
Merges similar learnings to reduce redundancy
Archives old learnings (older than 50 sessions)
Updates metadata (last_curated timestamp)
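
The archiving step can be pictured with a short sketch. It assumes learnings carry a learned_in value of the form session_NNN (as in the storage example later in this document) and that "older than 50 sessions" is measured against the current session number. The names session_number, archive_old_learnings, max_age, and the archived_from key are hypothetical and not taken from the tool itself.

```python
import re


def session_number(learned_in):
    """Parse 'session_012' into 12; return None if the format is unexpected."""
    match = re.fullmatch(r"session_(\d+)", learned_in or "")
    return int(match.group(1)) if match else None


def archive_old_learnings(data, current_session, max_age=50):
    """Move learnings older than max_age sessions into data['archived'].

    Hypothetical helper that mirrors the documented behaviour.
    Returns the number of learnings archived.
    """
    archived_count = 0
    for category, learnings in list(data["categories"].items()):
        keep = []
        for learning in learnings:
            number = session_number(learning.get("learned_in"))
            if number is not None and current_session - number > max_age:
                # 'archived_from' is an illustrative key, not a documented field
                data["archived"].append({**learning, "archived_from": category})
                archived_count += 1
            else:
                keep.append(learning)
        data["categories"][category] = keep
    return archived_count
```

Learnings whose learned_in value cannot be parsed are left in place rather than archived, which seems the safer default for a sketch like this.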

Automatic Curation

Curation runs automatically every N sessions (configurable in .session/config.json).

Note: The config.json file is automatically created when you run /sk:init to initialize the project.

Configuration:

{
  "curation": {
    "auto_curate": true,
    "frequency": 5,
    "dry_run": false,
    "similarity_threshold": 0.7
  }
}

Options:

auto_curate: Enable/disable automatic curation (default: true)
frequency: Run curation every N sessions (default: 5)
dry_run: Preview mode, don't save changes (default: false)
similarity_threshold: Similarity threshold for duplicate detection (default: 0.7)
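
A minimal sketch of how these options might be read and applied. It assumes that missing keys fall back to the documented defaults and that curation is due whenever the session number is a multiple of frequency. The helper names load_curation_config and curation_due are hypothetical, not part of the tool.

```python
import json
from pathlib import Path

# Defaults as documented in the options list above
DEFAULT_CURATION = {
    "auto_curate": True,
    "frequency": 5,
    "dry_run": False,
    "similarity_threshold": 0.7,
}


def load_curation_config(path=".session/config.json"):
    """Return the curation settings, falling back to the documented defaults."""
    settings = dict(DEFAULT_CURATION)
    config_path = Path(path)
    if config_path.exists():
        loaded = json.loads(config_path.read_text(encoding="utf-8"))
        settings.update(loaded.get("curation", {}))
    return settings


def curation_due(session_number, settings):
    """Assumption: curation fires when the session number is a multiple of frequency."""
    if not settings["auto_curate"] or settings["frequency"] <= 0:
        return False
    return session_number % settings["frequency"] == 0
```

With the defaults, curation_due(10, load_curation_config()) would return True and curation_due(11, ...) would return False.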

Learning Extraction

Learnings are automatically extracted from multiple sources:

1. Session Summaries

Extracts learnings from "Challenges Encountered" and "Learnings Captured" sections in session summary files.

Example:

## Challenges Encountered
- FastAPI middleware order matters for CORS
- SQLAlchemy lazy loading caused N+1 queries

Both bullet points are automatically extracted as learnings.
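
One way to picture this extraction is a small parser that collects bullet items under the two named sections. This is a sketch under the assumption that sections are level-2 headings and learnings are "-" or "*" bullets; extract_from_summary is a hypothetical name.

```python
import re

# Section titles named in the documentation above
LEARNING_SECTIONS = {"Challenges Encountered", "Learnings Captured"}


def extract_from_summary(markdown_text):
    """Collect bullet items found under the learning-related '##' sections."""
    learnings = []
    in_section = False
    for line in markdown_text.splitlines():
        heading = re.match(r"^##\s+(.*\S)\s*$", line)
        if heading:
            # Every level-2 heading either opens or closes a section of interest
            in_section = heading.group(1) in LEARNING_SECTIONS
            continue
        bullet = re.match(r"^\s*[-*]\s+(.*\S)\s*$", line)
        if in_section and bullet:
            learnings.append(bullet.group(1))
    return learnings
```

Bullets under other headings are ignored, so a summary can mix learning sections with ordinary notes.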

2. Git Commit Messages

Extracts LEARNING: annotations from commit messages.

Example:

git commit -m "Fix CORS issue

LEARNING: FastAPI middleware order matters for CORS - add_middleware calls must be in reverse order

🤖 Generated with Claude Code"

The LEARNING annotation is automatically extracted.
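
As an illustration, the annotation can be matched with a single regular expression. The sketch assumes an annotation starts its own line and runs to the end of that line; extract_from_commit is a hypothetical name.

```python
import re

# Assumption: an annotation starts a line and runs to the end of that line
LEARNING_PATTERN = re.compile(r"^[ \t]*LEARNING:[ \t]*(.+?)[ \t]*$", re.MULTILINE)


def extract_from_commit(message):
    """Return every LEARNING: annotation found in one commit message."""
    return LEARNING_PATTERN.findall(message)
```

Commit message bodies to feed into this function could be obtained with git log --format=%B, for example.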

3. Inline Code Comments

Extracts # LEARNING: comments from recently changed files.

Example (Python):

# LEARNING: Always use parameterized queries to prevent SQL injection
cursor.execute("SELECT * FROM users WHERE id = ?", (user_id,))

The learning is automatically extracted from the code comment.
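
A rough sketch of the comment scan, assuming only "#"-style comments are considered and each learning fits on one line. The function name extract_from_source is hypothetical, and the line numbers it returns are an addition for illustration.

```python
import re

# Assumption: only '#'-style comments are scanned, as in the Python example above
COMMENT_PATTERN = re.compile(r"#\s*LEARNING:\s*(.+?)\s*$")


def extract_from_source(source_text):
    """Return (line_number, learning) pairs for each '# LEARNING:' comment."""
    found = []
    for number, line in enumerate(source_text.splitlines(), start=1):
        match = COMMENT_PATTERN.search(line)
        if match:
            found.append((number, match.group(1)))
    return found
```

A plain regular expression like this would also match the marker inside a string literal; a stricter version could use a tokenizer instead.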

Similarity Detection

The system uses two algorithms to detect duplicate learnings:

Jaccard Similarity

Measures word overlap between two learnings. In the formulas below, A and B are the sets of words in each learning.

Formula: similarity = |A ∩ B| / |A ∪ B|

Threshold: 0.6 (configurable)

Example:

Learning A: "FastAPI middleware order matters for CORS"
Learning B: "CORS middleware in FastAPI must be added in reverse order"
Similarity: high (4 shared keywords: fastapi, middleware, order, cors) → Merged

Containment Similarity

Checks whether the words of one learning are largely contained in the other.

Formula: containment = |A ∩ B| / min(|A|, |B|)

Threshold: 0.8

Example:

Learning A: "FastAPI middleware order matters"
Learning B: "FastAPI middleware order matters for CORS"
Containment: 1.0 (A is contained in B) → Merged

Stopword Removal

Common words (the, and, or, is, are, etc.) are removed before comparison to focus on meaningful keywords.
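
The two formulas and the stopword step can be sketched together as follows. The stopword list here is a small illustrative one (the full list is not documented above), and combining the two checks with "or" is an assumption. The thresholds 0.6 and 0.8 are the ones stated in this section.

```python
import re

# Assumption: a tiny illustrative stopword list, not the tool's real one
STOPWORDS = {"the", "and", "or", "is", "are", "a", "an", "for", "in", "of", "to", "be"}


def keywords(text):
    """Lowercase the text, split it into words, and drop stopwords."""
    words = re.findall(r"[a-z0-9]+", text.lower())
    return {word for word in words if word not in STOPWORDS}


def jaccard(a, b):
    """|A ∩ B| / |A ∪ B| over keyword sets."""
    set_a, set_b = keywords(a), keywords(b)
    union = set_a | set_b
    return len(set_a & set_b) / len(union) if union else 0.0


def containment(a, b):
    """|A ∩ B| / min(|A|, |B|) over keyword sets."""
    set_a, set_b = keywords(a), keywords(b)
    smaller = min(len(set_a), len(set_b))
    return len(set_a & set_b) / smaller if smaller else 0.0


def is_duplicate(a, b, jaccard_threshold=0.6, containment_threshold=0.8):
    """Assumption: reaching either threshold marks the pair as a duplicate."""
    return (jaccard(a, b) >= jaccard_threshold
            or containment(a, b) >= containment_threshold)
```

For the containment example above, this sketch gives a containment of 1.0 and a Jaccard score of 0.8. Exact scores for other pairs depend on the stopword list and on how text is split into words.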

Advanced Features

Statistics Dashboard

View learning statistics with sk learn-statistics:

=== Learning Statistics ===

Total learnings: 45

By Category:
----------------------------------------
  Architecture Patterns              12
  Gotchas                           15
  Best Practices                    10
  Technical Debt                     5
  Performance Insights               3
  Security                           0

Top Tags:
----------------------------------------
  fastapi                           20
  python                            15
  database                          10
  authentication                     8
  testing                            7

Sessions with Most Learnings:
----------------------------------------
  Session 5                          8
  Session 12                         7
  Session 8                          6
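
Figures like these can be derived directly from the stored data. The sketch below assumes the learnings.json layout shown under Data Storage; learning_statistics is a hypothetical helper, not the tool's own function.

```python
from collections import Counter


def learning_statistics(data):
    """Summarise a learnings.json-shaped dict: totals, categories, tags, sessions."""
    by_category = {name: len(items) for name, items in data["categories"].items()}
    tags = Counter()
    sessions = Counter()
    for items in data["categories"].values():
        for learning in items:
            tags.update(learning.get("tags", []))
            if learning.get("learned_in"):
                sessions[learning["learned_in"]] += 1
    return {
        "total": sum(by_category.values()),
        "by_category": by_category,
        "top_tags": tags.most_common(5),
        "top_sessions": sessions.most_common(3),
    }
```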

Timeline View

View learning history by session with sk learn-timeline:

=== Learning Timeline (Last 10 Sessions) ===

Session 012: 7 learning(s)
  - Redis caching reduced database load by 60%
  - JWT tokens should expire after 1 hour
  - Input validation prevents XSS attacks
  ... and 4 more

Session 011: 3 learning(s)
  - Database query optimization reduced response time
  - Use joinedload() to prevent N+1 queries
  - Always use environment variables for secrets

Session 010: 5 learning(s)
  - FastAPI middleware order matters for CORS
  - SQLAlchemy lazy loading causes issues
  - Write integration tests for API endpoints
  ... and 2 more

Related Learnings

Find similar learnings using curator.get_related_learnings(learning_id):

from solokit.learning.curator import LearningsCurator

curator = LearningsCurator()
related = curator.get_related_learnings("670b4de7", limit=5)

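for learning in related:
    print(f"- {learning['content']}")
    print(f"  Category: {learning['category']}")
    # Append "%" only when a score is present, so a missing score prints "N/A" rather than "N/A%"
    score = learning.get("similarity_score")
    similarity = f"{score}%" if score is not None else "N/A"
    print(f"  Similarity: {similarity}")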

Workflows

Workflow 1: Capturing Learnings During Development

Work on a feature/fix
Discover something worth remembering
Use /sk:learn to record it
Claude asks questions conversationally
Learning is saved and will be auto-curated

Workflow 2: Browsing Learnings

Use /sk:learn-show to see all learnings
Filter by category: /sk:learn-show --category gotchas
Filter by tag: /sk:learn-show --tag fastapi
Search for specific content: /sk:learn-search CORS

Workflow 3: Automatic Extraction

Complete a session with /sk:end
System auto-extracts learnings from:
- Session summary (Challenges Encountered and Learnings Captured sections)
- Git commit messages (LEARNING: annotations)
- Inline code comments (# LEARNING:)
Extracted learnings are automatically categorized
Duplicates are skipped

Workflow 4: Curation

Capture learnings throughout multiple sessions
Every 5 sessions (the default frequency), auto-curation runs
Curation categorizes, detects duplicates, and merges similar learnings
Manual curation: /sk:learn-curate or /sk:learn-curate --dry-run

Troubleshooting

Auto-Curation Not Running

Check the curation settings in .session/config.json:

{
  "curation": {
    "auto_curate": true,
    "frequency": 5
  }
}

Ensure auto_curate is true and frequency is set.

Duplicates Not Being Merged

The similarity threshold may be too high. Adjust in .session/config.json:

{
  "curation": {
    "similarity_threshold": 0.6
  }
}

Lower values (0.5-0.6) detect more duplicates; higher values (0.7-0.8) are more conservative.

Extraction Not Finding Learnings

Session summaries: Ensure summaries have "## Challenges Encountered" or "## Learnings Captured" sections
Git commits: Use exact format: LEARNING: <your learning text>
Code comments: Use exact format: # LEARNING: <your learning text>

Learnings File Corrupted

The learnings file is at .session/tracking/learnings.json. If it is corrupted, you can reset it. Note that this deletes all captured learnings:

rm .session/tracking/learnings.json
sk learn-curate

This creates a fresh learnings file.
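
Before deleting anything, it may be worth confirming that the file really fails to parse and keeping a copy of it. The following is a sketch only: check_learnings_file and the .bak suffix are choices made for this example, not features of the tool.

```python
import json
import shutil
from pathlib import Path


def check_learnings_file(path=".session/tracking/learnings.json"):
    """Return True if the file parses as JSON; otherwise back it up and return False.

    Hypothetical helper; the '.bak' suffix is an arbitrary choice for this sketch.
    A missing file also returns False, with no backup made.
    """
    file_path = Path(path)
    if not file_path.exists():
        return False
    try:
        json.loads(file_path.read_text(encoding="utf-8"))
        return True
    except (json.JSONDecodeError, UnicodeDecodeError):
        # Keep a copy so the raw contents can still be inspected or repaired
        backup = file_path.with_name(file_path.name + ".bak")
        shutil.copy2(file_path, backup)
        return False
```

If the function returns False for an existing file, the backup sits next to the original and the reset commands above can then be run.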

Data Storage

Learnings are stored in .session/tracking/learnings.json:

{
  "last_curated": "2025-10-13T22:30:00",
  "curator": "session_curator",
  "categories": {
    "gotchas": [
      {
        "id": "670b4de7",
        "content": "FastAPI middleware order matters for CORS",
        "timestamp": "2025-10-13T22:47:07.683518",
        "learned_in": "session_001",
        "tags": ["fastapi", "cors", "middleware"],
        "context": "Discovered while debugging CORS issues"
      }
    ],
    "architecture_patterns": [],
    "best_practices": [],
    "technical_debt": [],
    "performance_insights": [],
    "security": []
  },
  "archived": []
}
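
Because the file is plain JSON grouped by category, filtering and searching it takes little code. The sketch below works on the structure shown above once it has been loaded with json.load. The function names and the choice of a case-insensitive substring match are assumptions for illustration.

```python
def iter_learnings(data):
    """Yield (category, learning) pairs from a learnings.json-shaped dict."""
    for category, items in data.get("categories", {}).items():
        for learning in items:
            yield category, learning


def filter_learnings(data, category=None, tag=None, query=None):
    """Apply optional category, tag, and case-insensitive text filters."""
    results = []
    for cat, learning in iter_learnings(data):
        if category and cat != category:
            continue
        if tag and tag not in learning.get("tags", []):
            continue
        if query:
            # Search content, context, and tags together
            haystack = " ".join(
                [learning.get("content", ""), learning.get("context", "")]
                + learning.get("tags", [])
            ).lower()
            if query.lower() not in haystack:
                continue
        results.append(learning)
    return results
```

For example, filter_learnings(data, category="gotchas", tag="fastapi") would return the single entry from the sample file.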

Integration with Session Workflow

The learning system is fully integrated with the session workflow:

Session Start: Load existing learnings for context
During Session: Capture learnings with /sk:learn
Session End:
- Auto-extract learnings from session summary, git commits, code comments
- Auto-curate every N sessions
- Manual learning capture
Between Sessions: Browse, search, review learnings

Command Line Interface

All learning commands are also available via CLI:

# Add a learning
sk learn-add \
  --content "Your learning here" \
  --category gotchas \
  --tags "tag1,tag2" \
  --session 5

# Show learnings
sk learn-show \
  --category gotchas \
  --tag fastapi \
  --session 5

# Search learnings
sk learn-search "CORS"

# Curate learnings
sk learn-curate
sk learn-curate --dry-run

# Statistics
sk learn-statistics

# Timeline
sk learn-timeline --sessions 10

# Report (legacy)
sk learn-report

Summary

The learning system helps you:

✅ Capture insights discovered during development
✅ Organize knowledge into 6 categories automatically
✅ Find learnings with powerful search and filtering
✅ Avoid duplicate learnings with similarity detection
✅ Extract learnings from multiple sources automatically
✅ Track learning growth over time with statistics and timeline
✅ Build an organic knowledge base that grows with your project

Start capturing learnings today with /sk:learn!