Claude Code Log Testing & Style Guide
May 8, 2026
This directory contains comprehensive testing infrastructure and visual documentation for the Claude Code Log template system.
Test Data (test_data/)
Representative JSONL files covering all message types and edge cases:
Note: After the module split, import paths have changed:
from claude_code_log.parser import load_transcript, extract_text_content
from claude_code_log.html.renderer import generate_html, format_timestamp
from claude_code_log.converter import convert_jsonl_to_html
representative_messages.jsonl
A comprehensive conversation demonstrating:
- User and assistant messages
- Tool use and tool results (success cases)
- Markdown formatting and code blocks
- Summary messages
- Multiple message interactions
edge_cases.jsonl
Edge cases and special scenarios:
- Complex markdown formatting
- Very long text content
- Tool errors and error handling
- System command messages
- Command output parsing
- Special characters and Unicode
- HTML escaping scenarios
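To check which message types a sample file actually covers, a quick tally can help. The sketch below is a minimal, stdlib-only illustration and not part of the project. It assumes each JSONL line is a JSON object with a top-level `type` field, and `count_entry_types` is a hypothetical helper name.

```python
import json
from collections import Counter
from pathlib import Path


def count_entry_types(jsonl_path: Path) -> Counter:
    """Tally the top-level "type" value of each JSON line (assumed field name)."""
    counts: Counter = Counter()
    with jsonl_path.open(encoding="utf-8") as handle:
        for line in handle:
            line = line.strip()
            if not line:
                continue  # tolerate blank lines between entries
            entry = json.loads(line)
            # Entries without the assumed field are grouped under "<missing>".
            counts[entry.get("type", "<missing>")] += 1
    return counts
```

Calling `count_entry_types(Path("test/test_data/edge_cases.jsonl"))` from the repository root would then show at a glance whether a newly added edge case is represented in the file.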
session_b.jsonl
Additional session for testing multi-session handling:
- Different source file content
- Session divider behavior
- Cross-session message ordering
real_projects/ (Integration Test Data)
Real-world JSONL data from open-source Claude Code projects, used for integration testing:
| Project | Size | Files | Purpose |
|---|---|---|---|
| -Users-dain-workspace-JSSoundRecorder | ~528KB | 11 | Small project, quick tests |
| -Users-dain-workspace-coderabbit-review-helper | ~6.5MB | 40 | Empty file edge cases (9 empty files) |
| -Users-dain-workspace-danieldemmel-me-next | ~1.7MB | 11 | Multi-cwd sessions, path conversion |
| -Users-dain-workspace-claude-code-log-sample | ~9MB | 23 | Curated sample with size variety |
These files test:
- Multi-project hierarchy processing with --projects-dir
- Cache operations with realistic data volumes
- Edge cases: Empty files, naming ambiguity, path conversion
- CLI operations with custom projects directory
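The empty-file edge case mentioned above can be confirmed directly against the test data. The following is a small sketch using only `pathlib`. `find_empty_jsonl` is a hypothetical helper, not a function provided by claude_code_log, and it assumes "empty" means a zero-byte `.jsonl` file.

```python
from pathlib import Path


def find_empty_jsonl(projects_dir: Path) -> list[Path]:
    """Return every zero-byte .jsonl file below projects_dir, sorted by path."""
    return sorted(
        path
        for path in projects_dir.rglob("*.jsonl")
        if path.is_file() and path.stat().st_size == 0
    )


if __name__ == "__main__":
    # Path taken from the file structure described in this README.
    for empty in find_empty_jsonl(Path("test/test_data/real_projects")):
        print(empty)
```

Run from the repository root, this lists the files that exercise the empty-file handling path.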
Template Tests (test_template_rendering.py)
Comprehensive unit tests that verify:
Core Functionality
- ✅ Basic HTML structure generation
- ✅ All message types render correctly
- ✅ Session divider logic (only first session shown)
- ✅ Multi-session content combining
- ✅ Empty file handling
Message Type Coverage
- ✅ User messages with markdown
- ✅ Assistant responses
- ✅ Tool use and tool results
- ✅ Error handling for failed tools
- ✅ System command messages
- ✅ Command output parsing
- ✅ Summary messages
Formatting & Safety
- ✅ Timestamp formatting
- ✅ CSS class application
- ✅ HTML escaping for security
- ✅ Unicode and special character support
- ✅ JavaScript markdown setup
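As an illustration of what the HTML escaping and Unicode checks guard against, here is a minimal sketch using the standard library's `html.escape`. It only demonstrates the property being tested. The project's templates may escape content through a different mechanism, and the `untrusted` string is made up for this example.

```python
import html

# Hypothetical message content mixing markup, an ampersand, and non-ASCII text.
untrusted = '<script>alert("x")</script> & café ✅'

# html.escape converts &, <, > and (by default) quotes into entities,
# while leaving Unicode characters untouched.
escaped = html.escape(untrusted)
print(escaped)
```

A rendering test in this spirit would assert that the raw `<script>` tag never appears in the generated HTML while the accented and emoji characters survive unchanged.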
Template Systems
- ✅ Transcript template (individual conversations)
- ✅ Index template (project listings)
- ✅ Project summary statistics
- ✅ Date range filtering display
Visual Style Guide (../scripts/generate_style_guide.py)
Generates comprehensive visual documentation:
Generated Files
- Main Index (index.html) - Overview and navigation
- Transcript Guide (transcript_style_guide.html) - All message types
- Index Guide (index_style_guide.html) - Project listing examples
Coverage
The style guide demonstrates:
- 📝 Message Types: User, assistant, system, summary
- 🛠️ Tool Interactions: Usage, results, errors
- 📏 Text Handling: Long content, wrapping, formatting
- 🌍 Unicode Support: Special characters, emojis, international text
- ⚙️ System Messages: Commands, outputs, parsing
- 🎨 Visual Design: Typography, colors, spacing, responsive layout
Usage
# Generate style guides
uv run python scripts/generate_style_guide.py
# Open in browser
open scripts/style_guide_output/index.html
Running Tests
Test Categories
The project uses a categorized test system to avoid async event loop conflicts between different testing frameworks:
- Unit Tests (no mark): Fast, standalone tests with no external dependencies
- TUI Tests (@pytest.mark.tui): Tests for the Textual-based Terminal User Interface
- Browser Tests (@pytest.mark.browser): Playwright-based tests that run in real browsers
- Integration Tests (@pytest.mark.integration): Tests with realistic JSONL data from test_data/real_projects/
- Snapshot Tests: HTML regression tests using syrupy (run with the unit tests)
Snapshot Testing
Snapshot tests capture the full HTML output and detect unintended regressions. They use syrupy with a custom serializer that normalises dynamic content (library version, tmp paths).
# Run snapshot tests
uv run pytest test/test_snapshot_html.py -v
# Update snapshots after intentional HTML changes
uv run pytest -n0 test/test_snapshot_html.py --snapshot-update
# Review changes before committing
git diff test/__snapshots__/
Snapshot files are stored in test/__snapshots__/test_snapshot_html.ambr and must be committed to version control.
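The normalisation of dynamic content described above can be pictured as a pair of substitutions applied before the HTML is compared with the stored snapshot. This is only a sketch of the idea, not the code in snapshot_serializers.py. The function name `normalise_html`, the placeholder tokens, the assumed `claude-code-log v1.2.3` version text, and the `/tmp/` path prefix are all assumptions made for illustration.

```python
import re

# Assumed shape of the version string embedded in generated HTML.
VERSION_RE = re.compile(r"(claude-code-log v)\d+\.\d+\.\d+")
# Assumed temporary-directory prefix; other platforms use different locations.
TMP_PATH_RE = re.compile(r"/tmp/[^\s\"'<>]+")


def normalise_html(html_text: str) -> str:
    """Replace run-specific values with stable placeholders."""
    text = TMP_PATH_RE.sub("[TMP_PATH]", html_text)
    return VERSION_RE.sub(r"\g<1>[VERSION]", text)
```

With a step like this, two runs that differ only in library version or temporary directory produce identical snapshot text, so a failing snapshot points to a real change in the rendered HTML.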
When to update snapshots:
- Run tests - if they fail, review the diff
- If changes are intentional, run with --snapshot-update
- Commit updated snapshots with your code changes
Running Tests
# Run only unit tests (fast, recommended for development)
just test
# or: uv run pytest -m "not (tui or browser or integration)" -v
# Run TUI tests (isolated event loop)
just test-tui
# or: uv run pytest -m tui -v
# Run browser tests (requires Chromium)
just test-browser
# or: uv run pytest -m browser -v
# Run integration tests with realistic data
just test-integration
# or: uv run pytest -m integration -v
# Run all tests in sequence (separated to avoid conflicts)
just test-all
# Run specific test file
uv run pytest test/test_template_rendering.py -v
# Run specific test method
uv run pytest test/test_template_rendering.py::TestTemplateRendering::test_representative_messages_render -v
# Run tests with coverage
just test-cov
Why Test Categories?
The test suite is categorized because:
- TUI tests use Textual's async event loop (run_test())
- Browser tests use Playwright's internal asyncio
- Integration tests process real-world data (slower, more comprehensive)
- pytest-asyncio manages async test execution
Running all tests together can cause "RuntimeError: This event loop is already running" conflicts. The categorization ensures reliable test execution by isolating different async frameworks.
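The error quoted above can be reproduced with plain `asyncio`, without Textual or Playwright. The sketch below is a simplified stand-in for what happens when one framework tries to drive an event loop that another is already running. It is not taken from the test suite, and `inner` and `outer` are hypothetical names.

```python
import asyncio


async def inner() -> str:
    return "done"


async def outer() -> str:
    # We are already inside a running loop (started by asyncio.run below).
    loop = asyncio.get_running_loop()
    coro = inner()
    try:
        # Asking the same loop to run again is what triggers the conflict.
        loop.run_until_complete(coro)
    except RuntimeError as exc:
        return str(exc)
    finally:
        coro.close()  # avoid a "coroutine was never awaited" warning
    return "no error"


message = asyncio.run(outer())
print(message)
```

Running each category in its own pytest invocation avoids this situation because each framework gets a fresh process and a fresh event loop.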
Test Coverage
Generate detailed coverage reports:
# Run tests with coverage and HTML report
uv run pytest --cov=claude_code_log --cov-report=html --cov-report=term
# View coverage by module
uv run pytest --cov=claude_code_log --cov-report=term-missing
# Open HTML coverage report
open htmlcov/index.html
Current coverage: 78%+ across all modules:
- parser.py: 81% - Data extraction and parsing
- renderer.py: 86% - HTML generation and formatting
- converter.py: 52% - High-level orchestration
- models.py: 89% - Pydantic data models
Manual Testing
# Test with representative data
uv run python -c "
from claude_code_log.converter import convert_jsonl_to_html
from pathlib import Path
html_file = convert_jsonl_to_html(Path('test/test_data/representative_messages.jsonl'))
print(f'Generated: {html_file}')
"
# Test multi-session handling
uv run python -c "
from claude_code_log.converter import convert_jsonl_to_html
from pathlib import Path
html_file = convert_jsonl_to_html(Path('test/test_data/'))
print(f'Generated: {html_file}')
"
Development Workflow
When modifying templates:
- Make Changes to claude_code_log/templates/
- Run Tests to verify functionality
- Generate Style Guide to check visual output
- Review in Browser to ensure design consistency
File Structure
test/
├── README.md # This file
├── conftest.py # Pytest configuration and fixtures
├── snapshot_serializers.py # Custom syrupy serializer for HTML normalisation
├── __snapshots__/ # Syrupy snapshot files
│ └── test_snapshot_html.ambr # HTML output snapshots
├── test_data/ # Test JSONL samples
│ ├── representative_messages.jsonl
│ ├── edge_cases.jsonl
│ ├── session_b.jsonl
│ └── real_projects/ # Realistic multi-project test data (~18MB)
│ ├── -Users-dain-workspace-JSSoundRecorder/
│ ├── -Users-dain-workspace-coderabbit-review-helper/
│ ├── -Users-dain-workspace-danieldemmel-me-next/
│ └── -Users-dain-workspace-claude-code-log-sample/
├── test_snapshot_html.py # HTML output snapshot tests
├── test_integration_realistic.py # Integration tests with real data
├── test_template_rendering.py # Template rendering tests
├── test_template_data.py # Template data structure tests
├── test_template_utils.py # Utility function tests
├── test_message_filtering.py # Message filtering tests
├── test_date_filtering.py # Date filtering tests
└── test_*.py # Additional test modules
scripts/
├── setup_test_data.sh # Copies test data from ~/.claude/projects/
├── generate_style_guide.py # Visual documentation generator
└── style_guide_output/ # Generated style guides
    ├── index.html
    ├── transcript_style_guide.html
    └── index_style_guide.html
htmlcov/ # Coverage reports
├── index.html # Main coverage report
└── *.html # Per-module coverage details
Benefits
This testing infrastructure provides:
- Regression Prevention: Catch template breaking changes with snapshot testing
- Coverage Tracking: 78%+ test coverage with detailed reporting
- Module Testing: Focused tests for parser, renderer, and converter modules
- Integration Testing: Real-world data from open-source projects (~18MB)
- Visual Documentation: See how all message types render
- Development Reference: Example data for testing new features
- Quality Assurance: Verify edge cases and error handling
- Design Consistency: Maintain visual standards across updates
- CLI Testing: Full coverage of --projects-dir and multi-project operations
The combination of unit tests, integration tests, coverage tracking, and visual style guides ensures both functional correctness and design quality across the modular codebase.