Repowise Computed Glossary

June 19, 2026

This glossary describes the data Repowise computes while indexing, analyzing, generating, serving, and exporting a repository. It is based on the code paths in packages/core, packages/server, and packages/cli, not only on README files.

Use this as the vocabulary map for wiki pages, graph records, risk signals, workspace overlays, MCP responses, and CLI output.

Quick Map

| Area | Main code paths | What gets computed |
| --- | --- | --- |
| Traversal and parsing | packages/core/src/repowise/core/ingestion/traverser.py, packages/core/src/repowise/core/ingestion/parser.py, packages/core/src/repowise/core/ingestion/models.py | Files, languages, entry points, symbols, imports, exports, calls, inheritance, parse errors, content hashes |
| Graph construction | packages/core/src/repowise/core/ingestion/graph.py, call_resolver.py, heritage_resolver.py, framework_edges.py, dynamic_hints/ | File and symbol nodes, import/call/heritage/framework/dynamic/co-change edges, centrality, SCCs, communities, execution flows |
| Git intelligence | packages/core/src/repowise/core/ingestion/git_indexer.py | Churn, ownership, hotspots, bus factor, co-change partners, significant commits, temporal scores, rename and merge signals |
| Analysis | packages/core/src/repowise/core/analysis/ | Dead-code findings, decision records, decision staleness, security findings, PR blast radius, execution flows, communities |
| Generation | packages/core/src/repowise/core/generation/ | Wiki page contexts, page types, source hashes, summaries, freshness, confidence decay, RAG context, job checkpoints, reports, costs |
| Workspace intelligence | packages/core/src/repowise/core/workspace/ | Workspace repo scan, cross-repo co-changes, package dependencies, API contracts, contract links, workspace CLAUDE.md data |
| Persistence and search | packages/core/src/repowise/core/persistence/, Alembic migrations | ORM rows, FTS rows, vector records, answer cache, cost rows, graph rows |
| API, MCP, CLI | packages/server/src/repowise/server/, packages/cli/src/repowise/cli/ | Dashboard schemas, MCP tool payloads, status tables, doctor checks, exports, costs, augment hook context |

Traversal And Repository Structure

| Term | Definition | Computed by | Example |
| --- | --- | --- | --- |
| Includable source file | A file that survives ignore rules, blocked patterns, size limit, binary detection, generated-file detection, and language detection. | FileTraverser._build_file_info() | packages/core/src/repowise/core/ingestion/parser.py |
| FileInfo | Per-file metadata used by the parser and graph builder. | FileTraverser.traverse() | {path: "src/app.py", language: "python", is_test: false, is_entry_point: true} |
| Language tag | Canonical language value from file extension, special filename, or shebang. | ingestion/models.py, traverser.py, languages/registry.py | python, typescript, go, terraform, openapi, unknown |
| Test file flag | Whether a file looks like a test/spec/fixture file. | FileTraverser._build_file_info() and community/test-gap helpers | tests/test_auth.py -> is_test=true |
| Config file flag | Whether a file is classified as configuration. | FileTraverser._build_file_info() | pyproject.toml -> is_config=true |
| API contract flag | Whether a file is an API contract format. | FileTraverser._build_file_info() | openapi.yaml -> is_api_contract=true |
| Entry point flag | Whether a filename or language-specific entry pattern marks a file as a starting point. | FileTraverser._build_file_info() | main.py, server.ts, Dockerfile depending on rules |
| Traversal stats | Counts of included files and skip reasons. | TraversalStats in traverser.py | {included: 240, skipped_binary: 3, skipped_generated: 12} |
| Package info | A package/workspace detected from manifests near the repo root. | FileTraverser._detect_monorepo() | {name: "core", path: "packages/core", manifest_file: "pyproject.toml"} |
| Repo structure | High-level structure summary used by overview generation. | FileTraverser.get_repo_structure() | {is_monorepo: true, total_files: 820, entry_points: ["packages/cli/src/.../main.py"]} |
| Language distribution | Fraction of included files by language. | get_repo_structure() | {"python": 0.72, "typescript": 0.18, "markdown": 0.10} |
| Estimated LOC | Fast line-count estimate from file sizes, not exact source line counting. | get_repo_structure() | total_loc = sum(size_bytes // 40) |
| Content hash | SHA-256 of raw file bytes. | compute_content_hash() in ingestion/models.py | 3f786850e387550fdab836ed7e6dc881de23001b... |
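
The three purely arithmetic entries above (language distribution, estimated LOC, content hash) can be sketched in a few lines. This is an illustration only: the function names and signatures below are hypothetical, and only the `size_bytes // 40` divisor and the use of SHA-256 over raw bytes come from the table.

```python
import hashlib
from collections import Counter

BYTES_PER_LINE = 40  # divisor quoted in the "Estimated LOC" row


def estimate_loc(file_sizes: list[int]) -> int:
    """Approximate total lines of code from file sizes in bytes."""
    return sum(size // BYTES_PER_LINE for size in file_sizes)


def content_hash(raw: bytes) -> str:
    """SHA-256 hex digest of the raw file bytes."""
    return hashlib.sha256(raw).hexdigest()


def language_distribution(languages: list[str]) -> dict[str, float]:
    """Fraction of files per language tag (hypothetical helper)."""
    counts = Counter(languages)
    total = sum(counts.values())
    return {lang: n / total for lang, n in counts.items()} if total else {}
```

Because the LOC figure is derived from byte counts, it is cheap to compute but should be read as an order-of-magnitude estimate rather than an exact count.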

Parsing, Symbols, Imports, Calls

| Term | Definition | Computed by | Example |
| --- | --- | --- | --- |
| ParsedFile | Full parse result for one file: file metadata, symbols, imports, exports, calls, heritage, docstring, parse errors, content hash. | ASTParser.parse_file() | ParsedFile(symbols=[...], imports=[...], calls=[...]) |
| Symbol | A function, class, method, interface, enum, constant, type alias, module, macro, variable, etc. | ASTParser._extract_symbols() | src/app.py::create_app |
| Symbol ID | Stable ID derived from path and name, including parent class for methods. | ASTParser._extract_symbols() | src/models.py::User::save |
| Qualified name | Dot-form symbol name derived from path and parent. | _build_qualified_name() | src.models.User.save |
| Symbol kind | Canonical symbol type. | LanguageConfig.symbol_node_types plus refiners | function, class, method, interface, struct, trait |
| Signature | Compact declaration text. | build_signature() via parser extractors | def create_app(config: Config) -> FastAPI |
| Symbol docstring | Human text attached to a symbol, when extractable. | extract_symbol_docstring() | "Create and configure the API app." |
| Module docstring | File-level docstring. | extract_module_docstring() | "Command-line entry points." |
| Visibility | Public/private/protected/internal classification. | Language-specific visibility helpers | _helper -> private, UserService -> public |
| Async flag | Whether a symbol is async. | _is_async_node() | async def fetch() -> is_async=true |
| Complexity estimate | Symbol complexity field, persisted to symbols. | Parser/model pipeline; defaults to 1 unless language extraction enriches it | complexity_estimate: 3 |
| Decorators | Decorator/modifier strings captured with a symbol. | ASTParser._extract_symbols() | ["@router.get('/users')"] |
| Import | Raw import statement plus normalized module path and imported names. | ASTParser._extract_imports() | {raw_statement: "from .db import Session", module_path: ".db", imported_names: ["Session"]} |
| Named binding | Alias-aware import binding. | extract_import_bindings() | {local_name: "np", exported_name: null, is_module_alias: true} |
| Resolved import | Import whose module path was matched to a repo file. | GraphBuilder.build() through resolve_import() | from .models import User -> src/models.py |
| Export list | Public top-level symbol names exported by a file. | ASTParser._derive_exports() | ["create_app", "Settings"] |
| Call site | Raw function or method call extracted from the AST. | ASTParser._extract_calls() | {target_name: "save", receiver_name: "user", line: 42, argument_count: 1} |
| Enclosing caller symbol | The symbol that contains a call site. | _find_enclosing_symbol() | src/app.py::main |
| Heritage relation | Raw inheritance or implementation relationship. | extract_heritage() | OrderController extends BaseController |
| Parse error | Non-fatal syntax/tree-sitter error description. | _collect_error_nodes() | Parse error at line 17 |
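
The Symbol ID and Qualified name rows describe two renderings of the same path, parent, and name triple. A minimal sketch that reproduces the two examples in the table is shown below; the helper names and signatures are hypothetical and are not taken from `ASTParser`.

```python
from pathlib import PurePosixPath
from typing import Optional


def symbol_id(path: str, name: str, parent: Optional[str] = None) -> str:
    """Join path, optional parent class, and name with '::'."""
    parts = [path] + ([parent] if parent else []) + [name]
    return "::".join(parts)


def qualified_name(path: str, name: str, parent: Optional[str] = None) -> str:
    """Dot-form name: path without its suffix, then parent, then name."""
    module_parts = list(PurePosixPath(path).with_suffix("").parts)
    return ".".join(module_parts + ([parent] if parent else []) + [name])
```

The `::` form keeps the file path intact, which makes it usable as a graph node key, while the dotted form reads like an import path.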

Graph Entities And Edges

| Term | Definition | Computed by | Example |
| --- | --- | --- | --- |
| Dependency graph | Directed NetworkX graph containing file nodes, symbol nodes, and edge metadata. | GraphBuilder | nx.DiGraph with nodes src/app.py, src/app.py::main |
| File node | Graph node for a source file. | GraphBuilder.add_file() | {node_type: "file", language: "python", symbol_count: 8} |
| Symbol node | Graph node for an extracted symbol. | GraphBuilder.add_file() | {node_type: "symbol", kind: "function", name: "main"} |
| External node | Node for third-party or unresolvable dependencies. | Import resolution paths | external:react |
| Synthetic module symbol | Symbol node for top-level calls in a file. | GraphBuilder.add_file() | src/app.py::__module__ |
| defines edge | File-to-symbol containment. | GraphBuilder.add_file() | src/app.py -> src/app.py::main |
| imports edge | File-to-file import relationship. | GraphBuilder.build() | src/app.py -> src/settings.py |
| imported_names edge payload | Names imported along an import edge. | GraphBuilder.build() | ["Settings", "load_config"] |
| has_method edge | Class-to-method containment. | GraphBuilder.add_file() | src/models.py::User -> src/models.py::User::save |
| calls edge | Symbol-to-symbol call relationship. | CallResolver, then GraphBuilder._resolve_calls() | src/app.py::main -> src/db.py::connect |
| Call confidence | Confidence that a call edge points to the right callee. | CallResolver | 0.95 same-file, 0.90 import binding, 0.50 global unique |
| extends edge | Class/struct inheritance edge. | HeritageResolver | UserView -> BaseView |
| implements edge | Interface/trait implementation edge. | HeritageResolver | UserRepository -> Repository |
| Heritage confidence | Confidence that inheritance/implementation resolved correctly. | HeritageResolver | 0.95 same-file, 0.90 imported, 0.50 global unique |
| framework edge | Synthetic edge from framework conventions. | framework_edges.py | urls.py -> views.py, app.py -> routers/users.py |
| Dynamic edge | Edge inferred from runtime/dynamic patterns. | dynamic_hints/* and GraphBuilder.add_dynamic_edges() | {edge_type: "dynamic_imports", hint_source: "django", weight: 1.0} |
| co_changes edge | File-to-file historical coupling edge. | GraphBuilder.add_co_change_edges() from git metadata | src/a.py -> src/b.py with weight: 4.2 |
| Stem map | Import-stem to candidate file path lookup used for import resolution. | GraphBuilder._build_stem_map() | {"models": ["src/models.py", "tests/models.py"]} |
| File subgraph | File-only graph used for PageRank and betweenness. | GraphBuilder.file_subgraph() | All file/external nodes, excluding co_changes edges |
| PageRank | File centrality in the import graph. | GraphBuilder.pagerank() | 0.01842 |
| Betweenness | How often a file sits on shortest paths. | GraphBuilder.betweenness_centrality() | 0.0067 |
| SCC | Strongly connected component, used to detect dependency cycles. | GraphBuilder.strongly_connected_components() | {"src/a.py", "src/b.py"} |
| SCC page group | Non-singleton SCC that gets a cycle page. | PageGenerator.generate_all() | scc-3 |
| Graph JSON | Node-link serialization of the graph. | GraphBuilder.to_json() | {"directed": true, "nodes": [...], "links": [...]} |
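
The Call confidence row lists three tiers: same-file, import binding, and global unique. One way to picture how such tiers could be applied in order is sketched below. The function, its arguments, and the shape of the lookup tables are assumptions made for illustration; only the three confidence values come from the table, and the real `CallResolver` may weigh additional signals.

```python
# Confidence tiers quoted in the "Call confidence" row.
SAME_FILE, IMPORT_BINDING, GLOBAL_UNIQUE = 0.95, 0.90, 0.50


def resolve_call(target, caller_file, symbols_by_file, imports_by_file):
    """Return (symbol_id, confidence) for a call target, or None if unresolved."""
    # Tier 1: the callee is defined in the caller's own file.
    if target in symbols_by_file.get(caller_file, set()):
        return f"{caller_file}::{target}", SAME_FILE
    # Tier 2: the name was imported; the binding maps local name -> defining file.
    source = imports_by_file.get(caller_file, {}).get(target)
    if source is not None:
        return f"{source}::{target}", IMPORT_BINDING
    # Tier 3: exactly one file in the repo defines a symbol with this name.
    owners = [path for path, names in symbols_by_file.items() if target in names]
    if len(owners) == 1:
        return f"{owners[0]}::{target}", GLOBAL_UNIQUE
    return None
```

Ordering the checks from most to least specific means an ambiguous global name never overrides a same-file or imported definition.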

Communities And Execution Flows

| Term | Definition | Computed by | Example |
| --- | --- | --- | --- |
| File community | Cluster of related production files, with tests assigned to their most-related production community. | detect_file_communities() | community_id: 2 |
| Symbol community | Cluster of symbol nodes based on call and heritage edges. | detect_symbol_communities() | symbol_community_id: 5 |
| Community algorithm | Partition algorithm used. | communities._partition() | leiden, louvain, none, failed |
| Oversized community split | Second partition pass for communities larger than a graph fraction. | _split_oversized() | A 300-file cluster split into smaller clusters |
| Community label | Human label derived from non-generic path segments or filename keywords. | _heuristic_label() | api/routes, auth, payments |
| Community cohesion | Ratio of actual intra-community edges to possible edges. | _cohesion_score() | 0.2143 |
| Dominant language | Most common language among community members. | _dominant_language() | python |
| Neighboring community | Adjacent community from graph edges, surfaced by MCP/API. | tool_community.py, graph routers | {community_id: 4, edge_count: 9} |
| Entry point score | 0 to 1 score for a function/method as an execution start. | _score_entry_point() | 0.735 for main() |
| Entry point score signals | Weighted fan-out, low in-degree, visibility, name pattern, and file entry flag. | _score_entry_point() | public main() with many calls scores high |
| Execution flow | BFS trace following high-confidence call edges from an entry point. | trace_execution_flows() | main -> load_config -> connect_db |
| Cross-community flow | Execution flow that visits more than one community. | _bfs_trace() | communities_visited: [0, 3] |
| Flow depth | Number of call hops in a traced flow. | _bfs_trace() | depth: 4 |
| Flow deduplication | Keeps the longest flow per shared first-three-node prefix. | _deduplicate_flows() | Two main -> route -> handler traces collapse to one |
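
The Flow deduplication rule (keep the longest flow per shared first-three-node prefix) is compact enough to sketch directly. The function below is a hypothetical stand-in, not the project's `_deduplicate_flows()`; it assumes a flow is simply a list of symbol names and that ties keep the first flow seen.

```python
def deduplicate_flows(flows: list[list[str]]) -> list[list[str]]:
    """Keep the longest flow for each distinct prefix of up to three nodes."""
    best: dict[tuple[str, ...], list[str]] = {}
    for flow in flows:
        key = tuple(flow[:3])  # shared first-three-node prefix
        # Replace only when strictly longer, so ties keep the earlier flow.
        if key not in best or len(flow) > len(best[key]):
            best[key] = flow
    return list(best.values())
```

For example, two traces that both start with `main -> route -> handler` collapse into the longer of the two, while a trace starting `main -> load_config` is kept separately.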

Git Intelligence

| Term | Definition | Computed by | Example |
| --- | --- | --- | --- |
| Git metadata row | Per-file history, ownership, churn, and coupling record. | GitIndexer.index_repo() and _index_file() | One git_metadata row for src/app.py |
| Commit counts | Total, 90-day, and 30-day commit volumes. | _index_file() | {commit_count_total: 87, commit_count_90d: 12, commit_count_30d: 3} |
| Commit count capped | Whether the history reached the configured commit limit. | _index_file() | true when len(commits) >= 500 |
| First/last commit timestamps | Oldest and newest commit timestamps for a file. | _index_file() | first_commit_at: 2024-05-03T10:00:00Z |
| File age days | Days since first commit. | _index_file() | age_days: 455 |
| Primary owner | Dominant owner by blame when available, otherwise by commit count. | _get_blame_ownership() and _index_file() | {name: "Asha", email: "asha@example.com", pct: 0.64} |
| Top authors | Top five authors by commit count. | _index_file() | [{name: "Asha", commit_count: 20}] |
| Recent owner | Dominant committer in the last 90 days. | _index_file() | recent_owner_name: "Sam" |
| Contributor count | Number of distinct authors. | _index_file() | contributor_count: 6 |
| Bus factor | Number of contributors needed to account for 80 percent of commits. | _index_file() | bus_factor: 2 |
| Significant commits | Filtered, non-noise commit messages useful for decisions and risk. | _is_significant_commit() | [{sha: "a1b2c3d4", message: "migrate auth to JWT"}] |
| PR number | PR/MR number extracted from significant commit messages. | _PR_NUMBER_RE in git_indexer.py | pr_number: 128 |
| Commit categories | Message classification counts. | _COMMIT_CATEGORIES in git_indexer.py | {"feature": 4, "fix": 11, "refactor": 2} |
| Lines added/deleted 90d | Recent churn by numstat. | _index_file() | {lines_added_90d: 340, lines_deleted_90d: 87} |
| Average commit size | (lines_added_90d + lines_deleted_90d) / commit_count_90d. | _index_file() | 35.6 |
| Merge commit count 90d | Number of merge commits touching the file recently. | _index_file() | merge_commit_count_90d: 2 |
| Original path | Earliest path found through rename-follow history. | _detect_original_path() | legacy/auth/session.py |
| Temporal hotspot score | Exponentially decayed churn score with 180-day half-life. | _index_file() | 2.43 |
| Churn percentile | Rank percentile among indexed files by temporal hotspot score, with 90-day commits as tiebreak. | _compute_percentiles() | 0.88 |
| Hotspot flag | Top churn file: percentile >= 0.75 and has recent commits. | _compute_percentiles() | is_hotspot: true |
| Stable file flag | File with more than 10 total commits and no recent 90-day commits. | _index_file() | is_stable: true |
| Co-change partner | File historically changed in the same commits, with temporal decay. | _compute_co_changes() | {file_path: "src/schema.py", co_change_count: 3.72, last_co_change: "2026-04-14"} |
| Agent provenance (commit) | Which coding agent (if any) authored a commit, from local-git channels only (identity fields, message footers, co-author trailers); tier 1 = near-autonomous bot account, 2 = human-driven agent, 3 = assisted. | agent_provenance.AgentProvenanceClassifier.classify() | {agent_name: "claude", agent_autonomy_tier: 2, agent_channel: "message_footer", agent_confidence: "high"} |
| Agent-authored share (file) | Fraction of a file's indexed commits that are agent-attributed, with per-tier counts. | _index_file() | {agent_authored_pct: 0.42, agent_commit_count: 21, agent_tier_counts: {"2": 18, "3": 3}} |
| Git index summary | Repo-level indexing result. | GitIndexSummary | {files_indexed: 420, hotspots: 38, stable_files: 71, duration_seconds: 12.4} |
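
Three of the definitions above are small formulas: bus factor, average commit size, and the temporal hotspot score. The sketch below shows one plausible reading of each. The function names are hypothetical, and the hotspot version assumes every commit contributes a weight of 1 before decay; the table does not say whether the real score also weights by lines changed.

```python
def bus_factor(commits_by_author: dict[str, int], share: float = 0.8) -> int:
    """Smallest number of top authors whose commits cover `share` of the total."""
    total = sum(commits_by_author.values())
    if total == 0:
        return 0
    covered = 0
    ranked = sorted(commits_by_author.values(), reverse=True)
    for needed, count in enumerate(ranked, start=1):
        covered += count
        if covered >= share * total:
            return needed
    return len(ranked)


def average_commit_size(lines_added: int, lines_deleted: int, commits: int) -> float:
    """(lines_added_90d + lines_deleted_90d) / commit_count_90d, or 0.0 with no commits."""
    return (lines_added + lines_deleted) / commits if commits else 0.0


def temporal_hotspot_score(commit_ages_days: list[float], half_life_days: float = 180.0) -> float:
    """Sum of per-commit weights that halve every `half_life_days` (assumed weight 1)."""
    return sum(0.5 ** (age / half_life_days) for age in commit_ages_days)
```

With the table's own numbers, 340 lines added and 87 deleted over 12 commits gives an average commit size of about 35.6, matching the example value.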

Generated Wiki Pages

| Term | Definition | Computed by | Example |
| --- | --- | --- | --- |
| Page type | Kind of generated documentation page. | PageType in generation/models.py | file_page, module_page, repo_overview |
| Generation level | Ordered generation tier for page dependencies. | GENERATION_LEVELS | api_contract: 0, file_page: 2, repo_overview: 6 |
| Generated page | Markdown wiki page plus metadata and token counts. | GeneratedPage and PageGenerator._build_generated_page() | {page_id: "file_page:src/app.py", title: "File: src/app.py"} |
| Page ID | Deterministic natural key. | compute_page_id() | symbol_spotlight:src/app.py::create_app |
| Source hash | SHA-256 of rendered prompt/source context for freshness comparisons. | compute_source_hash() | 64-character hex |
| Page summary | Deterministic first prose paragraph or overview excerpt. | PageGenerator._extract_summary() | "This file wires the CLI command group and registers subcommands." |
| Freshness status | Whether a page still matches current source and age thresholds. | compute_freshness() | fresh, stale, expired |
| Confidence decay | Linear decay from 1.0 to 0.0 over expiry days. | decay_confidence() | 0.77 after part of the expiry window |
| Git-adjusted confidence decay | Multiplier adjusted by hotspot/stable state and commit message intent. | compute_confidence_decay_with_git() | Direct refactor on hotspot decays faster |
| Prompt cache key | SHA-256 of model, language, page type, and prompt. | PageGenerator._compute_cache_key() | 9e107d9d372bb6826bd81d3542a419d6... |
| Cached tokens | Tokens served from provider cache. | Provider response, persisted on pages and report | cached_tokens: 12000 |
| Hallucination warning | LLM output mentions symbol-like backticks not found in parsed symbols. | _validate_symbol_references() | Unknown symbol: "run_worker" |
| Generation report | Run summary by page type, tokens, stale pages, dead-code count, decision count, warnings, elapsed time. | GenerationReport.from_pages() | {pages_by_type: {"file_page": 45}, total_input_tokens: 980000} |
| Estimated generation cost | Token estimate using USD per 1M-token rates. | GenerationReport.estimated_cost_usd() and CLI cost_estimator.py | $2.3400 |
| Generation job checkpoint | JSON state for resumable generation. | JobSystem | {status: "running", completed_pages: 12, current_level: 2} |
| Generation status | Job lifecycle state. | JobSystem and GenerationJob ORM | pending, running, completed, failed, paused |
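
Page ID, Source hash, and Confidence decay are all deterministic functions of their inputs. The sketch below illustrates them under stated assumptions: the helper names are deliberately different from the project's `compute_page_id()`, `compute_source_hash()`, and `decay_confidence()`, whose real signatures are not shown in this glossary, and the page-ID format is inferred from the two examples in the table.

```python
import hashlib


def page_id(page_type: str, target: str) -> str:
    """Natural key in the '<page_type>:<target>' form seen in the examples."""
    return f"{page_type}:{target}"


def source_hash(rendered_prompt: str) -> str:
    """SHA-256 hex digest (64 characters) of the rendered prompt text."""
    return hashlib.sha256(rendered_prompt.encode("utf-8")).hexdigest()


def linear_decay(age_days: float, expiry_days: float) -> float:
    """Confidence falling linearly from 1.0 at age 0 to 0.0 at expiry."""
    if expiry_days <= 0:
        return 0.0
    return max(0.0, min(1.0, 1.0 - age_days / expiry_days))
```

A page whose stored source hash no longer matches the hash of the freshly rendered context has drifted from its source, independent of how old it is.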

Page Contexts

| Term | Definition | Computed by | Example |
| --- | --- | --- | --- |
| File page context | Template data for one important source file. | ContextAssembler.assemble_file_page() | {file_path, symbols, imports, dependencies, pagerank_score} |
| Symbol spotlight context | Template data for a top public symbol. | assemble_symbol_spotlight() | create_app with signature, source body, callers |
| Module page context | Aggregate context for top-level directory/module. | assemble_module_page() | {module_path: "packages/core", total_symbols: 780} |
| SCC page context | Context for a circular dependency cycle. | assemble_scc_page() | cycle_description: "Circular dependency cycle: a.py -> b.py" |
| Repo overview context | Whole-repo summary context. | assemble_repo_overview() | language_distribution, top_files_by_pagerank, circular_dependency_count |
| Architecture diagram context | Top PageRank nodes, selected edges, communities, SCC groups. | assemble_architecture_diagram() | Mermaid graph inputs for 50 nodes and 200 edges |
| API contract context | Raw API contract plus endpoint/schema hints. | assemble_api_contract() | endpoints: ["GET /users"], schemas: ["User"] |
| Infra page context | Raw infra file plus target names. | assemble_infra_page() | Dockerfile, Makefile, terraform files |
| Diff summary context | Changed files, symbol diffs, affected pages, trigger commit/diff. | assemble_diff_summary() | {added_files: ["src/new.py"], affected_page_ids: [...]} |
| Cross-package context | Monorepo boundary summary between packages. | assemble_cross_package() | {source_package: "cli", target_package: "core", coupling_strength: 5} |
| Dependency summaries | Summaries of already-generated dependency pages. | assemble_file_page() with page_summaries | { "src/db.py": "Database access layer..." } |
| RAG context | Snippets from vector search for related generated pages. | _generate_file_page_from_ctx() | ["[file_page:src/schema.py]\nDefines API schema..."] |
| Token estimate | len(text) // 4 heuristic. | ContextAssembler._estimate_tokens() | 3200 |
| Structural summary mode | Large-file outline instead of raw source snippet. | _build_structural_summary() | [Large file - structural summary mode] |
| Significant file | File selected for its own file_page. | _is_significant_file() | Entry point, top PageRank, bridge file, package __init__.py, or test with symbols |
| Top symbol selection | Public symbols selected by their file PageRank and percentile budget. | PageGenerator.generate_all() | Top 10 percent of public symbols, capped by page budget |
| Page budget | Hard cap max(50, int(num_files * max_pages_pct)). | PageGenerator.generate_all() | 800 files with 10 percent cap -> 80-page budget |
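
The Token estimate and Page budget rows give their formulas outright, so they translate directly into code. The function names below are illustrative; the expressions are the ones quoted in the table.

```python
def estimate_tokens(text: str) -> int:
    """len(text) // 4 heuristic: roughly four characters per token."""
    return len(text) // 4


def page_budget(num_files: int, max_pages_pct: float) -> int:
    """max(50, int(num_files * max_pages_pct)), as quoted in the table."""
    return max(50, int(num_files * max_pages_pct))
```

Note that the `max(50, ...)` term means small repositories still get a budget of 50 pages: with a 10 percent setting, the percentage only takes over once the repository exceeds 500 files.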

Dead Code

| Term | Definition | Computed by | Example |
| --- | --- | --- | --- |
| Dead-code finding | A graph/git finding persisted to dead_code_findings. | DeadCodeAnalyzer | {kind: "unused_export", file_path: "src/api.py", confidence: 0.7} |
| Unreachable file | File with no incoming imports, not an entry point/test/config/contract/whitelisted file. | _detect_unreachable_files() | src/legacy_adapter.py |
| Unused export | Public symbol in an imported file that no importer names. | _detect_unused_exports() | symbol_name: "OldClient" |
| Unused internal | Private/internal symbol with no incoming calls edges. | _detect_unused_internals() | _parse_legacy_token |
| Zombie package | Monorepo top-level package with no external package importers. | _detect_zombie_packages() | packages/old-sdk |
| Dead-code confidence | Heuristic certainty based on age, recent commits, importers, dynamic imports, and deprecation hints. | DeadCodeAnalyzer | 1.0 for year-old unreachable file |
| Safe-to-delete flag | Whether confidence passes delete threshold and dynamic patterns do not block deletion. | _make_unreachable_finding() and other passes | safe_to_delete: true |
| Dead-code evidence | Human-readable reasons for the finding. | DeadCodeAnalyzer | ["in_degree=0 (no files import this)", "No commits in last 90 days"] |
| Estimated deletable lines | Sum of line estimates for safe findings. | DeadCodeAnalyzer.analyze() | deletable_lines: 420 |
| Confidence summary | Counts of high, medium, low confidence findings. | DeadCodeAnalyzer.analyze() | {"high": 12, "medium": 8, "low": 0} |
| Finding status | User triage status persisted in DB. | DeadCodeFinding.status | open, acknowledged, resolved, false_positive |

Decisions And Governance

| Term | Definition | Computed by | Example |
| --- | --- | --- | --- |
| Decision record | ADR-like row from code comments, git, docs, or CLI/manual entry. | DecisionExtractor, CRUD, CLI | {title: "Use Redis for sessions", status: "active"} |
| Inline marker decision | Decision extracted from comments such as WHY:, DECISION:, TRADEOFF:, ADR:. | scan_inline_markers() | # DECISION: cache auth sessions in Redis |
| Git archaeology decision | LLM-structured decision inferred from significant commit messages with decision keywords. | mine_git_archaeology() | migrate from REST client to generated OpenAPI client |
| README-mined decision | Decision extracted from docs such as README, CLAUDE, ARCHITECTURE, DESIGN. | mine_readme_docs() | "We use SQLite by default because setup should be local-first." |
| Decision source | Provenance of a record. | DecisionRecord.source | inline_marker, git_archaeology, readme_mining, cli |
| Decision confidence | Source-specific extraction confidence. | DecisionExtractor | 0.95 inline LLM, 0.70 git signal, 0.60 README mining, 1.0 manual |
| Affected files | Files linked to a decision from graph neighbors, commit files, or manual input. | DecisionExtractor | ["src/auth.py", "src/session.py"] |
| Affected modules | Top-level modules inferred from affected files or text. | _infer_modules() | ["src", "packages"] |
| Decision tags | Topic labels inferred from keywords or LLM output. | _infer_tags() and prompts | auth, database, api, security, testing |
| Decision status | Lifecycle state. | DecisionRecord.status | proposed, active, deprecated, superseded |
| Decision staleness score | 0 to 1 score indicating code has moved since a decision. | DecisionExtractor.compute_staleness() and crud.recompute_decision_staleness() | 0.63 |
| Conflict boost | Staleness increase when newer commit messages contain contradiction signals and overlap decision text. | compute_staleness() | +0.3 for "migrate away" touching the same concept |
| Decision health summary | Counts and lists for stale, proposed, and ungoverned hotspots. | get_decision_health_summary() and server/CLI routes | {active: 10, stale: 2, proposed: 3} |
| Ungoverned hotspot | Hot file without related architectural decision coverage. | Decision health computation | src/payments/processor.py |

Security Findings

| Term | Definition | Computed by | Example |
| --- | --- | --- | --- |
| Security finding | Regex or symbol-name signal persisted to security_findings. | SecurityScanner.scan_file() | {kind: "hardcoded_secret", severity: "high", line: 12} |
| High severity finding | Dangerous execution, deserialization, shell, or hardcoded secret/password pattern. | _PATTERNS in security_scan.py | eval_call, pickle_loads, hardcoded_password |
| Medium severity finding | SQL construction or TLS verification issue. | _PATTERNS | fstring_sql, concat_sql, tls_verify_false |
| Low severity finding | Weak hash or security-sensitive symbol name. | _PATTERNS and symbol scan | weak_hash, security_sensitive_symbol |
| Security snippet | Trimmed source line or symbol name for context. | SecurityScanner.scan_file() | password = "admin" |

Risk And Blast Radius

| Term | Definition | Computed by | Example |
| --- | --- | --- | --- |
| File risk score | PageRank centrality multiplied by 1 + temporal_hotspot_score. | PRBlastRadiusAnalyzer._score_file() | 0.018 * (1 + 2.4) = 0.0612 |
| Overall PR risk score | 0 to 10 composite using average direct risk, max direct risk, and transitive breadth. | _compute_overall_risk() | 7.25 |
| Transitive affected file | Importer reached by reverse BFS from changed files. | _transitive_affected() | {path: "src/api.py", depth: 2} |
| Co-change warning | Historical co-change partner missing from a PR/change set. | _cochange_warnings() | {changed: "src/a.py", missing_partner: "src/b.py", score: 4.2} |
| Recommended reviewer | Owner aggregate over changed and affected files. | _recommend_reviewers() | {email: "asha@example.com", files: 7, ownership_pct: 0.63} |
| Test gap | File lacking a matching test path by basename conventions. | _find_test_gaps() and MCP _check_test_gap() | src/auth.py -> true |
| Risk trend | Velocity from 30-day vs prior 60-day commit rates. | tool_risk._compute_trend() | increasing, stable, decreasing |
| Risk type | Human bucket for the kind of risk. | tool_risk._classify_risk_type() | bug-prone, churn-heavy, bus-factor-risk, high-coupling, stable |
| Change pattern | Human label from dominant commit category. | tool_risk._derive_change_pattern() | feature-active, fix-heavy, dependency-churn, mixed-activity |
| Impact surface | Top critical reverse dependencies within two hops. | tool_risk._compute_impact_surface() | [{file_path: "src/api.py", pagerank: 0.05}] |
| Risk summary | One-line synthesized risk sentence for MCP. | tool_risk._assess_one_target() | src/auth.py - hotspot score 88% (increasing), 6 dependents... |
| Top hotspots | Highest churn/hotspot files returned for context. | get_risk() | [{file_path: "src/db.py", hotspot_score: 0.94}] |
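
The File risk score formula and the reverse BFS behind Transitive affected file can be illustrated as follows. This is a sketch under assumptions: the function names, the `importers` mapping (file path to the files that import it), and the `max_depth` parameter are hypothetical, and only the risk formula itself is taken from the table.

```python
from collections import deque


def file_risk(pagerank: float, temporal_hotspot_score: float) -> float:
    """PageRank centrality multiplied by (1 + temporal hotspot score)."""
    return pagerank * (1 + temporal_hotspot_score)


def transitive_affected(changed: list[str], importers: dict[str, list[str]],
                        max_depth: int = 3) -> dict[str, int]:
    """Reverse BFS: map each importer reached from the changed files to its hop depth."""
    depth = {path: 0 for path in changed}
    queue = deque(changed)
    while queue:
        current = queue.popleft()
        if depth[current] >= max_depth:
            continue  # do not expand beyond the depth limit
        for importer in importers.get(current, []):
            if importer not in depth:  # first visit is the shortest path in BFS
                depth[importer] = depth[current] + 1
                queue.append(importer)
    # Changed files themselves sit at depth 0 and are not "affected" entries.
    return {path: d for path, d in depth.items() if d > 0}
```

Because the search walks import edges backwards, a change to a low-level file such as a database module surfaces every file that depends on it, directly or indirectly, within the depth limit.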

Search, Answer Cache, And Retrieval

| Term | Definition | Computed by | Example |
| --- | --- | --- | --- |
| Search result | Unified full-text or vector result. | SearchResult in persistence/search.py | {page_id, title, page_type, target_path, score, snippet, search_type} |
| FTS5 query | Stop-word-stripped OR prefix query for SQLite. | _build_fts5_query() | "auth"* OR "session"* |
| FTS score | Positive score from negated SQLite rank or Postgres ts_rank. | FullTextSearch | 0.734 |
| Vector score | Cosine similarity between query embedding and page embedding. | InMemoryVectorStore.search() and other vector stores | 0.812 |
| Snippet | First 200 chars of indexed content. | _snippet() or vector metadata | "This module handles..." |
| Answer cache row | Cached MCP answer payload. | tool_answer.py and AnswerCache ORM | {question_hash, payload_json, provider_name, model_name} |
| Question hash | SHA-256 of normalized question text. | tool_answer._hash_question() | Same hash for "How auth works?" with extra whitespace/case |
| Answer payload | Cached get_answer result. | get_answer() | {answer, citations, confidence, fallback_targets, retrieval} |
| Retrieval hit | Search hit hydrated with page metadata and summary. | tool_answer.py retrieval pipeline | {target_path: "src/auth.py", score: 3.2, summary: "..."} |
| Retrieval dominance | Gating logic comparing top and second search scores. | tool_answer.py | Top score high enough to answer from dominant hit |
| Federated RRF score | Reciprocal rank fusion score for workspace search across repos. | tool_search.py | rrf_score: 0.0164 |
| Confidence score | Normalized workspace search confidence. | tool_search.py | confidence_score: 0.87 |
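
The FTS5 query, Question hash, and Federated RRF score rows each describe a small transformation. The sketch below shows one way to implement each. Everything beyond the table is an assumption: the stop-word list is a made-up sample, normalization is taken to mean lowercasing plus whitespace collapsing, and the RRF constant `k = 60` is the value conventionally used for reciprocal rank fusion rather than one confirmed by the source.

```python
import hashlib
import re

STOP_WORDS = {"a", "an", "and", "how", "is", "of", "the", "to"}  # hypothetical sample


def build_fts5_query(text: str) -> str:
    """Drop stop words, then OR together quoted prefix terms for SQLite FTS5."""
    terms = [t for t in re.findall(r"\w+", text.lower()) if t not in STOP_WORDS]
    return " OR ".join(f'"{t}"*' for t in terms)


def hash_question(question: str) -> str:
    """SHA-256 of the question after lowercasing and collapsing whitespace."""
    normalized = " ".join(question.lower().split())
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def rrf_scores(rankings: list[list[str]], k: int = 60) -> dict[str, float]:
    """Reciprocal rank fusion: sum 1 / (k + rank) over every ranked list."""
    scores: dict[str, float] = {}
    for ranking in rankings:
        for rank, page_id in enumerate(ranking, start=1):
            scores[page_id] = scores.get(page_id, 0.0) + 1.0 / (k + rank)
    return scores
```

Under the `k = 60` assumption, a page ranked first in a single result list scores 1/61, which rounds to the 0.0164 shown in the table's example.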

Persistence Tables And Stored Entities

| Table or store | Computed content | Example |
| --- | --- | --- |
| repositories | Repo identity plus current indexed head_commit and settings JSON. | {name: "repowise", default_branch: "main"} |
| generation_jobs | Long-running generation progress. | {status: "running", total_pages: 120, completed_pages: 31} |
| wiki_pages | Current generated markdown pages and freshness metadata. | file_page:src/app.py |
| wiki_page_versions | Archived historical snapshots on regeneration. | version: 3 |
| graph_nodes | File and symbol nodes with graph metrics and community metadata. | {node_id: "src/app.py", pagerank: 0.02} |
| graph_edges | Typed relationships with imported names and confidence. | {source: "src/app.py", target: "src/db.py", edge_type: "imports"} |
| wiki_symbols | Parsed symbols projected into DB. | {symbol_id: "src/app.py::main", kind: "function"} |
| git_metadata | Per-file history, churn, ownership, hotspots, co-changes. | {file_path: "src/app.py", is_hotspot: true} |
| decision_records | Extracted/manual architectural decisions and staleness. | {title: "Use Postgres for production", status: "active"} |
| dead_code_findings | Dead-code analyzer findings and triage status. | {kind: "unreachable_file", safe_to_delete: true} |
| security_findings | Static security signals. | {kind: "eval_call", severity: "high"} |
| llm_costs | Per-call token and USD cost rows. | {operation: "doc_generation", input_tokens: 2500, cost_usd: 0.012} |
| answer_cache | Cached MCP answer payloads keyed by normalized question. | {question: "How does auth work?", question_hash: "..."} |
| conversations and chat_messages | Chat state and structured message JSON. | {role: "assistant", content_json: {...}} |
| webhook_events | Received external events and processing status. | {provider: "github", event_type: "push", processed: false} |
| SQLite page_fts | FTS5 mirror of page title/content. | Used by full-text search |
| Postgres wiki_pages.embedding | pgvector embedding column, conditionally added by migration. | 1536-dim vector |
| LanceDB wiki_pages table | Local vector index with page metadata. | {page_id, vector, title, page_type, target_path} |

LLM Cost And Provider Usage

| Term | Definition | Computed by | Example |
| --- | --- | --- | --- |
| Pricing table | USD per million input/output tokens by model. | generation/cost_tracker.py | claude-sonnet-4-6: {input: 3.0, output: 15.0} |
| Fallback pricing | Default pricing for unknown models. | _get_pricing() | {input: 3.0, output: 15.0} |
| Call cost | (input_tokens * input_rate + output_tokens * output_rate) / 1_000_000. | CostTracker.record() | 1000 in, 500 out on Sonnet -> $0.0105 |
| Session cost | Cumulative USD for one tracker instance. | CostTracker.session_cost | 2.37 |
| Session tokens | Cumulative input plus output tokens. | CostTracker.session_tokens | 845000 |
| Cost totals | DB aggregate grouped by operation, model, or day. | CostTracker.totals() | {group: "file_page", calls: 42, cost_usd: 1.12} |
| CLI cost estimate | Pre-generation token/cost plan. | packages/cli/src/repowise/cli/cost_estimator.py | {estimated_pages: 82, estimated_cost_usd: 4.60} |
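
The Call cost formula combines with the pricing and fallback rows as sketched below. The dictionary layout and function names are assumptions for illustration; the rates and the formula are the ones quoted in the table.

```python
# Rates in USD per million tokens, as quoted in the table.
PRICING = {"claude-sonnet-4-6": {"input": 3.0, "output": 15.0}}
FALLBACK = {"input": 3.0, "output": 15.0}  # used for unknown models


def get_pricing(model: str) -> dict[str, float]:
    """Look up per-million-token rates, falling back for unknown models."""
    return PRICING.get(model, FALLBACK)


def call_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """(input_tokens * input_rate + output_tokens * output_rate) / 1_000_000."""
    rates = get_pricing(model)
    return (input_tokens * rates["input"] + output_tokens * rates["output"]) / 1_000_000
```

Working the table's example through: 1000 input tokens at 3.0 plus 500 output tokens at 15.0 gives 3000 + 7500 = 10500, and dividing by one million yields $0.0105.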

Workspace Intelligence

| Term | Definition | Computed by | Example |
| --- | --- | --- | --- |
| Discovered repo | Candidate git repo found under a workspace root. | workspace/scanner.py | {alias: "api", path: "services/api"} |
| Workspace config | Parsed .repowise-workspace.yaml. | workspace/config.py | {repos: [{alias: "web", path: "apps/web"}]} |
| Repo update result | Per-repo update outcome for workspace update/watch. | workspace/update.py | {alias: "core", updated: true, file_count: 420, symbol_count: 2100} |
| Cross-repo co-change | File pair in different repos changed by same author within a time window, weighted by recency. | detect_cross_repo_co_changes() | {source_repo: "api", source_file: "routes/users.py", target_repo: "web", target_file: "users.tsx", strength: 1.34} |
| Cross-repo package dependency | Manifest path dependency from one repo to another. | detect_package_dependencies() | {source_repo: "web", target_repo: "shared", kind: "npm_workspace"} |
| Cross-repo overlay | JSON payload saved under workspace data dir. | run_cross_repo_analysis() | {co_changes: [...], package_deps: [...], repo_summaries: {...}} |
| Cross-repo edge count | Per-repo count of co-change and package-dependency edges. | _build_repo_summaries() | {cross_repo_edge_count: 12} |
| Workspace CLAUDE.md data | Per-repo summaries plus cross-repo overlays and contract links. | generation/editor_files/data.py, claude_md.py | {repos: [...], co_changes: [...], contract_links: [...]} |

API Contracts

| Term | Definition | Computed by | Example |
| --- | --- | --- | --- |
| Contract | Provider or consumer API endpoint/topic/service extracted from source. | `workspace/contracts.py` and extractors | `{contract_id: "http::GET::/api/users/{param}", role: "provider"}` |
| Contract type | API surface kind. | Contract extractors | `http`, `grpc`, `topic` |
| Contract role | Whether source provides or consumes the contract. | Extractors | `provider`, `consumer` |
| Contract confidence | Extraction strategy confidence. | Extractors and contract matching | `0.8` |
| Service boundary | Monorepo service path assigned to contracts. | `workspace/extractors/service_boundary.py` | `services/billing` |
| Normalized contract ID | Lowercase/canonical ID used for matching. | `normalize_contract_id()` | `http::GET::/Api/Users/` -> `http::GET::/api/users` |
| Contract link | Matched provider-consumer pair across repos/services. | `match_contracts()` | `{provider_repo: "api", consumer_repo: "web", match_type: "exact"}` |
| Manual contract link | Workspace-configured provider/consumer link. | `_build_manual_links()` | `{match_type: "manual", confidence: 1.0}` |
| Contract store | JSON payload saved as `contracts.json`. | `run_contract_extraction()` | `{contracts: [...], contract_links: [...]}` |
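
To make the normalization and exact-match entries concrete, here is a minimal sketch. It is not the code behind `normalize_contract_id()` or `match_contracts()`: the function names, the `repo` key on each contract, the handling of `:id` style path parameters, and the rule that only contracts from different repos are linked are all assumptions. The sketch only reproduces the glossary example (`http::GET::/Api/Users/` becomes `http::GET::/api/users`).

```python
import re
from collections import defaultdict


def sketch_normalize_contract_id(contract_id: str) -> str:
    """Canonicalize an HTTP contract ID of the form http::METHOD::/path."""
    parts = contract_id.strip().split("::", 2)
    if len(parts) != 3 or parts[0].lower() != "http":
        # Non-HTTP IDs (grpc, topic) are left untouched in this sketch.
        return contract_id.strip()
    _, method, path = parts
    path = path.lower()
    # Assumption: {id} and :id style path parameters collapse to {param}.
    path = re.sub(r"\{[^/}]+\}|(?<=/):[a-z_][a-z0-9_]*", "{param}", path)
    # Drop trailing slashes, but keep the root path as "/".
    path = path.rstrip("/") or "/"
    return f"http::{method.upper()}::{path}"


def sketch_match_contracts(contracts):
    """Link consumers to providers whose normalized IDs are identical."""
    providers = defaultdict(list)
    for contract in contracts:
        if contract["role"] == "provider":
            key = sketch_normalize_contract_id(contract["contract_id"])
            providers[key].append(contract)

    links = []
    for contract in contracts:
        if contract["role"] != "consumer":
            continue
        key = sketch_normalize_contract_id(contract["contract_id"])
        for provider in providers.get(key, []):
            # Simplification: only link contracts that live in different repos.
            if provider["repo"] != contract["repo"]:
                links.append({
                    "provider_repo": provider["repo"],
                    "consumer_repo": contract["repo"],
                    "contract_id": key,
                    "match_type": "exact",
                })
    return links
```

Normalizing both sides before comparison is what lets a provider route declared as `/Api/Users/{userId}` match a consumer call written as `/api/users/:id`.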

Knowledge Map

| Term | Definition | Computed by | Example |
| --- | --- | --- | --- |
| Top owner | Owner ranked by number of files primarily owned. | `server/services/knowledge_map.py` | `{email: "asha@example.com", files_owned: 42, percentage: 18.6}` |
| Knowledge silo | File where one owner has more than 80 percent ownership. | `compute_knowledge_map()` | `{file_path: "src/auth.py", owner_pct: 0.91}` |
| Onboarding target | High-PageRank file with few or no documentation words. | `compute_knowledge_map()` | `{path: "src/core.py", pagerank: 0.04, doc_words: 0}` |
| Documentation word count | Word count of the generated file page content. | `compute_knowledge_map()` | `doc_words: 640` |
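
The three knowledge-map terms can be derived from one per-file table, as the sketch below suggests. It is not the implementation of `compute_knowledge_map()`: the function name, the input keys (`owner_email`, `owner_pct`, `pagerank`, `doc_words`), the 100-word documentation cutoff, and the result limit are assumptions. The 80 percent silo threshold is the only value taken from the glossary.

```python
from collections import Counter


def sketch_knowledge_map(files, silo_threshold=0.8, max_doc_words=100, limit=3):
    """Derive top owners, knowledge silos, and onboarding targets."""
    if not files:
        return {"top_owners": [], "knowledge_silos": [], "onboarding_targets": []}

    # Top owners: rank by how many files each person primarily owns.
    owned = Counter(f["owner_email"] for f in files)
    top_owners = [
        {"email": email, "files_owned": count,
         "percentage": round(100 * count / len(files), 1)}
        for email, count in owned.most_common(limit)
    ]

    # Knowledge silos: the primary owner holds more than 80 percent.
    silos = [
        {"file_path": f["path"], "owner_pct": f["owner_pct"]}
        for f in files if f["owner_pct"] > silo_threshold
    ]

    # Onboarding targets: thinly documented files, highest PageRank first.
    # The `max_doc_words` cutoff is an assumed value.
    thin = [f for f in files if f["doc_words"] < max_doc_words]
    thin.sort(key=lambda f: f["pagerank"], reverse=True)
    targets = [
        {"path": f["path"], "pagerank": f["pagerank"], "doc_words": f["doc_words"]}
        for f in thin[:limit]
    ]

    return {"top_owners": top_owners, "knowledge_silos": silos,
            "onboarding_targets": targets}
```

Note that a file sitting exactly at 0.80 ownership is not a silo under the "more than 80 percent" wording, which is why the comparison is strict.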

CLI-Visible Computed Outputs

| Command | Computed output | Example |
| --- | --- | --- |
| `repowise status` | Sync state, current HEAD, indexed commit, DB page counts, graph node counts, pages by type, token totals. | `file_page: 52`, `Status: 3 new commit(s)` |
| `repowise status --workspace` | Per-repo file/symbol counts, indexed age, HEAD short SHA, stale/up-to-date state. | `api 420 files 2,100 symbols 2h ago a1b2c3d stale` |
| `repowise doctor` | Health checks for DB, pages, vector store, FTS, graph, stale pages, store drift, coordinator state. | `SQL <-> Vector Store: 3 missing` |
| `repowise search` | Full-text/vector/wiki or symbol hits. | `score 0.83, file_page, src/auth.py` |
| `repowise dead-code` | Dead-code table or JSON report. | `unused_export src/api.py OldClient 0.70` |
| `repowise decision` | Decision list, detail view, health summary, stale records, proposed records, ungoverned hotspots. | `Stale decisions: 2` |
| `repowise costs` | Grouped LLM cost totals. | `group=file_page, calls=45, cost=$1.37` |
| `repowise export` | Markdown/HTML/JSON export entries, optionally decisions/dead-code/hotspots. | `wiki_pages.json` with page metadata |
| `repowise update` | File diffs, adaptive cascade budget, affected page plan, regenerated/decayed page counts, dead-code/decision refresh results. | `Adaptive cascade budget: 30` |
| `repowise reindex` | Embedding/indexing progress and page counts. | `Indexed 430 items -> .repowise/lancedb` |
| `repowise watch` | Debounced changed-path batches and forwarded update output. | `Detected 3 changed file(s), updating...` |
| `repowise workspace` | Workspace repo discovery, config entries, update status, cross-repo hook output. | `Found 2 new repo(s)` |
| `repowise generate-claude-md` | Editor-file data and rendered `.claude/CLAUDE.md`. | `hotspots`, `key_modules`, `decisions` in markdown |
| `repowise augment` | Hook-time graph/search enrichment for AI tool calls. | Related files, symbols, importers, dependencies |
| `repowise distill` / `expand` / `saved` | Compact errors-first command output with reversible omission markers, marker restoration, and the savings ledger rollup (`DISTILL.md`). | `[repowise#a1b2c3d4e5f6: 230 lines omitted (~6.1k tokens); ...]` |
| `repowise mcp` | FastMCP server exposing the computed graph/wiki/risk tools below. | stdio, streamable HTTP, or SSE transport |

MCP And API-Visible Computed Payloads

| Tool or endpoint concept | Definition | Example |
| --- | --- | --- |
| `get_answer` | RAG answer with citations, confidence, fallback targets, retrieval metadata, and answer-cache support. | `{answer: "...", confidence: "medium", citations: [...]}` |
| `search_codebase` | Wiki search using vector/FTS and federated workspace RRF when requested. | `{results: [{title, relevance_score, confidence_score}]}` |
| `get_context` | Compact page, symbol, freshness, dependency, git, and cross-repo context for targets. | `{targets: {"src/app.py": {docs, graph, freshness}}}` |
| `get_overview` | Repo or workspace overview, module map, entry points, git health, communities, and workspace footer. | `{summary, modules, git_health, community_summary}` |
| `get_why` | Decision/governance lookup, file origin story, alignment, and decision health modes. | `{decisions: [...], target_context: {...}}` |
| `get_risk` | Per-file risk, trend, risk type, owners, co-change partners, test gaps, security signals, top hotspots, optional PR blast radius. | `{results: [{risk_summary, hotspot_score}], top_hotspots: [...]}` |
| `get_dead_code` | Tiered, grouped, and summarized dead-code findings. | `{summary: {total_findings: 12}, tiers: {...}}` |
| `get_dependency_path` | Dependency-path or bridge context between files/symbols. | `{path: ["src/a.py", "src/b.py"]}` |
| `get_symbol` | Exact symbol metadata and source slice. | `{name: "create_app", signature: "def create_app(...)"}` |
| `get_execution_flows` | Entry-point traces through call edges. | `{flows: [{entry_point, trace, crosses_community}]}` |
| Blast radius API | Direct risks, transitive affected files, co-change warnings, reviewers, test gaps, overall score. | `{overall_risk_score: 7.25}` |
| Knowledge map API | Top owners, knowledge silos, onboarding targets. | `{top_owners: [...], knowledge_silos: [...]}` |
| Cost summary API | Grouped costs and totals. | `{groups: [...], total_cost_usd: 3.21}` |
| Provider API | Available provider/model configuration. | `{providers: [...], active_provider: "gemini"}` |
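
The `search_codebase` entry mentions federated workspace RRF. Assuming RRF here means reciprocal rank fusion, the sketch below shows the general technique for merging ranked result lists from several sources (for example, one list per repo or per search backend). The function name `sketch_rrf` and the constant `k=60` (a commonly used default for this technique) are assumptions, not values confirmed for Repowise.

```python
from collections import defaultdict


def sketch_rrf(rankings, k=60):
    """Fuse ranked lists: each item scores sum(1 / (k + rank)) over all lists."""
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, item in enumerate(ranking, start=1):
            scores[item] += 1.0 / (k + rank)
    # Highest fused score first; ties are broken alphabetically for stability.
    return sorted(scores, key=lambda item: (-scores[item], item))


# Two hypothetical result lists, e.g. vector hits and full-text hits.
fused = sketch_rrf([
    ["src/a.py", "src/b.py", "src/c.py"],
    ["src/a.py", "src/c.py", "src/d.py"],
])
print(fused)
```

Because the fusion uses ranks rather than raw scores, it can combine backends whose relevance scores are not on a comparable scale.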

Statuses And Enumerations

| Domain | Values |
| --- | --- |
| Page freshness | `fresh`, `stale`, `expired`, `unknown` in type definitions |
| Job status | `pending`, `running`, `completed`, `failed`, `paused` |
| Decision status | `proposed`, `active`, `deprecated`, `superseded` |
| Decision source | `git_archaeology`, `inline_marker`, `readme_mining`, `cli` |
| Dead-code kind | `unreachable_file`, `unused_export`, `unused_internal`, `zombie_package` |
| Dead-code status | `open`, `acknowledged`, `resolved`, `false_positive` |
| Security severity | `high`, `med`, `low` |
| Security kind | `eval_call`, `exec_call`, `pickle_loads`, `subprocess_shell_true`, `os_system`, `hardcoded_password`, `hardcoded_secret`, `fstring_sql`, `concat_sql`, `tls_verify_false`, `weak_hash`, `security_sensitive_symbol` |
| Edge type | `imports`, `defines`, `calls`, `has_method`, `has_property`, `extends`, `implements`, `method_overrides`, `method_implements`, `co_changes`, `framework`, `dynamic`, plus dynamic subtypes such as `dynamic_uses`, `dynamic_imports`, `dynamic_url_route` |
| Node type | `file`, `symbol`, `external` |
| Search type | `vector`, `fulltext` |
| Contract type | `http`, `grpc`, `topic` |
| Contract role | `provider`, `consumer` |
| Contract link match type | `exact`, `manual` |
| Risk trend | `increasing`, `stable`, `decreasing`, `unknown` |
| Risk type | `bug-prone`, `churn-heavy`, `bus-factor-risk`, `high-coupling`, `stable`, `unknown` |
| Change pattern | `feature-active`, `primarily refactored`, `fix-heavy`, `dependency-churn`, `mixed-activity`, `uncategorized` |
| Chat role | `user`, `assistant` |
| Coordinator health | `ok`, `warning`, `critical` |

Example End-To-End Computation

For a file src/auth/session.py, a typical Repowise index can compute:

  1. FileInfo: language="python", is_test=false, is_entry_point=false.
  2. ParsedFile: symbols such as src/auth/session.py::SessionStore, imports such as from .redis import client, calls such as client.get().
  3. Graph records: a file node, symbol nodes, defines, imports, calls, and maybe framework or dynamic_* edges.
  4. Graph metrics: pagerank=0.013, betweenness=0.004, community_id=2, community_label="auth", cohesion=0.18.
  5. Git metadata: commit_count_90d=11, primary_owner_name="Asha", temporal_hotspot_score=2.1, churn_percentile=0.88, is_hotspot=true.
  6. Analysis rows: maybe a security finding hardcoded_secret, or a decision record from # DECISION: store sessions in Redis.
  7. Generated docs: file_page:src/auth/session.py, source hash, token counts, summary, freshness, and vector/FTS entries.
  8. Risk output: hotspot_score=0.88, trend increasing, risk type churn-heavy, co-change partners, test-gap flag, and an impact surface.