Workflows

June 26, 2026

Workflows are high-level pipelines that compose skills into end-to-end processing chains. They live in agent/extensions/workflows/.

Overview

| Workflow | File | Input | Output |
|---|---|---|---|
| Brief | brief.py | VideoAsset | Metadata, frames, ASR, timeline |
| Detailed | detailed.py | VideoAsset | All of brief + OCR, emotions, objects, translation |
| Index | index.py | VideoAsset (needs analysis) | FAISS index + chunk metadata |
| Ask | ask.py | VideoAsset + question (needs index) | Answer with evidence |
| Highlights | highlights.py | VideoAsset (needs analysis) | Clips + optional reel |
| Report | report.py | VideoAsset | Structured analysis report |
| Analyze | analyze.py | VideoAsset + mode | Routes to the requested workflow mode |
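
The Analyze entry routes a request to another workflow by mode. A minimal sketch of that kind of dispatch is shown here; the function names (`run_brief`, `run_detailed`, `analyze`) and the return shapes are hypothetical stand-ins, not the signatures used in analyze.py.

```python
from typing import Any, Callable, Dict

# Hypothetical stand-ins for the real workflow entry points.
def run_brief(asset: str, **params: Any) -> Dict[str, Any]:
    return {"workflow": "brief", "asset": asset, "params": params}

def run_detailed(asset: str, **params: Any) -> Dict[str, Any]:
    return {"workflow": "detailed", "asset": asset, "params": params}

# Registry mapping a mode string to the callable that handles it.
WORKFLOWS: Dict[str, Callable[..., Dict[str, Any]]] = {
    "brief": run_brief,
    "detailed": run_detailed,
}

def analyze(asset: str, mode: str, **params: Any) -> Dict[str, Any]:
    """Route the request to the workflow registered under `mode`."""
    try:
        workflow = WORKFLOWS[mode]
    except KeyError:
        known = sorted(WORKFLOWS)
        raise ValueError(f"unknown mode {mode!r}; expected one of {known}") from None
    return workflow(asset, **params)
```

A registry keeps the router free of if/elif chains, so adding a mode is a one-line change.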

Brief Workflow

Fast ASR-first analysis for quick video insights.

Steps:

  1. Probe video metadata (duration, resolution, fps)
  2. Parse embedded subtitles when available
  3. Extract audio and run Whisper ASR if subtitles are missing
  4. Assess transcript sufficiency using coverage and word-count thresholds
  5. Skip visual captioning when transcript coverage is sufficient, unless forced
  6. Sample and caption frames when visual processing is needed
  7. Build structured timeline (chapters + events) via LLM
  8. Optionally enhance with web search
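
Steps 4 and 5 can be sketched as a small decision function. The threshold values (`min_coverage=0.6`, `min_words=50`) and the segment layout below are assumptions for illustration; the actual thresholds used by brief.py are not stated here.

```python
from typing import List, Tuple

Segment = Tuple[float, float, str]  # (start_sec, end_sec, text)

def transcript_coverage(segments: List[Segment], duration: float) -> float:
    """Fraction of the video covered by transcript, counting overlaps once."""
    if duration <= 0:
        return 0.0
    covered = 0.0
    cursor = 0.0  # end of the time range already counted
    for start, end, _ in sorted(segments):
        start = max(start, cursor)
        end = min(end, duration)
        if end > start:
            covered += end - start
            cursor = end
    return covered / duration

def needs_visual(
    segments: List[Segment],
    duration: float,
    min_coverage: float = 0.6,   # assumed threshold
    min_words: int = 50,         # assumed threshold
    force_visual: bool = False,
) -> bool:
    """Return True when frame captioning should run."""
    if force_visual:
        return True
    words = sum(len(text.split()) for _, _, text in segments)
    sufficient = (
        transcript_coverage(segments, duration) >= min_coverage
        and words >= min_words
    )
    return not sufficient
```

Requiring both conditions guards against a transcript that spans the video but says very little, and against a dense transcript that covers only a short stretch.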

Parameters:

  • max_frames — default 64
  • whisper_model — default "small" (set None to skip ASR)
  • force_visual — override ASR-first skipping and always run visual captioning
  • include_web_search — default False
  • direct_model / model_path — for local model loading

Detailed Workflow

Comprehensive analysis with all available skills.

Adds to brief:

  • OCR text extraction from key frames (PaddleOCR)
  • Object detection (YOLOv8) on frames
  • Emotion analysis — audio (Wav2Vec2) + visual (FER)
  • ASR translation to target language (default: Chinese)
  • Lower scene detection threshold (0.25) and more frames (128)
  • Optional long-video segment parallelism and parallel ASR

Parameters:

  • max_frames — default 128
  • All brief parameters plus advanced skill toggles

Index Workflow

Builds a FAISS semantic index for retrieval-augmented Q&A.

Steps:

  1. Load existing analysis (or auto-run detailed if missing)
  2. Chunk video content by time windows (default: 20s)
  3. Generate dense embeddings via OpenAI-compatible API
  4. Build FAISS index with L2-normalized vectors
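
Steps 2 and 4 can be illustrated without FAISS or an embedding service. This is a sketch only: it assumes transcript segments are dicts with `start` and `text` keys and that chunks are fixed windows aligned to multiples of `chunk_sec`, which may differ from how index.py chunks content.

```python
import math
from typing import Any, Dict, List

def chunk_by_time(segments: List[Dict[str, Any]], chunk_sec: float = 20.0) -> List[Dict[str, Any]]:
    """Group segments into fixed windows based on each segment's start time."""
    buckets: Dict[int, List[str]] = {}
    for seg in segments:
        window = int(seg["start"] // chunk_sec)
        buckets.setdefault(window, []).append(seg["text"])
    return [
        {"start": i * chunk_sec, "end": (i + 1) * chunk_sec, "text": " ".join(texts)}
        for i, texts in sorted(buckets.items())
    ]

def l2_normalize(vec: List[float]) -> List[float]:
    """Scale a vector to unit length; a zero vector is returned unchanged."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)
```

Normalizing to unit length makes the inner product of two vectors equal to their cosine similarity, which is why embeddings are commonly normalized before being added to an inner-product index.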

Parameters:

  • chunk_sec — default 20
  • embed_base_url / embed_model — embedding API endpoint

Output: Index files in cache/{vid}/index_faiss/, item count, chunk metadata.

Ask Workflow

Answers natural-language questions about video content using semantic search.

Steps:

  1. Embed the question using the same embedding model
  2. Search FAISS index for top-k relevant chunks
  3. Synthesize an answer using LLM with retrieved context
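
The retrieval and prompt-assembly steps can be sketched in pure Python. A brute-force inner-product scan stands in for the FAISS search here, and `build_prompt` is a hypothetical helper; the prompt format used by ask.py is not documented in this section.

```python
from typing import Any, Dict, List, Sequence, Tuple

def top_k(
    query: Sequence[float],
    vectors: List[Sequence[float]],
    k: int = 5,
) -> List[Tuple[int, float]]:
    """Return (index, score) pairs for the k highest inner products."""
    scores = [
        (i, sum(q * v for q, v in zip(query, vec)))
        for i, vec in enumerate(vectors)
    ]
    scores.sort(key=lambda pair: pair[1], reverse=True)
    return scores[:k]

def build_prompt(
    question: str,
    chunks: List[Dict[str, Any]],
    hits: List[Tuple[int, float]],
) -> str:
    """Assemble retrieved chunks, tagged with their time range, into LLM context."""
    context = "\n".join(
        f"[{chunks[i]['start']:.0f}s-{chunks[i]['end']:.0f}s] {chunks[i]['text']}"
        for i, _ in hits
    )
    return f"Context:\n{context}\n\nQuestion: {question}"
```

Keeping the time range next to each chunk in the context lets the model cite timestamps, which is what the `evidence` entries in the output rely on.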

Parameters:

  • question — the question to answer
  • top_k — default 5

Output:

{
  "result": {
    "answer": "...",
    "evidence": [{"start": 10.0, "end": 30.0, "frame_ids": [...], ...}]
  },
  "hits": [...]
}
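
A caller might turn the `evidence` entries into readable time ranges. The helper names below are hypothetical, and the sketch relies only on the `start` and `end` fields shown above.

```python
from typing import Any, Dict, List

def format_timestamp(seconds: float) -> str:
    """Render seconds as MM:SS, truncating fractions of a second."""
    minutes, secs = divmod(int(seconds), 60)
    return f"{minutes:02d}:{secs:02d}"

def cite_evidence(response: Dict[str, Any]) -> List[str]:
    """Return one 'MM:SS-MM:SS' range per evidence entry."""
    return [
        f"{format_timestamp(ev['start'])}-{format_timestamp(ev['end'])}"
        for ev in response["result"]["evidence"]
    ]
```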

Highlights Workflow

Detects high-impact segments and exports video clips.

Steps:

  1. Load analysis (or auto-run detailed if missing)
  2. Detect highlights based on information density (LLM scoring)
  3. Export individual clips via FFmpeg
  4. Optionally concatenate into a highlight reel
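
Clip selection and export can be sketched as follows. The sketch assumes each candidate segment carries a `score` from the LLM step and that overlapping candidates are dropped, which is an illustrative policy rather than a documented one. The FFmpeg arguments are standard options (`-ss`, `-to`, `-c copy`); the exact command used by highlights.py may differ.

```python
from typing import Any, Dict, List

def pick_highlights(scored: List[Dict[str, Any]], max_clips: int = 5) -> List[Dict[str, Any]]:
    """Keep the best-scoring non-overlapping segments, returned in time order."""
    chosen: List[Dict[str, Any]] = []
    for seg in sorted(scored, key=lambda s: s["score"], reverse=True):
        if len(chosen) == max_clips:
            break
        # Accept the segment only if it does not overlap any already chosen one.
        if all(seg["end"] <= c["start"] or seg["start"] >= c["end"] for c in chosen):
            chosen.append(seg)
    return sorted(chosen, key=lambda s: s["start"])

def clip_command(src: str, start: float, end: float, dst: str) -> List[str]:
    """Build an FFmpeg argument list that cuts [start, end] from src."""
    return [
        "ffmpeg", "-y",
        "-ss", f"{start:.3f}",
        "-to", f"{end:.3f}",
        "-i", src,
        "-c", "copy",  # stream copy: fast, but cuts snap to keyframes
        dst,
    ]
```

The argument list can be passed to `subprocess.run(cmd, check=True)`. Stream copy avoids re-encoding at the cost of frame-accurate cut points; re-encoding instead trades speed for precision.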

Parameters:

  • max_clips — default 5
  • also_make_reel — default True

Output: Clip paths, reel path, timeline mapping.

Report Workflow

Generates an analysis report that combines video metadata, the timeline, key frames, the transcript, and optional web search results.

Steps:

  1. Load or run analysis
  2. Extract key information: metadata, timeline, top frames, transcript
  3. Perform web search if enabled
  4. Generate recommendations via LLM

Output: Structured report JSON with sections for metadata, timeline summary, key frames, transcript highlights, web search insights, and recommendations.

Dependency Chain

brief / detailed   (standalone; no prerequisites)
       |
       v
     index         (requires analysis.json)
       |
       v
      ask          (requires FAISS index)

highlights         (requires analysis.json)
report             (requires analysis.json or runs brief internally)

Missing prerequisites are auto-generated when possible (e.g., index will run detailed if no analysis.json exists).
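
The chain above can be modeled as a small planner that lists which workflows to run, prerequisites first. This is a sketch under assumptions: the artifact label `faiss_index` is made up for illustration, and whether ask builds a missing index automatically is not stated here, so the planner simply treats every prerequisite as auto-generatable.

```python
from typing import Dict, List, Set, Tuple

# workflow -> list of (required artifact, workflow assumed to produce it)
PREREQS: Dict[str, List[Tuple[str, str]]] = {
    "brief": [],
    "detailed": [],
    "index": [("analysis.json", "detailed")],
    "ask": [("faiss_index", "index")],
    "highlights": [("analysis.json", "detailed")],
    "report": [("analysis.json", "brief")],
}

# workflow -> artifact it leaves behind
PRODUCES: Dict[str, str] = {
    "brief": "analysis.json",
    "detailed": "analysis.json",
    "index": "faiss_index",
}

def plan(workflow: str, available: Set[str]) -> List[str]:
    """Return the workflows to run in order, given artifacts already on disk."""
    have = set(available)
    steps: List[str] = []

    def visit(name: str) -> None:
        for artifact, producer in PREREQS[name]:
            if artifact not in have:
                visit(producer)
        steps.append(name)
        if name in PRODUCES:
            have.add(PRODUCES[name])

    visit(workflow)
    return steps
```

For example, `plan("ask", set())` yields detailed, then index, then ask, while `plan("ask", {"analysis.json"})` skips the analysis step.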