Sparrow

June 5, 2026

Structured data extraction, instruction calling and agentic workflows with ML, LLM and Vision LLM

Sparrow is an API-first platform for enterprise document intelligence. It combines accurate structured extraction from documents (invoices, statements, tables) with workflow agents and decision agents.

🚀 Try Sparrow Online | 📖 Quick Start | 🛠️ Installation | 📚 Examples | 🤖 Agents


🌟 Sparrow

Production-ready structured data extraction powered by ML, LLMs & Vision LLMs. Turn invoices, receipts, statements, forms and images into clean structured data.

Sparrow is an API-first platform built for enterprise document intelligence. It provides RESTful APIs for structured data extraction, instruction processing, and multi-agent workflow orchestration, all running on your own infrastructure with no external API calls or cloud dependencies.

Platform capabilities:

  • Structured Extraction API: Submit documents via REST and receive validated JSON; integrate directly into any backend or data pipeline
  • Instruction Processing: Goes beyond document extraction to text processing, validation, and decision making via the instruction inference API
  • Agent Framework: Orchestrate multi-step workflows with custom agents, visual monitoring via Prefect, and robust error handling
  • Pluggable Pipelines: Mix and match Vision LLM (Sparrow Parse), Text LLM (Sparrow Instructor), and Agent pipelines depending on the task
  • Multiple Backends: MLX on Apple Silicon, vLLM on NVIDIA, Ollama, Hugging Face, with the same API surface across all

Sparrow UI

Sparrow UI Features

The web UI provides a visual interface on top of the same API:

  • Drag & Drop: Upload documents directly
  • Real-time Processing: See results instantly
  • Data Query: JSON-based schema for querying data
  • Structured Output: JSON structured output

✨ Key Features

🎯 Universal Document Processing: Handle invoices, receipts, forms, bank statements, tables
🔧 Pluggable Architecture: Mix and match different pipelines (Sparrow Parse, Instructor, Agents)
🖥️ Multiple Backends: MLX (Apple Silicon), Ollama, vLLM, Docker, Hugging Face Cloud GPU
📱 Multi-format Support: Images (PNG, JPG) and multi-page PDFs
🎨 Schema Validation: JSON schema-based extraction with automatic validation
🌐 API-First Design: RESTful APIs for easy integration
💬 Instruction Calling: Text processing, validation, decision making with GPT-OSS, Mistral, Qwen 3.6, etc.
📊 Visual Monitoring: Built-in dashboard and agent workflow tracking
🔒 Enterprise Ready: Rate limiting, usage analytics, commercial licensing available
🚀 Local Vision LLMs: Mistral, Qwen 3.6, DeepSeek OCR, dots.ocr, Gemma 4, etc.

🏗️ Architecture

Sparrow Architecture

Core Components

| Component | Purpose | Use Case |
|---|---|---|
| Sparrow ML LLM | Main API engine | Document processing pipelines |
| Sparrow Parse | Vision LLM library | Structured JSON extraction |
| Sparrow Agents | Workflow orchestration | Complex multi-step processing |
| Sparrow OCR | Text recognition | OCR preprocessing |
| Sparrow UI | Web interface | Interactive document processing |

🚀 Quickstart

Prerequisites

  • Python 3.12.10+ (use pyenv for version management)
  • macOS (for MLX backend) or Linux/Windows (for other backends)
  • GPU with enough memory to run the selected Vision LLM

30-Second Setup

# 1. Install pyenv and Python 3.12.10
pyenv install 3.12.10
pyenv global 3.12.10

# 2. Create virtual environment
python -m venv .env_sparrow_parse
source .env_sparrow_parse/bin/activate  # Linux/Mac
# or .env_sparrow_parse\Scripts\activate  # Windows

# 3. Install Sparrow Parse pipeline
git clone https://github.com/katanaml/sparrow.git
cd sparrow/sparrow-ml/llm
pip install -r requirements_sparrow_parse.txt

# 4. For macOS: Install poppler for PDF processing
brew install poppler

# 5. Start the API server
python api.py

Before running pip install -r requirements_sparrow_parse.txt, check your platform. On macOS with the MLX backend, make sure requirements_sparrow_parse.txt references the sparrow-parse[mlx] library. On Linux/Windows, reference sparrow-parse instead, which skips the MLX-related libraries.
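The platform check described above can be sketched in Python. This is only an illustration: the helper name `sparrow_parse_requirement` is hypothetical and not part of Sparrow, and it assumes that MLX is wanted whenever the machine is an Apple Silicon Mac.

```python
import platform


def sparrow_parse_requirement(system: str, machine: str) -> str:
    """Return the library reference to keep in requirements_sparrow_parse.txt.

    Hypothetical helper: assumes MLX is wanted on every Apple Silicon Mac.
    """
    # MLX targets Apple Silicon, so require both macOS and an arm64 CPU.
    if system == "Darwin" and machine == "arm64":
        return "sparrow-parse[mlx]"
    # Linux, Windows and Intel Macs skip the MLX-related libraries.
    return "sparrow-parse"


# Report the reference that matches the current machine.
print(sparrow_parse_requirement(platform.system(), platform.machine()))
```

Compare the printed value with the entry in requirements_sparrow_parse.txt before installing.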

First Document Extraction

# Extract data from a bonds table
./sparrow.sh '[{"instrument_name":"str", "valuation":0}]' \
  --pipeline "sparrow-parse" \
  --options mlx \
  --options mlx-community/Qwen2.5-VL-72B-Instruct-4bit \
  --file-path "data/bonds_table.png"

Result:

{
  "data": [
    {"instrument_name": "UNITS BLACKROCK...", "valuation": 19049},
    {"instrument_name": "UNITS ISHARES...", "valuation": 83488}
  ],
  "valid": "true"
}

Use --options mlx for the MLX backend, --options ollama for the Ollama backend, or --options vllm for the vLLM backend. Provide the correct Vision LLM model name, and download the model separately with MLX, vLLM or Ollama first.
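A result like the one shown above can be consumed in a few lines of Python. The sketch below is an assumption-laden illustration: the helper `load_rows` is hypothetical, and it relies on the sample output, where "valid" is the string "true" and not a JSON boolean.

```python
import json

# Copy of the sample result shown above.
raw_result = """
{
  "data": [
    {"instrument_name": "UNITS BLACKROCK...", "valuation": 19049},
    {"instrument_name": "UNITS ISHARES...", "valuation": 83488}
  ],
  "valid": "true"
}
"""


def load_rows(raw_json: str) -> list:
    """Parse an extraction result and return its rows if it is marked valid."""
    result = json.loads(raw_json)
    # In the sample output "valid" is the string "true", so compare as text.
    if result.get("valid") != "true":
        raise ValueError("extraction result is not marked as valid")
    return result["data"]


rows = load_rows(raw_result)
total_valuation = sum(row["valuation"] for row in rows)
print(len(rows), total_valuation)
```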

🛠️ Installation

Quick Setup

# 1. Clone repository
git clone https://github.com/katanaml/sparrow.git
cd sparrow

📖 For complete installation instructions, see our detailed environment setup guide.

Essential Steps Summary

  1. Python Environment: Install Python 3.12.10 using pyenv
  2. Virtual Environments: Create separate environments for different pipelines:
    • .env_sparrow_parse - for Sparrow Parse (Vision LLM)
    • .env_instructor - for Instructor (Text LLM)
    • .env_ocr - for OCR service (optional)
  3. System Dependencies: Install poppler for PDF processing
  4. Requirements: Install pipeline-specific dependencies, for example:

pip install -r requirements_sparrow_parse.txt

Platform-Specific Notes

macOS:

brew install poppler  # Required for PDF processing

Ubuntu/Debian:

sudo apt-get install poppler-utils libpoppler-cpp-dev

Apple Silicon: MLX backend available for optimal performance
NVIDIA/AMD GPU: Use vLLM or Ollama backend
CPU Only: Use smaller models or Hugging Face cloud backend

Verification

# Test installation
python api.py --port 8002
# Visit http://localhost:8002/api/v1/sparrow-llm/docs

📚 Examples

🏦 Bank Statement Processing

Bank Statement

# Extract all data from bank statement
./sparrow.sh "*" \
  --pipeline "sparrow-parse" \
  --options mlx \
  --options mlx-community/Qwen2.5-VL-72B-Instruct-4bit \
  --file-path "data/bank_statement.pdf"

📄 View Complete JSON Output

{
  "bank": "First Platypus Bank",
  "address": "1234 Kings St., New York, NY 12123",
  "account_holder": "Mary G. Orta",
  "account_number": "1234567890123",
  "statement_date": "3/1/2022",
  "period_covered": "2/1/2022 - 3/1/2022",
  "account_summary": {
    "balance_on_march_1": "$25,032.23",
    "total_money_in": "$10,234.23",
    "total_money_out": "$10,532.51"
  
  },
  "transactions": [
    {
      "date": "02/01",
      "description": "PGD EasyPay Debit",
      "withdrawal": "203.24",
      "deposit": "",
      "balance": "22,098.23"
    },
    {
      "date": "02/02",
      "description": "AB&B Online Payment*****",
      "withdrawal": "71.23",
      "deposit": "",
      "balance": "22,027.00"
    },
    {
      "date": "02/04",
      "description": "Check No. 2345",
      "withdrawal": "",
      "deposit": "450.00",
      "balance": "22,477.00"
    },
    {
      "date": "02/05",
      "description": "Payroll Direct Dep 23422342 Giants",
      "withdrawal": "",
      "deposit": "2,534.65",
      "balance": "25,011.65"
    },
    {
      "date": "02/06",
      "description": "Signature POS Debit - TJP",
      "withdrawal": "84.50",
      "deposit": "",
      "balance": "24,927.15"
    },
    {
      "date": "02/07",
      "description": "Check No. 234",
      "withdrawal": "1,400.00",
      "deposit": "",
      "balance": "23,527.15"
    },
    {
      "date": "02/08",
      "description": "Check No. 342",
      "withdrawal": "",
      "deposit": "25.00",
      "balance": "23,552.15"
    },
    {
      "date": "02/09",
      "description": "FPB AutoPay***** Credit Card",
      "withdrawal": "456.02",
      "deposit": "",
      "balance": "23,096.13"
    },
    {
      "date": "02/08",
      "description": "Check No. 123",
      "withdrawal": "",
      "deposit": "25.00",
      "balance": "23,552.15"
    },
    {
      "date": "02/09",
      "description": "FPB AutoPay***** Credit Card",
      "withdrawal": "156.02",
      "deposit": "",
      "balance": "23,096.13"
    },
    {
      "date": "02/08",
      "description": "Cash Deposit",
      "withdrawal": "",
      "deposit": "25.00",
      "balance": "23,552.15"
    }
  ],
  "valid": "true"
}
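Amounts in this output arrive as formatted strings, so downstream code usually has to normalize them. The sketch below is one possible approach and not part of Sparrow: `parse_amount` and `balance_mismatches` are hypothetical helpers, and the data is a short excerpt of the transactions above.

```python
from decimal import Decimal
from typing import Optional


def parse_amount(text: str) -> Optional[Decimal]:
    """Convert '$25,032.23' or '203.24' to Decimal; empty strings become None."""
    cleaned = text.replace("$", "").replace(",", "").strip()
    return Decimal(cleaned) if cleaned else None


def balance_mismatches(opening_balance: Decimal, rows: list) -> list:
    """Return indexes of rows whose balance does not follow from the previous row."""
    mismatches = []
    previous = opening_balance
    for index, row in enumerate(rows):
        withdrawal = parse_amount(row["withdrawal"]) or Decimal("0")
        deposit = parse_amount(row["deposit"]) or Decimal("0")
        balance = parse_amount(row["balance"])
        if previous - withdrawal + deposit != balance:
            mismatches.append(index)
        previous = balance
    return mismatches


# Excerpt: the balance after the 02/01 row, then the 02/02 and 02/04 rows.
opening = parse_amount("22,098.23")
transactions = [
    {"withdrawal": "71.23", "deposit": "", "balance": "22,027.00"},
    {"withdrawal": "", "deposit": "450.00", "balance": "22,477.00"},
]
print(balance_mismatches(opening, transactions))
```

A running-balance check like this is a cheap way to flag rows that deserve a manual review.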

📊 Financial Tables

Bonds Table

# Extract structured data from financial table
./sparrow.sh '[{"instrument_name":"str", "valuation":0}]' \
  --pipeline "sparrow-parse" \
  --options mlx \
  --options mlx-community/Qwen2.5-VL-72B-Instruct-4bit \
  --file-path "data/bonds_table.png"

📄 View JSON Output

{
  "data": [
    {
      "instrument_name": "UNITS BLACKROCK FIX INC DUB FDS PLC ISHS EUR INV GRD CP BD IDX/INST/E",
      "valuation": 19049
    },
    {
      "instrument_name": "UNITS ISHARES III PLC CORE EUR GOVT BOND UCITS ETF/EUR",
      "valuation": 83488
    },
    {
      "instrument_name": "UNITS ISHARES III PLC EUR CORP BOND 1-5YR UCITS ETF/EUR",
      "valuation": 213030
    },
    {
      "instrument_name": "UNIT ISHARES VI PLC/JP MORGAN USD E BOND EUR HED UCITS ETF DIST/HDGD/",
      "valuation": 32774
    },
    {
      "instrument_name": "UNITS XTRACKERS II SICAV/EUR HY CORP BOND UCITS ETF/-1D-/DISTR.",
      "valuation": 23643
    }
  ],
  "valid": "true"
}

🧾 Invoice Processing

# Extract invoice with cropping for better accuracy
./sparrow.sh "*" \
  --pipeline "sparrow-parse" \
  --options mlx \
  --options mlx-community/Qwen2.5-VL-72B-Instruct-4bit \
  --crop-size 60 \
  --file-path "data/invoice.pdf"

📄 View Complete JSON Output

{
  "invoice_number": "61356291",
  "date_of_issue": "09/06/2012",
  "seller": {
    "name": "Chapman, Kim and Green",
    "address": "64731 James Branch, Smithmouth, NC 26872",
    "tax_id": "949-84-9105",
    "iban": "GB50ACIE59715038217063"
  },
  "client": {
    "name": "Rodriguez-Stevens",
    "address": "2280 Angela Plain, Hortonshire, MS 93248",
    "tax_id": "939-98-8477"
  },
  "items": [
    {
      "description": "Wine Glasses Goblets Pair Clear",
      "quantity": 5,
      "unit": "each",
      "net_price": 12.0,
      "net_worth": 60.0,
      "vat_percentage": 10,
      "gross_worth": 66.0
    },
    {
      "description": "With Hooks Stemware Storage Multiple Uses Iron Wine Rack Hanging",
      "quantity": 4,
      "unit": "each", 
      "net_price": 28.08,
      "net_worth": 112.32,
      "vat_percentage": 10,
      "gross_worth": 123.55
    },
    {
      "description": "Replacement Corkscrew Parts Spiral Worm Wine Opener Bottle Houdini",
      "quantity": 1,
      "unit": "each",
      "net_price": 7.5,
      "net_worth": 7.5,
      "vat_percentage": 10,
      "gross_worth": 8.25
    },
    {
      "description": "HOME ESSENTIALS GRADIENT STEMLESS WINE GLASSES SET OF 4 20 FL OZ (591 ml) NEW",
      "quantity": 1,
      "unit": "each",
      "net_price": 12.99,
      "net_worth": 12.99,
      "vat_percentage": 10,
      "gross_worth": 14.29
    }
  ],
  "summary": {
    "total_net_worth": 192.81,
    "total_vat": 19.28,
    "total_gross_worth": 212.09
  }
}
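Because the invoice output carries both line items and a summary, the totals can be cross-checked after extraction. The following is a minimal sketch, assuming the field names from the sample above; `totals_consistent` is a hypothetical helper, and only the numeric fields of the sample are reproduced.

```python
from decimal import Decimal

# Numeric fields copied from the sample invoice output above.
invoice = {
    "items": [
        {"net_worth": 60.0, "gross_worth": 66.0},
        {"net_worth": 112.32, "gross_worth": 123.55},
        {"net_worth": 7.5, "gross_worth": 8.25},
        {"net_worth": 12.99, "gross_worth": 14.29},
    ],
    "summary": {
        "total_net_worth": 192.81,
        "total_vat": 19.28,
        "total_gross_worth": 212.09,
    },
}


def to_decimal(value) -> Decimal:
    """Go through str() so floats such as 112.32 keep their printed value."""
    return Decimal(str(value))


def totals_consistent(document: dict) -> bool:
    """Check that item sums match the summary and that net + VAT equals gross."""
    zero = Decimal("0")
    net = sum((to_decimal(item["net_worth"]) for item in document["items"]), zero)
    gross = sum((to_decimal(item["gross_worth"]) for item in document["items"]), zero)
    summary = document["summary"]
    total_net = to_decimal(summary["total_net_worth"])
    total_vat = to_decimal(summary["total_vat"])
    total_gross = to_decimal(summary["total_gross_worth"])
    return net == total_net and gross == total_gross and total_net + total_vat == total_gross


print(totals_consistent(invoice))
```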

📄 Multi-page PDF Processing

# Process multi-page PDF with structured output per page
./sparrow.sh '{"table": [{"description": "str", "latest_amount": 0, "previous_amount": 0}]}' \
  --pipeline "sparrow-parse" \
  --options mlx \
  --options mlx-community/Qwen2.5-VL-72B-Instruct-4bit \
  --file-path "data/financial_report.pdf" \
  --debug-dir "debug/"

📄 View JSON Output

[
    {
        "table": [
            {
                "description": "Revenues",
                "latest_amount": 12453,
                "previous_amount": 11445
            },
            {
                "description": "Operating expenses",
                "latest_amount": 9157,
                "previous_amount": 8822
            }
        ],
        "valid": "true",
        "page": 1
    },
    {
        "table": [
            {
                "description": "Revenues", 
                "latest_amount": 12453,
                "previous_amount": 11445
            },
            {
                "description": "Operating expenses",
                "latest_amount": 9157,
                "previous_amount": 8822
            }
        ],
        "valid": "true",
        "page": 2
    }
]
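The multi-page output above is a list with one entry per page. The sketch below flattens such a list into rows tagged with their page number. It is an illustration only: `rows_with_page` is a hypothetical helper, and the input is a trimmed copy of the sample.

```python
import json

# Trimmed copy of the per-page output shown above (one table row per page).
raw_pages = """
[
  {"table": [{"description": "Revenues", "latest_amount": 12453, "previous_amount": 11445}],
   "valid": "true", "page": 1},
  {"table": [{"description": "Operating expenses", "latest_amount": 9157, "previous_amount": 8822}],
   "valid": "true", "page": 2}
]
"""


def rows_with_page(raw_json: str) -> list:
    """Flatten per-page tables into one list, keeping the source page number."""
    rows = []
    for page in json.loads(raw_json):
        # Skip pages that are not marked valid (the string "true" in the sample).
        if page.get("valid") != "true":
            continue
        for row in page["table"]:
            rows.append({"page": page["page"], **row})
    return rows


rows = rows_with_page(raw_pages)
for row in rows:
    print(row["page"], row["description"], row["latest_amount"])
```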

💬 Text Instruction Processing

# Instruction-based processing
./sparrow.sh "instruction: do arithmetic operation, payload: 2+2=" \
  --pipeline "sparrow-instructor" \
  --options mlx \
  --options lmstudio-community/Mistral-Small-3.2-24B-Instruct-2506-8bit

# Instruction processing with document input
./sparrow.sh "check if business entity Chapman, Kim and Green is invoice issuing party" \
  --pipeline "sparrow-parse" \
  --instruction \
  --options mlx --options lmstudio-community/Mistral-Small-3.2-24B-Instruct-2506-8bit \
  --file-path "invoice_1.jpg"

JSON Output:

The result of 2 + 2 is:

4

📈 Stock Data Function Calling

# Function calling example
./sparrow.sh assistant --pipeline "stocks" --query "Oracle"

JSON Output:

{
  "company": "Oracle Corporation",
  "ticker": "ORCL"
}

Additional Output:

The stock price of the Oracle Corporation is 186.3699951171875. USD

🧾 Table/Form Processing with Sparrow Template

./sparrow.sh "*" --pipeline "sparrow-parse" \
  --debug --table --table-template "sparrow_generic_table" \
  --options mlx --options mlx-community/Qwen3.6-35B-A3B-8bit \
  --options mlx --options mlx-community/dots.ocr-bf16 --file-path "data/well_report.jpg"   

🧾 Query Hints

./sparrow.sh "[{\"instrument_name\":\"str\", \"valuation\":\"int\"}]" \
  --pipeline "sparrow-parse" --debug --options mlx \
  --options mlx-community/gemma-4-31b-it-8bit \
  --file-path "data/bonds_table.png" --hints-file-path "data/llm_hints_eu.json"  

💻 CLI Usage

Basic Syntax

./sparrow.sh "<JSON_SCHEMA>" --pipeline "<PIPELINE>" [OPTIONS] --file-path "<FILE>"
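When the CLI is driven from a script, building the argument list in code avoids shell quoting mistakes around the JSON schema. The sketch below is a hypothetical wrapper and not part of Sparrow: `build_sparrow_command` is an assumed name, and it mirrors the examples in this README, which pass each option value with its own --options flag.

```python
import shlex


def build_sparrow_command(query, pipeline, options, file_path=None, script="./sparrow.sh"):
    """Assemble the argument list for sparrow.sh (hypothetical helper)."""
    command = [script, query, "--pipeline", pipeline]
    for option in options:
        # Each option value gets its own --options flag, as in the README examples.
        command += ["--options", option]
    if file_path is not None:
        command += ["--file-path", file_path]
    return command


command = build_sparrow_command(
    '[{"instrument_name":"str", "valuation":0}]',
    "sparrow-parse",
    ["mlx", "mlx-community/Qwen2.5-VL-72B-Instruct-4bit"],
    file_path="data/bonds_table.png",
)
# Print a copy-pasteable shell command; pass `command` to subprocess.run to execute it.
print(shlex.join(command))
```

Passing the list to `subprocess.run(command, check=True)` runs the CLI without going through a shell, so the schema needs no extra escaping.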

Command Line Arguments

| Argument | Type | Description | Example |
|---|---|---|---|
| query | JSON/String | Schema or instruction | '[{"field":"str"}]' |
| --pipeline | String | Pipeline to use | sparrow-parse |
| --file-path | Path | Input document | data/invoice.pdf |
| --hints-file-path | Path | Query hints | data/hints.json |
| --options | String | Backend configuration | mlx,model-name |
| --instruction | Boolean | Sparrow query will be used as instruction | --instruction |
| --validation | Boolean | Sparrow query will be used for field validation | --validation |
| --markdown | Boolean | Markdown pre-processing | --markdown |
| --ocr | Boolean | Experimental functionality | --ocr |
| --table | Boolean | Experimental functionality | --table |
| --table-template | String | Experimental functionality | --name |
| --crop-size | Integer | Border cropping pixels | 60 |
| --page-type | String | Page classification | financial_table |
| --debug | Boolean | Enable debug mode | --debug |
| --debug-dir | Path | Debug output folder | ./debug/ |

Pipeline Options

Sparrow Parse (Vision LLM)

# MLX Backend (Apple Silicon)
./sparrow.sh '[{"instrument_name":"str", "valuation":0}]' \
  --pipeline "sparrow-parse" \
  --options mlx \
  --options mlx-community/Qwen3.6-35B-A3B-8bit \
  --file-path "data/bonds_table.png"

# Hugging Face Cloud GPU
--options huggingface --options your-space/model-name

# Additional flags
--options tables_only        # Extract only tables
--options validation_off     # Disable schema validation
--options apply_annotation   # Include bounding boxes
--page-type financial_table  # Classify page type

Sparrow Instructor (Text LLM)

# Instruction-based processing
./sparrow.sh "instruction: do arithmetic operation, payload: 2+2=" \
  --pipeline "sparrow-instructor" \
  --options mlx \
  --options lmstudio-community/Mistral-Small-3.2-24B-Instruct-2506-8bit

Advanced Examples

# Multi-page PDF with page classification
./sparrow.sh "*" \
  --page-type invoice \
  --page-type table \
  --pipeline "sparrow-parse" \
  --options mlx \
  --options mlx-community/Qwen3.6-35B-A3B-8bit \
  --file-path "multi_page.pdf"

# Handle missing fields with null values
./sparrow.sh '[{"required_field":"str", "optional_field":"str or null"}]' \
  --pipeline "sparrow-parse" \
  --options mlx \
  --options mlx-community/Qwen3.6-35B-A3B-8bit \
  --file-path "document.png"

# Table extraction with cropping
./sparrow.sh '*' \
  --pipeline "sparrow-parse" \
  --options mlx \
  --options mlx-community/Qwen3.6-35B-A3B-8bit \
  --options tables_only \
  --crop-size 100 \
  --file-path "scan.pdf"

# Instruction execution
./sparrow.sh "check if business entity Chapman, Kim and Green is invoice issuing party" \
  --pipeline "sparrow-parse" \
  --instruction \
  --options mlx --options lmstudio-community/Mistral-Small-3.2-24B-Instruct-2506-8bit \
  --file-path "invoice_1.jpg"

# Field validation
./sparrow.sh "tax_id,shipment_code,total_gross_worth" \
  --pipeline "sparrow-parse" \
  --validation \
  --options mlx --options lmstudio-community/Mistral-Small-3.2-24B-Instruct-2506-8bit \
  --file-path "invoice_1.jpg"

{
  "tax_id": true,
  "shipment_code": false,
  "total_gross_worth": true
}
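The validation response above maps each requested field to a boolean. A short Python sketch can turn it into a list of fields to follow up on. The helper `failed_fields` is hypothetical, and the sketch assumes that false means the field did not pass validation.

```python
import json

# Copy of the validation response shown above.
raw_validation = '{"tax_id": true, "shipment_code": false, "total_gross_worth": true}'


def failed_fields(raw_json: str) -> list:
    """Return field names whose validation flag is not true (assumed meaning: failed)."""
    result = json.loads(raw_json)
    return [name for name, passed in result.items() if passed is not True]


print(failed_fields(raw_validation))
```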

🌐 API Usage

Starting the Server

# Default port (8002)
python api.py

# Custom port
python api.py --port 8001

# Multiple instances
python api.py --port 8002 &  # Sparrow Parse
python api.py --port 8003 &  # Instructor

API Endpoints

Document Extraction (/inference)

curl -X POST 'http://localhost:8002/api/v1/sparrow-llm/inference' \
  -H 'Content-Type: multipart/form-data' \
  -F 'query=[{"field_name":"str", "amount":0}]' \
  -F 'pipeline=sparrow-parse' \
  -F 'options=mlx,mlx-community/Qwen2.5-VL-72B-Instruct-4bit' \
  -F 'file=@document.pdf'
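The same request can be sent from Python. This is a sketch under assumptions: it uses the third-party `requests` package (installed separately), the helper names `build_inference_fields` and `run_inference` are hypothetical, and the response is assumed to be JSON. The form field names come from the curl command above.

```python
import json

BASE_URL = "http://localhost:8002/api/v1/sparrow-llm"


def build_inference_fields(schema, pipeline, backend, model):
    """Build the form fields used by the curl example (hypothetical helper)."""
    return {
        "query": json.dumps(schema),      # the JSON schema travels as a string
        "pipeline": pipeline,
        "options": f"{backend},{model}",  # comma-separated, as in the curl example
    }


def run_inference(file_path, fields, base_url=BASE_URL, timeout=600):
    """POST a document to /inference as multipart form data."""
    import requests  # third-party dependency: pip install requests

    with open(file_path, "rb") as handle:
        response = requests.post(
            f"{base_url}/inference",
            data=fields,
            files={"file": handle},
            timeout=timeout,
        )
    response.raise_for_status()
    return response.json()  # assumption: the endpoint answers with JSON


fields = build_inference_fields(
    [{"field_name": "str", "amount": 0}],
    "sparrow-parse",
    "mlx",
    "mlx-community/Qwen2.5-VL-72B-Instruct-4bit",
)
print(fields)
# With the API server running: result = run_inference("document.pdf", fields)
```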

Text Instructions (/instruction-inference)

curl -X POST 'http://localhost:8002/api/v1/sparrow-llm/instruction-inference' \
  -H 'Content-Type: application/x-www-form-urlencoded' \
  -d 'query=instruction: analyze data, payload: {...}' \
  -d 'pipeline=sparrow-instructor' \
  -d 'options=mlx,mlx-community/Qwen3.6-35B-A3B-8bit'
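A standard-library equivalent of the curl call above might look like the sketch below. The helper names are hypothetical and the response is assumed to be UTF-8 text. Note that `urlencode` percent-encodes characters such as `+`, which a form-decoding server would otherwise read as a space.

```python
from urllib.parse import parse_qs, urlencode
from urllib.request import Request, urlopen

BASE_URL = "http://localhost:8002/api/v1/sparrow-llm"


def build_instruction_request(instruction, payload, model, backend="mlx", base_url=BASE_URL):
    """Build a form-encoded POST for /instruction-inference (hypothetical helper)."""
    body = urlencode({
        "query": f"instruction: {instruction}, payload: {payload}",
        "pipeline": "sparrow-instructor",
        "options": f"{backend},{model}",
    }).encode("utf-8")
    return Request(
        f"{base_url}/instruction-inference",
        data=body,
        headers={"Content-Type": "application/x-www-form-urlencoded"},
        method="POST",
    )


def send(request, timeout=600):
    """Send the request and return the response body as text."""
    with urlopen(request, timeout=timeout) as response:
        return response.read().decode("utf-8")


request = build_instruction_request(
    "do arithmetic operation", "2+2=", "mlx-community/Qwen3.6-35B-A3B-8bit"
)
print(request.full_url)
# With the API server running: print(send(request))
```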

API Documentation

Visit http://localhost:8002/api/v1/sparrow-llm/docs for interactive Swagger documentation.

🤖 Sparrow Agent

Sparrow Agents

Orchestrate complex document processing workflows with visual monitoring powered by Prefect.

Features

  • Multi-step Workflows: Chain classification, extraction, and validation
  • Visual Monitoring: Real-time pipeline tracking
  • Error Handling: Robust failure recovery
  • Extensible: Custom agents for specific use cases

Usage

# Start agent server
cd sparrow-ml/agents
python api.py --port 8001

# Process medical prescriptions
curl -X POST 'http://localhost:8001/api/v1/sparrow-agents/execute/file' \
  -F 'agent_name=medical_prescriptions' \
  -F 'extraction_params={"sparrow_key":"123456"}' \
  -F 'file=@prescription.pdf'

📊 Dashboard

Built-in analytics and monitoring dashboard at sparrow.katanaml.io. The dashboard is part of Sparrow UI and requires a local Oracle Database 23ai Free instance.

Dashboard

Features

  • Usage Analytics: Track API calls, success rates, performance
  • Geographic Distribution: See usage by country
  • Model Performance: Compare performance across models
  • Real-time Monitoring: Live processing statistics

🔧 Pipeline Comparison

| Feature | Sparrow Parse | Sparrow Instructor | Sparrow Agents |
|---|---|---|---|
| Input | Documents + JSON schema | Text instructions | Complex workflows |
| Output | Structured JSON | Free-form text | Multi-step results |
| Use Cases | Data extraction, forms | Summarization, analysis | Enterprise workflows |
| Validation | Schema-based | Manual | Custom rules |
| Complexity | Simple | Medium | High |
| Best For | Invoices, tables, forms | Text processing | Multi-document flows |

When to Use What

Sparrow Parse: Use for structured data extraction from documents
Sparrow Instructor: Use for text analysis, summarization, Q&A
Sparrow Agents: Use for complex multi-step document processing workflows

⚡ Performance Tips

Hardware Optimization

Apple Silicon (MLX)

  • ✅ Best performance with unified memory
  • ✅ Models: Mistral Small 3.2 24B, Qwen3.6 27B Dense, Qwen3.6 35B MoE, Gemma 4 31B Dense, Gemma 4 26B MoE
  • ⚠️ Requires macOS with Apple Silicon

NVIDIA GPU (vLLM)

  • ✅ Production inference via vLLM backend
  • ✅ Models: Mistral Small 3.2 24B full precision (primary), dots.ocr for large table pipelines
  • ✅ Recommended: 96GB VRAM for full precision models
  • ⚠️ Requires CUDA setup

CPU Only

  • ⚠️ Significantly slower
  • ✅ Use smaller models (7B parameters max)
  • ✅ Consider Hugging Face cloud backend

Table Extraction

For large or complex tables, use the dots.ocr → Sparrow Templates pipeline instead of Vision LLM direct extraction:

./sparrow.sh "*" --pipeline "sparrow-parse" \
  --debug --table --table-template "sparrow_generic_table" \
  --options mlx --options mlx-community/Qwen3.6-35B-A3B-8bit \
  --options mlx --options mlx-community/dots.ocr-bf16 --file-path "data/well_report.jpg"

  • dots.ocr: Handles large tables with high accuracy via HTML intermediate output
  • Sparrow Templates: Maps extracted HTML table structure to JSON schema
  • Recommended for financial statements, multi-column invoices, and structured reports

Extraction Hints

Use Sparrow hints to improve accuracy on complex documents: steer model attention to footers and fine print, disambiguate structurally similar fields (e.g., supplier vs. recipient VAT), normalize date and number formats, and resolve priority ordering for ambiguous fields:

./sparrow.sh "[{\"instrument_name\":\"str\", \"valuation\":\"int\"}]" \
  --pipeline "sparrow-parse" --debug --options mlx \
  --options mlx-community/gemma-4-31b-it-8bit \
  --file-path "data/bonds_table.png" --hints-file-path "data/llm_hints_eu.json"

Model Selection

| Use Case | Recommended Model | Backend | Notes |
|---|---|---|---|
| Invoices / Forms (EU) | Mistral Small 3.2 24B | vLLM / MLX | Primary production model |
| Invoices / Forms (US) | Gemma 4 31B Dense | MLX | Strong on English documents |
| Large Tables | dots.ocr | vLLM | Via Sparrow Templates pipeline |
| Quick Testing | Qwen3.6 27B Dense | MLX | Fast, good general accuracy |
| Low Memory | Qwen3.6 35B MoE / Gemma 4 26B MoE | MLX | Reduced memory footprint |

🔍 Troubleshooting

Common Issues

🚫 Installation Problems

Python Version Issues:

# Verify Python version
python --version  # Should be 3.12.10+

# Fix with pyenv
pyenv install 3.12.10
pyenv global 3.12.10

MLX Installation (Apple Silicon):

# If MLX fails to install
pip install --upgrade pip
pip install mlx-vlm --no-cache-dir
# If pip install fails with "AttributeError: 'NoneType' object has no attribute 'get'",
# retry with trusted hosts. POTENTIAL SECURITY RISK: SSL verification is bypassed,
# so apply this only if you know what you are doing.
pip install mlx-vlm --trusted-host pypi.org --trusted-host pypi.python.org --trusted-host files.pythonhosted.org

Poppler Missing:

# macOS
brew install poppler

# Ubuntu/Debian
sudo apt-get install poppler-utils

# Verify installation
pdftoppm -h

🔧 Runtime Issues

Memory Errors:

  • Use smaller or MoE models to reduce VRAM footprint
  • Enable image cropping: --crop-size 100
  • Process single pages instead of entire PDFs

Model Loading Fails:

# Clear model cache
rm -rf ~/.cache/huggingface/
rm -rf ~/.mlx/

# Redownload models
python -c "from mlx_vlm import load; load('model-name')"

API Connection Issues:

# Check if server is running
curl http://localhost:8002/health

# Check logs
python api.py --debug

📄 Document Processing Issues

Poor Extraction Quality:

  • Add extraction hints to steer model attention to problem fields
  • Try image cropping: --crop-size 60
  • Use --table --table-template with dots.ocr for table-heavy documents
  • Ensure image resolution is adequate (300+ DPI)
  • Use schema validation: avoid --options validation_off

PDF Processing Fails:

# Test PDF manually
pdftoppm -png input.pdf output

# Check page count
python -c "
import pypdf
with open('file.pdf', 'rb') as f:
    reader = pypdf.PdfReader(f)
    print(f'Pages: {len(reader.pages)}')
"

JSON Schema Errors:

  • Validate JSON syntax: Use jsonlint.com
  • Use proper field types: "str", 0, 0.0, "str or null"
  • Test with simple schema first
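As a local alternative to an online linter, a query can be checked with Python's `json` module before it is passed to Sparrow. The sketch below only verifies JSON syntax and the top-level shape. `check_query_schema` is a hypothetical helper, and it treats "*" as the wildcard query used in the examples.

```python
import json


def check_query_schema(text: str):
    """Return (ok, message) for a Sparrow query string (hypothetical helper)."""
    if text.strip() == "*":
        return True, "wildcard query"
    try:
        schema = json.loads(text)
    except json.JSONDecodeError as error:
        return False, f"invalid JSON at line {error.lineno}, column {error.colno}: {error.msg}"
    # The README examples use a JSON object or a list of objects as the schema.
    if not isinstance(schema, (dict, list)):
        return False, "schema should be a JSON object or a list of objects"
    return True, "ok"


print(check_query_schema('[{"required_field":"str", "optional_field":"str or null"}]'))
print(check_query_schema("[{'field': 'str'}]"))  # single quotes are not valid JSON
```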

Getting Help

  1. 📖 Check Documentation: Review this README and component docs
  2. 🐛 Search Issues: GitHub Issues
  3. 💬 Create Issue: Provide logs, system info, minimal example
  4. 📧 Commercial Support: abaranovskis@redsamuraiconsulting.com

📜 License

Open Source: Licensed under GPL 3.0. Free for open source projects and organizations under $5M revenue.

Commercial: Dual licensing available for proprietary use, enterprise features, and dedicated support.

Contact: abaranovskis@redsamuraiconsulting.com for commercial licensing and consulting.

⭐ Star us on GitHub if Sparrow is useful for your projects!
github.com/katanaml/sparrow