Firecrawl-Plasmate

April 12, 2026

Use Plasmate as a high-performance backend for Firecrawl-compatible web scraping with 10-100x token compression.

Plasmate converts HTML to a Semantic Object Model (SOM): structured JSON that keeps the meaningful content of a page and strips away the noise. This substantially reduces token usage when feeding web content to LLMs.

Features

  • Drop-in Firecrawl replacement: Same API (scrape_url, crawl_url, map_url)
  • 10-100x token savings: SOM compression vs raw HTML
  • Structured output: JSON that's easy for LLMs to parse
  • MCP support: Works with Claude Code, Cursor, and other MCP clients
  • No API key required: Runs locally using the Plasmate binary

Installation

# Install the Python package
pip install firecrawl-plasmate

# Install Plasmate binary (required)
cargo install plasmate
# Or download from: https://github.com/nickarino/plasmate/releases

Quick Start

from firecrawl_plasmate import PlasmateApp

# Initialize (no API key needed!)
app = PlasmateApp()

# Scrape a URL
response = app.scrape_url("https://example.com")

print(response.markdown)  # Clean markdown
print(response.som)       # Structured JSON (SOM)
print(f"Token savings: {response.token_savings:.1%}")

Migration from Firecrawl

Switching from Firecrawl takes two small changes, the import and the constructor call:

# Before (Firecrawl)
from firecrawl import FirecrawlApp
app = FirecrawlApp(api_key="fc-...")

# After (Plasmate)
from firecrawl_plasmate import PlasmateApp
app = PlasmateApp()  # No API key needed!

Existing scrape_url, crawl_url, and map_url calls keep working, subject to the behavioral differences listed under API Comparison:

# These all work the same way
response = app.scrape_url("https://docs.python.org")
crawl = app.crawl_url("https://example.com", max_depth=2)
urls = app.map_url("https://example.com")

API Comparison

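| Feature          | Firecrawl              | Plasmate                     |
|------------------|------------------------|------------------------------|
| scrape_url()     | Returns markdown, HTML | Returns markdown, HTML, SOM  |
| crawl_url()      | Async with webhooks    | Sync with concurrent workers |
| map_url()        | Sitemap discovery      | Sitemap + crawl discovery    |
| Token usage      | ~1000 tokens/page      | ~50-100 tokens/page          |
| Requires API key | Yes                    | No                           |
| Runs locally     | No                     | Yes                          |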

SOM (Semantic Object Model)

The SOM is Plasmate's structured JSON output that captures the semantic meaning of a page:

{
  "type": "document",
  "title": "Python Documentation",
  "children": [
    {
      "type": "heading",
      "level": 1,
      "content": "Welcome to Python"
    },
    {
      "type": "paragraph",
      "content": "Python is a programming language..."
    },
    {
      "type": "link",
      "content": "Download Python",
      "href": "/downloads/"
    }
  ]
}

Compared to raw HTML:

  • 10-100x fewer tokens
  • Semantic structure (headings, paragraphs, links)
  • No boilerplate (scripts, styles, ads)
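
Because the SOM is a plain tree of typed nodes, it can be walked with a few lines of Python. The sketch below assumes only the node shape shown in the example above (type, content, children, level, href); real SOM output may contain other node types or fields. The helpers iter_nodes, som_to_text, and collect_links are hypothetical and not part of the package.

```python
# Sample SOM copied from the example above.
SAMPLE_SOM = {
    "type": "document",
    "title": "Python Documentation",
    "children": [
        {"type": "heading", "level": 1, "content": "Welcome to Python"},
        {"type": "paragraph", "content": "Python is a programming language..."},
        {"type": "link", "content": "Download Python", "href": "/downloads/"},
    ],
}


def iter_nodes(node):
    """Yield a node and all of its descendants, depth first."""
    yield node
    for child in node.get("children", []):
        yield from iter_nodes(child)


def som_to_text(som):
    """Render headings, paragraphs, and links as simple text lines."""
    lines = []
    for node in iter_nodes(som):
        kind = node.get("type")
        content = node.get("content", "")
        if kind == "heading":
            lines.append("#" * node.get("level", 1) + " " + content)
        elif kind == "paragraph":
            lines.append(content)
        elif kind == "link":
            lines.append(f"[{content}]({node.get('href', '')})")
    return "\n".join(lines)


def collect_links(som):
    """Return the href of every link node that has one."""
    return [
        node["href"]
        for node in iter_nodes(som)
        if node.get("type") == "link" and "href" in node
    ]


print(som_to_text(SAMPLE_SOM))
print(collect_links(SAMPLE_SOM))
```

With a real response, the same helpers would presumably be called on response.som.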

Response Object

response = app.scrape_url("https://example.com")

# Firecrawl-compatible fields
response.success      # bool
response.url          # str
response.markdown     # str
response.html         # str
response.links        # list[str]
response.metadata     # dict

# Plasmate-specific fields
response.som              # dict - Semantic Object Model
response.token_savings    # float - e.g., 0.95 = 95% savings
response.raw_html_tokens  # int - estimated HTML tokens
response.som_tokens       # int - actual SOM tokens

Crawling

# Crawl with depth limit
crawl = app.crawl_url(
    "https://docs.python.org",
    max_depth=2,
    limit=50,
    include_patterns=[r"/library/.*"],
    exclude_patterns=[r"/3\.\d+/"],
)

print(f"Crawled {crawl.completed} pages")
print(f"Total token savings: {crawl.total_token_savings:.1%}")

for page in crawl.pages:
    print(f"  {page.url}: {page.som_tokens} tokens")
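
The include_patterns and exclude_patterns arguments are regular expressions. How the crawler applies them is not spelled out here, so the sketch below assumes a common convention: a URL is kept if it matches no exclude pattern and, when include patterns are given, matches at least one of them (using re.search). The url_allowed helper is hypothetical and is only meant for trying patterns out before starting a crawl.

```python
import re


def url_allowed(url, include_patterns=(), exclude_patterns=()):
    """Assumed filter semantics: excludes win, includes are optional."""
    if any(re.search(pattern, url) for pattern in exclude_patterns):
        return False
    if include_patterns:
        return any(re.search(pattern, url) for pattern in include_patterns)
    return True


# Same patterns as in the crawl example above.
include = [r"/library/.*"]
exclude = [r"/3\.\d+/"]

candidates = [
    "https://docs.python.org/3/library/asyncio.html",     # in /library/, unversioned
    "https://docs.python.org/3.11/library/asyncio.html",  # versioned path, excluded
    "https://docs.python.org/3/tutorial/index.html",      # not under /library/
]

kept = [url for url in candidates if url_allowed(url, include, exclude)]
print(kept)
```

Under these assumed semantics only the first candidate survives, which is a quick way to confirm that the exclude pattern targets versioned paths such as /3.11/ without also dropping /3/.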

URL Mapping

# Discover all URLs on a site
urls = app.map_url(
    "https://example.com",
    search="blog",  # Filter URLs containing "blog"
    limit=1000,
)

print(f"Found {len(urls.links)} URLs")
if urls.sitemap_url:
    print(f"Sitemap: {urls.sitemap_url}")

MCP Server

Use Plasmate with Claude Code, Cursor, or other MCP clients:

# Run the MCP server
firecrawl-plasmate-mcp

# Or with Python
python -m firecrawl_plasmate.mcp

Add to your MCP config (e.g., ~/.claude/settings.json):

{
  "mcpServers": {
    "firecrawl-plasmate": {
      "command": "firecrawl-plasmate-mcp"
    }
  }
}
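
If the config file already holds other settings, the entry can be merged in rather than pasted by hand. The following is a minimal sketch using only the standard library; add_mcp_server is a hypothetical helper, and the right config path depends on the MCP client, so check its documentation before pointing the helper at a real file.

```python
import json
import tempfile
from pathlib import Path


def add_mcp_server(config_path, name="firecrawl-plasmate",
                   command="firecrawl-plasmate-mcp"):
    """Add or update one entry under "mcpServers", keeping other settings."""
    path = Path(config_path).expanduser()
    config = {}
    if path.exists():
        text = path.read_text(encoding="utf-8")
        if text.strip():
            config = json.loads(text)
    config.setdefault("mcpServers", {})[name] = {"command": command}
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(config, indent=2) + "\n", encoding="utf-8")
    return config


# Demo against a throwaway file so no real settings are touched.
# The "theme" key is a made-up placeholder for pre-existing settings.
with tempfile.TemporaryDirectory() as tmp:
    demo_path = Path(tmp) / "settings.json"
    demo_path.write_text('{"theme": "dark"}', encoding="utf-8")
    demo_config = add_mcp_server(demo_path)
    written = json.loads(demo_path.read_text(encoding="utf-8"))

print(json.dumps(demo_config, indent=2))
```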

MCP Tools

  • scrape: Scrape a single URL
  • crawl: Crawl a website with depth limit
  • map: Discover URLs on a website

Token Savings Examples

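| Site              | HTML Tokens | SOM Tokens | Savings |
|-------------------|-------------|------------|---------|
| Wikipedia article | 45,000      | 2,100      | 95%     |
| Product page      | 12,000      | 450        | 96%     |
| Documentation     | 8,000       | 620        | 92%     |
| Blog post         | 5,000       | 380        | 92%     |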
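
Savings percentages and the "10-100x" compression factor are two views of the same ratio. The sketch below assumes that savings are computed as 1 - som_tokens / html_tokens, which is consistent with the figures in the table but is not confirmed from the package source. The function names are illustrative.

```python
# (site, HTML tokens, SOM tokens) taken from the table above.
ROWS = [
    ("Wikipedia article", 45_000, 2_100),
    ("Product page", 12_000, 450),
    ("Documentation", 8_000, 620),
    ("Blog post", 5_000, 380),
]


def token_savings(html_tokens, som_tokens):
    """Fraction of tokens removed, e.g. 0.95 means 95% fewer tokens."""
    return 1 - som_tokens / html_tokens


def compression_factor(html_tokens, som_tokens):
    """How many times smaller the SOM is than the raw HTML."""
    return html_tokens / som_tokens


for site, html_tokens, som_tokens in ROWS:
    savings = token_savings(html_tokens, som_tokens)
    factor = compression_factor(html_tokens, som_tokens)
    print(f"{site:<18} {savings:6.1%} {factor:5.1f}x")
```

For these four rows the factor falls between roughly 13x and 27x, inside the advertised 10-100x range.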

Configuration

# Custom Plasmate binary path
app = PlasmateApp(plasmate_path="/usr/local/bin/plasmate")

# With custom headers
response = app.scrape_url(
    "https://example.com",
    headers={"Authorization": "Bearer token123"}
)
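
When the binary is not on PATH, it can be located first and then passed through plasmate_path. This sketch uses only the standard library; find_binary is a hypothetical helper, and the ~/.cargo/bin fallback assumes a default cargo install location.

```python
import shutil
from pathlib import Path


def find_binary(name="plasmate", extra_dirs=("~/.cargo/bin",)):
    """Return the path to the binary, or None if it cannot be found."""
    found = shutil.which(name)  # searches the directories on PATH
    if found:
        return found
    for directory in extra_dirs:
        candidate = Path(directory).expanduser() / name
        if candidate.is_file():
            return str(candidate)
    return None


path = find_binary()
if path is None:
    print("plasmate not found; install it with: cargo install plasmate")
else:
    print(f"Using plasmate at {path}")
    # app = PlasmateApp(plasmate_path=path)
```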

Async Support

import asyncio
from firecrawl_plasmate import PlasmateApp
from firecrawl_plasmate.crawler import crawl_async

async def main():
    app = PlasmateApp()
    result = await crawl_async(
        app,
        "https://example.com",
        max_depth=2,
        max_concurrent=10,
    )
    print(f"Crawled {result.completed} pages")

asyncio.run(main())
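
The max_concurrent argument presumably caps how many pages are fetched at once. The package's internals are not shown here, so the following is a generic sketch of that pattern with asyncio.Semaphore; fake_fetch is a stand-in coroutine that sleeps instead of making network calls, and fetch_all is a hypothetical name.

```python
import asyncio


async def fake_fetch(url):
    """Stand-in for a real page fetch; only simulates latency."""
    await asyncio.sleep(0.01)
    return f"content of {url}"


async def fetch_all(urls, max_concurrent=10):
    """Fetch all URLs with at most max_concurrent fetches in flight."""
    semaphore = asyncio.Semaphore(max_concurrent)
    in_flight = 0
    peak = 0

    async def worker(url):
        nonlocal in_flight, peak
        async with semaphore:  # waits here once the cap is reached
            in_flight += 1
            peak = max(peak, in_flight)
            try:
                return await fake_fetch(url)
            finally:
                in_flight -= 1

    # gather preserves input order in its results.
    results = await asyncio.gather(*(worker(url) for url in urls))
    return results, peak


urls = [f"https://example.com/page/{i}" for i in range(25)]
results, peak = asyncio.run(fetch_all(urls, max_concurrent=5))
print(f"Fetched {len(results)} pages, peak concurrency {peak}")
```

A semaphore keeps the code simple because every URL gets its own task and the cap is enforced at the point of the fetch; a fixed pool of workers reading from a queue would be an equally valid design.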

Requirements

  • Python 3.10+
  • Plasmate binary (Rust)
    • Install: cargo install plasmate
    • Or download from releases

License

MIT