Firecrawl-Plasmate
April 12, 2026
Use Plasmate as a high-performance backend for Firecrawl-compatible web scraping with 10-100x token compression.
Plasmate converts HTML to a Semantic Object Model (SOM) - structured JSON that captures the meaningful content while stripping away the noise. This dramatically reduces token usage when feeding web content to LLMs.
Features
- Drop-in Firecrawl replacement: Same API (scrape_url, crawl_url, map_url)
- 10-100x token savings: SOM compression vs raw HTML
- Structured output: JSON that's easy for LLMs to parse
- MCP support: Works with Claude Code, Cursor, and other MCP clients
- No API key required: Runs locally using the Plasmate binary
Installation
# Install the Python package
pip install firecrawl-plasmate
# Install Plasmate binary (required)
cargo install plasmate
# Or download from: https://github.com/nickarino/plasmate/releases
Quick Start
from firecrawl_plasmate import PlasmateApp
# Initialize (no API key needed!)
app = PlasmateApp()
# Scrape a URL
response = app.scrape_url("https://example.com")
print(response.markdown) # Clean markdown
print(response.som) # Structured JSON (SOM)
print(f"Token savings: {response.token_savings:.1%}")
Migration from Firecrawl
Switching from Firecrawl only requires changing the import and the constructor call:
# Before (Firecrawl)
from firecrawl import FirecrawlApp
app = FirecrawlApp(api_key="fc-...")
# After (Plasmate)
from firecrawl_plasmate import PlasmateApp
app = PlasmateApp() # No API key needed!
All existing code continues to work:
# These all work the same way
response = app.scrape_url("https://docs.python.org")
crawl = app.crawl_url("https://example.com", max_depth=2)
urls = app.map_url("https://example.com")
API Comparison
| Feature | Firecrawl | Plasmate |
|---|---|---|
| scrape_url() | Returns markdown, HTML | Returns markdown, HTML, SOM |
| crawl_url() | Async with webhooks | Sync with concurrent workers |
| map_url() | Sitemap discovery | Sitemap + crawl discovery |
| Token usage | ~1000 tokens/page | ~50-100 tokens/page |
| Requires API key | Yes | No |
| Runs locally | No | Yes |
SOM (Semantic Object Model)
The SOM is Plasmate's structured JSON output that captures the semantic meaning of a page:
{
  "type": "document",
  "title": "Python Documentation",
  "children": [
    {
      "type": "heading",
      "level": 1,
      "content": "Welcome to Python"
    },
    {
      "type": "paragraph",
      "content": "Python is a programming language..."
    },
    {
      "type": "link",
      "content": "Download Python",
      "href": "/downloads/"
    }
  ]
}
Compared to raw HTML:
- 10-100x fewer tokens
- Semantic structure (headings, paragraphs, links)
- No boilerplate (scripts, styles, ads)
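Because the SOM is plain JSON, it can be traversed with the standard library alone. The following is a minimal sketch, assuming nodes carry only the fields shown in the example above (type, title, content, level, href, children); the real schema may include more node types. The helper names som_to_text and som_links are hypothetical and not part of firecrawl-plasmate.

```python
import json

# Sample SOM copied from the example above.
SOM_JSON = """
{
  "type": "document",
  "title": "Python Documentation",
  "children": [
    {"type": "heading", "level": 1, "content": "Welcome to Python"},
    {"type": "paragraph", "content": "Python is a programming language..."},
    {"type": "link", "content": "Download Python", "href": "/downloads/"}
  ]
}
"""


def som_to_text(node):
    """Render a SOM node and its children as a list of plain-text lines."""
    lines = []
    kind = node.get("type")
    if kind == "document":
        lines.append(node.get("title", ""))
    elif kind == "heading":
        lines.append("#" * node.get("level", 1) + " " + node.get("content", ""))
    elif kind == "link":
        lines.append(f"[{node.get('content', '')}]({node.get('href', '')})")
    elif "content" in node:
        # Fallback for paragraphs and any node type not handled above.
        lines.append(node["content"])
    for child in node.get("children", []):
        lines.extend(som_to_text(child))
    return lines


def som_links(node):
    """Collect every href found on link nodes, depth first."""
    hrefs = [node["href"]] if node.get("type") == "link" and "href" in node else []
    for child in node.get("children", []):
        hrefs.extend(som_links(child))
    return hrefs


som = json.loads(SOM_JSON)
text_lines = som_to_text(som)
links = som_links(som)
print("\n".join(text_lines))
print(links)
```

With the real package, the same helpers could presumably be applied to response.som, since the Response Object section describes that field as a dict.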
Response Object
response = app.scrape_url("https://example.com")
# Firecrawl-compatible fields
response.success # bool
response.url # str
response.markdown # str
response.html # str
response.links # list[str]
response.metadata # dict
# Plasmate-specific fields
response.som # dict - Semantic Object Model
response.token_savings # float - e.g., 0.95 = 95% savings
response.raw_html_tokens # int - estimated HTML tokens
response.som_tokens # int - actual SOM tokens
Crawling
# Crawl with depth limit
crawl = app.crawl_url(
    "https://docs.python.org",
    max_depth=2,
    limit=50,
    include_patterns=[r"/library/.*"],
    exclude_patterns=[r"/3\.\d+/"],
)
print(f"Crawled {crawl.completed} pages")
print(f"Total token savings: {crawl.total_token_savings:.1%}")
for page in crawl.pages:
    print(f"  {page.url}: {page.som_tokens} tokens")
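To make the include_patterns and exclude_patterns arguments concrete, here is a rough sketch of how such regex filters are commonly applied to candidate URLs. It assumes the patterns are matched anywhere in the URL with re.search and that an exclude match always wins; the README does not state how Plasmate evaluates them. The function url_allowed is hypothetical.

```python
import re


def url_allowed(url, include_patterns=(), exclude_patterns=()):
    """Return True if url passes the filters.

    A URL is rejected if any exclude pattern matches. When include
    patterns are given, at least one of them must match.
    """
    if any(re.search(p, url) for p in exclude_patterns):
        return False
    if include_patterns:
        return any(re.search(p, url) for p in include_patterns)
    return True


# Same patterns as the crawl example above.
include = [r"/library/.*"]
exclude = [r"/3\.\d+/"]

candidates = [
    "https://docs.python.org/library/json.html",       # matches include
    "https://docs.python.org/3.11/library/json.html",  # hits the exclude rule
    "https://docs.python.org/tutorial/index.html",     # matches no include
]

kept = [u for u in candidates if url_allowed(u, include, exclude)]
print(kept)
```

Under these assumptions only the first URL survives: the versioned path is excluded, and the tutorial page matches no include pattern.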
URL Mapping
# Discover all URLs on a site
urls = app.map_url(
    "https://example.com",
    search="blog",  # Filter URLs containing "blog"
    limit=1000,
)
print(f"Found {len(urls.links)} URLs")
if urls.sitemap_url:
    print(f"Sitemap: {urls.sitemap_url}")
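For intuition about the sitemap half of URL discovery, the sketch below parses a standard sitemap.xml document and applies a substring filter and a limit, mirroring the search and limit arguments above. It is an illustration built on the public sitemap format, not Plasmate's own code; the function sitemap_urls and the sample URLs are hypothetical.

```python
import xml.etree.ElementTree as ET

# XML namespace defined by the sitemaps.org protocol.
SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}

# Hypothetical sitemap content used only for this example.
SAMPLE_SITEMAP = """
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://example.com/</loc></url>
  <url><loc>https://example.com/blog/first-post</loc></url>
  <url><loc>https://example.com/blog/second-post</loc></url>
  <url><loc>https://example.com/about</loc></url>
</urlset>
"""


def sitemap_urls(xml_text, search=None, limit=None):
    """Extract <loc> entries, optionally filtered by substring and capped."""
    root = ET.fromstring(xml_text.strip())
    locs = [
        el.text.strip()
        for el in root.findall("sm:url/sm:loc", SITEMAP_NS)
        if el.text
    ]
    if search is not None:
        locs = [u for u in locs if search in u]
    if limit is not None:
        locs = locs[:limit]
    return locs


blog_urls = sitemap_urls(SAMPLE_SITEMAP, search="blog", limit=1000)
print(blog_urls)
```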
MCP Server
Use Plasmate with Claude Code, Cursor, or other MCP clients:
# Run the MCP server
firecrawl-plasmate-mcp
# Or with Python
python -m firecrawl_plasmate.mcp
Add to your MCP config (e.g., ~/.claude/settings.json):
{
  "mcpServers": {
    "firecrawl-plasmate": {
      "command": "firecrawl-plasmate-mcp"
    }
  }
}
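If the settings file already contains other entries, the new server has to be merged in rather than pasted over them. A small sketch of that merge on an in-memory dict follows; the helper add_mcp_server and the pre-existing keys ("other-server", "theme") are hypothetical, and reading or writing the actual file is left out.

```python
import json


def add_mcp_server(settings, name, command):
    """Return a copy of settings with an MCP server entry added or replaced."""
    merged = dict(settings)
    servers = dict(merged.get("mcpServers", {}))
    servers[name] = {"command": command}
    merged["mcpServers"] = servers
    return merged


# Hypothetical existing configuration with one unrelated server.
existing = {
    "theme": "dark",
    "mcpServers": {"other-server": {"command": "other-mcp"}},
}

updated = add_mcp_server(existing, "firecrawl-plasmate", "firecrawl-plasmate-mcp")
print(json.dumps(updated, indent=2))
```

The helper copies the nested mcpServers dict before editing it, so the original settings object is left unchanged.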
MCP Tools
- scrape: Scrape a single URL
- crawl: Crawl a website with depth limit
- map: Discover URLs on a website
Token Savings Examples
| Site | HTML Tokens | SOM Tokens | Savings |
|---|---|---|---|
| Wikipedia article | 45,000 | 2,100 | 95% |
| Product page | 12,000 | 450 | 96% |
| Documentation | 8,000 | 620 | 92% |
| Blog post | 5,000 | 380 | 92% |
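The Savings column can be reproduced from the two token columns. The sketch below assumes savings is defined as 1 - SOM tokens / HTML tokens, which matches the percentages in the table, and also reports the equivalent compression ratio (HTML tokens / SOM tokens).

```python
# (site, HTML tokens, SOM tokens) taken from the table above.
rows = [
    ("Wikipedia article", 45_000, 2_100),
    ("Product page", 12_000, 450),
    ("Documentation", 8_000, 620),
    ("Blog post", 5_000, 380),
]


def savings(html_tokens, som_tokens):
    """Fraction of tokens removed relative to raw HTML."""
    return 1 - som_tokens / html_tokens


def compression_ratio(html_tokens, som_tokens):
    """How many times smaller the SOM is than the raw HTML."""
    return html_tokens / som_tokens


results = {
    name: (savings(h, s), compression_ratio(h, s)) for name, h, s in rows
}

for name, (saved, ratio) in results.items():
    print(f"{name}: {saved:.0%} savings, {ratio:.1f}x fewer tokens")
```

For these four rows the ratios fall between roughly 13x and 27x, which sits inside the 10-100x range quoted elsewhere in this README.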
Configuration
# Custom Plasmate binary path
app = PlasmateApp(plasmate_path="/usr/local/bin/plasmate")
# With custom headers
response = app.scrape_url(
    "https://example.com",
    headers={"Authorization": "Bearer token123"},
)
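When the binary is not in an obvious place, it can help to resolve its location before passing plasmate_path. The following sketch uses only the standard library; it checks an explicit path first, then PATH, then ~/.cargo/bin (the default target of cargo install). The helper find_plasmate is hypothetical, and how PlasmateApp itself locates the binary is not described here.

```python
import shutil
from pathlib import Path


def find_plasmate(explicit_path=None):
    """Return a path to the plasmate binary, or None if it cannot be found."""
    if explicit_path:
        candidate = Path(explicit_path).expanduser()
        return str(candidate) if candidate.is_file() else None

    # Look on PATH first.
    on_path = shutil.which("plasmate")
    if on_path:
        return on_path

    # Fall back to cargo's default install directory.
    cargo_bin = Path.home() / ".cargo" / "bin" / "plasmate"
    return str(cargo_bin) if cargo_bin.is_file() else None


binary = find_plasmate()
if binary is None:
    print("plasmate not found; install it with: cargo install plasmate")
else:
    print(f"Using plasmate at {binary}")
```

The resolved path could then be passed as PlasmateApp(plasmate_path=binary).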
Async Support
import asyncio

from firecrawl_plasmate import PlasmateApp
from firecrawl_plasmate.crawler import crawl_async


async def main():
    app = PlasmateApp()
    result = await crawl_async(
        app,
        "https://example.com",
        max_depth=2,
        max_concurrent=10,
    )
    print(f"Crawled {result.completed} pages")


asyncio.run(main())
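A parameter like max_concurrent usually caps how many requests are in flight at once. The sketch below shows one common way to get that behaviour with asyncio.Semaphore. It is a generic pattern, not crawl_async's actual implementation, which this README does not show; bounded_gather and fake_fetch are hypothetical names, and the fetch is simulated with a short sleep.

```python
import asyncio

# Tracks how many simulated fetches run at the same time.
stats = {"in_flight": 0, "peak": 0}


async def fake_fetch(url):
    """Stand-in for a real page fetch; records peak concurrency."""
    stats["in_flight"] += 1
    stats["peak"] = max(stats["peak"], stats["in_flight"])
    await asyncio.sleep(0.01)
    stats["in_flight"] -= 1
    return url


async def bounded_gather(items, worker, max_concurrent=10):
    """Run worker over items with at most max_concurrent running at once."""
    semaphore = asyncio.Semaphore(max_concurrent)

    async def run_one(item):
        async with semaphore:
            return await worker(item)

    # gather preserves the order of the input items in its result list.
    return await asyncio.gather(*(run_one(i) for i in items))


urls = [f"https://example.com/page/{i}" for i in range(25)]
fetched = asyncio.run(bounded_gather(urls, fake_fetch, max_concurrent=5))
print(f"Fetched {len(fetched)} pages, peak concurrency {stats['peak']}")
```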
Requirements
- Python 3.10+
- Plasmate binary (Rust)
  - Install: cargo install plasmate
  - Or download from releases
License
MIT