servo-fetch
June 14, 2026 · View on GitHub
Python bindings for servo-fetch — fetch, render, and extract web content with an embedded Servo browser engine.
- No Chromium — single binary, no browser download
- JavaScript execution — full Servo engine with SpiderMonkey
- Schema extraction — declarative CSS-selector → structured JSON, no LLM
- Async-ready —
asyncio.to_threadwrappers,AsyncClientwith streaming crawl - Typed — full
.pyistubs, works with ty / mypy / pyright
Install
pip install servo-fetch
Quick Start
import servo_fetch
page = servo_fetch.fetch("https://example.com")
page.html # rendered HTML
page.inner_text # document.body.innerText
page.markdown # readable Markdown (lazy, cached)
page.title # str | None
Schema Extraction
from servo_fetch import Schema, Field
schema = Schema(
base_selector=".product",
fields=[
Field(name="title", selector="h2", type="text"),
Field(name="price", selector=".price", type="text"),
Field(name="url", selector="a", type="attribute", attribute="href"),
],
)
page = servo_fetch.fetch("https://shop.example.com", schema=schema)
page.extracted # [{"title": "...", "price": "...", "url": "..."}]
Session Cookies
To fetch authenticated pages, pass a str or os.PathLike path to a
Netscape-format cookies.txt via cookies_file:
import servo_fetch
page = servo_fetch.fetch("https://app.example.com/dashboard", cookies_file="cookies.txt")
# Also accepted by Client.fetch / Client.crawl and their async equivalents.
client = servo_fetch.Client(user_agent="MyBot/1.0")
pages = client.crawl("https://app.example.com", cookies_file="cookies.txt", max_pages=20)
A missing or malformed file raises servo_fetch.CookieError. Cookies are scoped
to the target's site, so out-of-scope entries in the file are ignored.
Custom Headers
Pass a dict[str, str] via headers to add request headers (e.g. API tokens).
Accepted by fetch, Client.fetch / Client.crawl / Client.map, and their
async equivalents:
import servo_fetch
page = servo_fetch.fetch("https://api.example.com", headers={"Authorization": "Bearer TOKEN"})
Framing headers (Host, Content-Length, …) are rejected, and User-Agent /
Cookie have dedicated options; invalid headers raise ValueError.
Async
from servo_fetch import fetch_async, AsyncClient
page = await fetch_async("https://example.com")
async with AsyncClient(user_agent="MyBot/1.0") as client:
async for page in client.crawl_stream("https://docs.example.com", max_pages=50):
print(page.url, page.title)
Develop
Requires uv.
uv sync --group all # create venv + install dev deps
uv run maturin develop # build extension (debug, fast compile)
uv run pytest # run tests
uv run ruff check python tests # lint
uv run ty check python # type check
Troubleshooting
Linux: "cannot allocate memory in static TLS block"
Servo's native extension uses large thread-local storage. On some Linux systems, set before importing:
export GLIBC_TUNABLES=glibc.rtld.optional_static_tls=16384