
April 1, 2026

Plasmate

n8n-nodes-plasmate

n8n community node for Plasmate — fetch web pages and get structured Semantic Object Model (SOM) content instead of raw HTML.


What it does

The Plasmate node fetches any URL using Plasmate — a fast headless browser engine — and returns structured data instead of raw HTML. Plasmate compiles pages into a Semantic Object Model (SOM): organized regions, interactive elements with stable IDs, extracted text, and structured data (JSON-LD, OpenGraph).

Why not just use the HTTP Request node?

The HTTP Request node returns raw HTML — tens of thousands of tokens that downstream AI nodes have to parse. Plasmate returns structured JSON that's 10-800x smaller and immediately usable.

Operations

| Operation | Output |
| --- | --- |
| Fetch Page | Full SOM: title, regions, elements, metadata |
| Extract Text | Plain text joined from all page regions |
| Extract Links | Array of {text, href, region} objects |
| Extract Structured Data | JSON-LD, OpenGraph, and microdata |
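
As a rough illustration of what "Extract Text" produces, the sketch below joins element text across all regions of a SOM. The types are inferred from the Fetch Page example output in this README, and `extractText` is a hypothetical helper written for this illustration; the node's real implementation may differ.

```typescript
// Minimal SOM shape, inferred from the "Example output" section of this README.
interface SomElement {
  id: string;
  role: string;
  text?: string;
  href?: string;
}

interface SomRegion {
  id: string;
  role: string;
  elements: SomElement[];
}

interface Som {
  regions: SomRegion[];
}

// Hypothetical helper: collect non-empty element text, region by region,
// and join it into one plain-text string.
function extractText(som: Som, separator: string = "\n"): string {
  const parts: string[] = [];
  for (const region of som.regions) {
    for (const element of region.elements) {
      const text = (element.text ?? "").trim();
      if (text.length > 0) {
        parts.push(text);
      }
    }
  }
  return parts.join(separator);
}

// Sample data copied from the Fetch Page example in this README.
const sampleSom: Som = {
  regions: [
    {
      id: "main",
      role: "main",
      elements: [
        { id: "e1", role: "heading", text: "Example Domain" },
        { id: "e2", role: "text", text: "This domain is for use in illustrative examples." },
        { id: "e3", role: "link", text: "More information...", href: "https://www.iana.org/domains/example" },
      ],
    },
  ],
};

console.log(extractText(sampleSom));
```

In a workflow you would normally pick the "Extract Text" operation directly; a helper like this is only useful if you already have the full SOM from "Fetch Page" and want text from it as well.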

Prerequisites

  1. A self-hosted n8n instance (community nodes require self-hosted n8n)
  2. Plasmate installed on the same machine as n8n:
curl -fsSL https://plasmate.app/install.sh | sh

Installation

In your n8n instance, go to Settings → Community Nodes → Install and enter:

n8n-nodes-plasmate

Or install via npm in your n8n directory:

npm install n8n-nodes-plasmate

Usage

Basic — Fetch a page

  1. Add a Plasmate node to your workflow
  2. Set Operation to "Fetch Page"
  3. Set URL to any web address
  4. Connect downstream nodes to work with the SOM output

Extract links

Set Operation to "Extract Links". The output includes links (an array of {text, href, region} objects) and link_count. To process each link individually in downstream steps, add a Split Out node on the links field.
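
The sketch below approximates what happens to the "Extract Links" output when it is split into one item per link, followed by a filter you might apply afterwards. The `links` and `link_count` fields and the {text, href, region} shape come from this README; the sample data, the region values, and the `linksInRegion` helper are assumptions made for illustration.

```typescript
// Shapes taken from this README: links are {text, href, region} objects.
interface Link {
  text: string;
  href: string;
  region: string;
}

interface ExtractLinksOutput {
  links: Link[];
  link_count: number;
}

// n8n wraps each item's data in a `json` property.
interface LinkItem {
  json: Link;
}

// Approximates splitting on the `links` field: one output item per link.
function splitOutLinks(output: ExtractLinksOutput): LinkItem[] {
  return output.links.map((link) => ({ json: link }));
}

// Hypothetical follow-up step: keep only links found in a given region.
function linksInRegion(items: LinkItem[], region: string): LinkItem[] {
  return items.filter((item) => item.json.region === region);
}

// Made-up sample data for illustration only.
const sampleOutput: ExtractLinksOutput = {
  links: [
    { text: "Docs", href: "https://example.com/docs", region: "nav" },
    { text: "More information...", href: "https://www.iana.org/domains/example", region: "main" },
  ],
  link_count: 2,
};

const mainLinks = linksInRegion(splitOutLinks(sampleOutput), "main");
console.log(mainLinks.map((item) => item.json.href));
```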

Authenticated browsing

Set Auth Profile in Options to the target domain (e.g. github.com). This requires cookies for that domain to have been stored beforehand via the Plasmate browser extension.

Batch processing

Connect multiple URLs from an upstream node (e.g. a list from a Google Sheet or database). The Plasmate node processes one URL per input item.
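
Because the node handles one URL per input item, the upstream step only needs to emit one item per URL. Below is a minimal sketch of that shaping step, for example as logic inside a Code node. The `{ json: ... }` wrapper is n8n's standard item structure; the `url` field name and the `toUrlItems` helper are assumptions for illustration, so map the field to whatever your URL parameter expression reads.

```typescript
// n8n item wrapper; the `url` field name is an assumption for this sketch.
interface UrlItem {
  json: { url: string };
}

// Hypothetical helper: trim, drop blanks and duplicates, emit one item per URL.
function toUrlItems(rawUrls: string[]): UrlItem[] {
  const unique: string[] = [];
  for (const raw of rawUrls) {
    const url = raw.trim();
    if (url.length > 0 && unique.indexOf(url) === -1) {
      unique.push(url);
    }
  }
  return unique.map((url) => ({ json: { url } }));
}

// Made-up input, e.g. a column read from a spreadsheet.
const urlItems = toUrlItems([
  "https://example.com",
  "  https://example.org/docs ",
  "",
  "https://example.com",
]);

console.log(urlItems.length);
```

Deduplicating before the Plasmate node avoids fetching the same page twice in one run.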

Options

| Option | Default | Description |
| --- | --- | --- |
| Auth Profile | (none) | Domain for authenticated browsing (e.g. github.com) |
| Plasmate Binary Path | plasmate | Override if plasmate is not in PATH |
| Timeout (Seconds) | 30 | Max seconds to wait for a page fetch |

Example output — Fetch Page

{
  "url": "https://example.com",
  "title": "Example Domain",
  "lang": "en",
  "element_count": 4,
  "interactive_count": 1,
  "region_count": 1,
  "som": {
    "regions": [
      {
        "id": "main",
        "role": "main",
        "elements": [
          { "id": "e1", "role": "heading", "text": "Example Domain" },
          { "id": "e2", "role": "text", "text": "This domain is for use in illustrative examples." },
          { "id": "e3", "role": "link", "text": "More information...", "href": "https://www.iana.org/domains/example" }
        ]
      }
    ]
  }
}
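
A downstream step can walk this structure to find elements by role and refer to them by ID. The sketch below assumes the output shape shown above; `findByRole` and the type names are hypothetical and only cover the fields that appear in this example.

```typescript
// Types inferred from the example output above; only the fields used here.
interface SomElement {
  id: string;
  role: string;
  text?: string;
  href?: string;
}

interface SomRegion {
  id: string;
  role: string;
  elements: SomElement[];
}

interface FetchPageOutput {
  url: string;
  title: string;
  som: { regions: SomRegion[] };
}

interface LocatedElement {
  regionId: string;
  element: SomElement;
}

// Hypothetical helper: return every element with the given role,
// together with the ID of the region it was found in.
function findByRole(page: FetchPageOutput, role: string): LocatedElement[] {
  const found: LocatedElement[] = [];
  for (const region of page.som.regions) {
    for (const element of region.elements) {
      if (element.role === role) {
        found.push({ regionId: region.id, element });
      }
    }
  }
  return found;
}

// Sample data copied from the example output above.
const samplePage: FetchPageOutput = {
  url: "https://example.com",
  title: "Example Domain",
  som: {
    regions: [
      {
        id: "main",
        role: "main",
        elements: [
          { id: "e1", role: "heading", text: "Example Domain" },
          { id: "e2", role: "text", text: "This domain is for use in illustrative examples." },
          { id: "e3", role: "link", text: "More information...", href: "https://www.iana.org/domains/example" },
        ],
      },
    ],
  },
};

for (const hit of findByRole(samplePage, "link")) {
  console.log(hit.regionId + "/" + hit.element.id + " -> " + (hit.element.href ?? ""));
}
```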

Token savings

Real-world benchmark (SOM vs raw HTML):

| Site | Savings |
| --- | --- |
| Vercel docs | 99.6% |
| Stripe API | 95.8% |
| Next.js docs | 92.3% |
| Stack Overflow | 85.6% |
| Wikipedia | 82.8% |
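
To relate a savings percentage to a "times smaller" factor: if SOM saves p% of the tokens, it keeps (100 - p)%, so the raw-to-SOM ratio is 100 / (100 - p). The sketch below applies that arithmetic to the figures in the table; `shrinkFactor` is a hypothetical helper name. For example, 99.6% savings works out to roughly 250x fewer tokens, and 82.8% to roughly 5.8x.

```typescript
// Convert a token-savings percentage into a "times smaller" factor.
// Saving p% means keeping (100 - p)%, so raw / SOM = 100 / (100 - p).
function shrinkFactor(savingsPercent: number): number {
  if (!(savingsPercent >= 0 && savingsPercent < 100)) {
    throw new RangeError("savingsPercent must be in the range [0, 100)");
  }
  return 100 / (100 - savingsPercent);
}

// Figures copied from the benchmark table in this README.
const benchmark: Array<[string, number]> = [
  ["Vercel docs", 99.6],
  ["Stripe API", 95.8],
  ["Next.js docs", 92.3],
  ["Stack Overflow", 85.6],
  ["Wikipedia", 82.8],
];

for (const [site, savings] of benchmark) {
  console.log(site + ": about " + shrinkFactor(savings).toFixed(1) + "x smaller");
}
```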

License

MIT