flowise-plasmate

April 12, 2026

Flowise custom nodes for Plasmate, the AI browser engine that converts HTML to structured JSON with 10-100x token compression.

Use them to build LLM workflows that need to process web content efficiently.

Installation

Prerequisites

Install Plasmate:

```
# macOS
brew install plasmate/tap/plasmate

# Or build from source
cargo install plasmate
```

Verify installation:
```
plasmate --version
```

Installing in Flowise

Navigate to your Flowise installation's custom nodes directory:

```
cd ~/.flowise/custom-nodes
# Or for Docker: copy to mounted volume
```

Clone or copy this package:

```
git clone https://github.com/user/flowise-plasmate.git
cd flowise-plasmate
npm install
npm run build
```

Restart Flowise to load the new nodes.

Nodes

Plasmate Web Browser

The main node for fetching and parsing web content.

Inputs:

URL (required): The webpage URL to fetch
Format: Output format
- SOM - Semantic Object Model (structured JSON, maximum compression)
- Text - Clean readable text extraction
- Markdown - Markdown formatted output
CSS Selector (optional): Focus extraction on specific elements
Timeout: Request timeout in seconds (default: 30)
Plasmate Path: Path to the plasmate binary (default: plasmate, resolved from PATH)
Custom Headers: HTTP headers for authenticated requests

Outputs:

Document: LangChain-compatible Document with metadata
Text: Raw string output

Use Cases:

Load documentation pages for RAG
Fetch article content for summarization
Parse structured data from websites

Plasmate Extract Links

Extract all links from a webpage with filtering options.

Inputs:

URL (required): The webpage URL
Filter Type: All Links, Internal Only, or External Only
CSS Selector (optional): Limit extraction to specific page areas
Timeout: Request timeout in seconds

Outputs:

Links Array: Array of { href, text, type } objects
URLs Only: Array of URL strings
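
One way to picture the Filter Type options is as a hostname comparison against the page URL. The sketch below is an illustration under stated assumptions, not the node's actual implementation: `classifyLink` and `filterLinks` are hypothetical names, and the `"internal"` / `"external"` values for `type` are assumed, since the value set is not specified here.

```typescript
// Hypothetical shape mirroring the { href, text, type } objects listed above.
// The "internal" | "external" values for `type` are an assumption.
type LinkType = "internal" | "external";

interface PageLink {
  href: string;
  text: string;
  type: LinkType;
}

type FilterType = "all" | "internal" | "external";

// Resolve a raw href against the page URL and classify it by hostname.
function classifyLink(rawHref: string, text: string, pageUrl: string): PageLink {
  const base = new URL(pageUrl);
  const resolved = new URL(rawHref, base); // also handles relative hrefs like "/docs"
  const type: LinkType = resolved.hostname === base.hostname ? "internal" : "external";
  return { href: resolved.href, text, type };
}

// Keep every link for "all", otherwise only links whose type matches the filter.
function filterLinks(links: PageLink[], filter: FilterType): PageLink[] {
  return filter === "all" ? links : links.filter((link) => link.type === filter);
}

const pageUrl = "https://example.com/blog/post";
const links = [
  classifyLink("/docs", "Docs", pageUrl),
  classifyLink("https://other.example.org/a", "Elsewhere", pageUrl),
];

// Equivalent of the "URLs Only" output with the "Internal Only" filter.
const urlsOnly = filterLinks(links, "internal").map((link) => link.href);
console.log(urlsOnly);
```

Comparing exact hostnames treats subdomains such as `docs.example.com` as external. Whether the node does the same is not documented here.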

Use Cases:

Build web crawlers
Extract navigation structures
Find related content links

Plasmate Text Extract

Simplified node for clean text extraction.

Inputs:

URL (required): The webpage URL
CSS Selector (optional): Focus on specific elements
Include Metadata: Add page title and description
Max Length: Truncate output to character limit
Timeout: Request timeout in seconds

Outputs:

Text: Clean text string
Document: LangChain Document with metadata
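
To show how Include Metadata and Max Length might shape the Document output, here is a minimal sketch. It uses a structural stand-in for a LangChain Document (`pageContent` plus `metadata`). The metadata keys (`source`, `title`, `description`), the `toDocument` and `truncateText` helpers, and the "0 means no limit" rule are all assumptions made for illustration.

```typescript
// Structural stand-in for a LangChain Document (pageContent plus metadata).
interface DocumentLike {
  pageContent: string;
  metadata: Record<string, unknown>;
}

// Hypothetical shapes; the node's real field names may differ.
interface PageMeta {
  title?: string;
  description?: string;
}

interface ExtractOptions {
  includeMetadata: boolean;
  maxLength: number; // assumption: 0 or less means "no limit"
}

// Truncate by code point so surrogate pairs (e.g. emoji) are never split.
function truncateText(text: string, maxLength: number): string {
  if (maxLength <= 0) return text;
  const chars = Array.from(text);
  return chars.length <= maxLength ? text : chars.slice(0, maxLength).join("");
}

// Build the document, attaching title and description only when requested.
function toDocument(url: string, text: string, meta: PageMeta, options: ExtractOptions): DocumentLike {
  const metadata: Record<string, unknown> = { source: url };
  if (options.includeMetadata) {
    if (meta.title) metadata["title"] = meta.title;
    if (meta.description) metadata["description"] = meta.description;
  }
  return { pageContent: truncateText(text, options.maxLength), metadata };
}

const doc = toDocument(
  "https://example.com/article",
  "Plasmate converts HTML to structured JSON.",
  { title: "Example article" },
  { includeMetadata: true, maxLength: 8 },
);
console.log(doc.pageContent); // "Plasmate"
```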

Use Cases:

Quick text extraction for chat
Article content for RAG pipelines
Clean text for embeddings

Example Chatflows

Basic Web Q&A

```
[Plasmate Web Browser] --> [Recursive Character Text Splitter] --> [OpenAI Embeddings] --> [In-Memory Vector Store] --> [Conversational Retrieval QA Chain]
```

Web Research Agent

```
[Plasmate Extract Links] --> [Loop/Iterator] --> [Plasmate Text Extract] --> [Document Aggregator] --> [LLM Chain]
```

Documentation Loader

```
[Plasmate Web Browser (SOM format)] --> [Custom Transform] --> [Vector Store]
```

Configuration

Custom Plasmate Path

If Plasmate is not in your PATH, specify the full path to the binary in the Plasmate Path field under the node's advanced settings:

```
/usr/local/bin/plasmate
# or
/home/user/.cargo/bin/plasmate
```

Authenticated Requests

For sites requiring authentication, use Custom Headers:

```
Authorization: Bearer your-token-here
Cookie: session=abc123
```
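
If you generate the Custom Headers value programmatically, it can help to validate it first. The sketch below parses `Name: value` lines into a header map. It assumes one header per line, which matches the example above but is not a documented input format, and `parseHeaderLines` is a hypothetical helper rather than part of this package.

```typescript
// Parse "Name: value" lines into a header map.
// Assumption: one header per line; blank lines are skipped.
function parseHeaderLines(input: string): Record<string, string> {
  const headers: Record<string, string> = {};
  for (const line of input.split(/\r?\n/)) {
    const trimmed = line.trim();
    if (trimmed === "") continue;
    // Split on the first colon only, so values may themselves contain colons.
    const colon = trimmed.indexOf(":");
    if (colon <= 0) throw new Error(`Malformed header line: ${trimmed}`);
    const name = trimmed.slice(0, colon).trim();
    const value = trimmed.slice(colon + 1).trim();
    headers[name] = value;
  }
  return headers;
}

const parsedHeaders = parseHeaderLines(
  "Authorization: Bearer your-token-here\nCookie: session=abc123",
);
console.log(parsedHeaders);
```

Throwing on a line without a colon surfaces typos early instead of silently sending a request without the intended credentials.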

Token Compression Comparison

| Source | Raw HTML | Plasmate SOM | Reduction |
|---|---|---|---|
| News article | 50,000 tokens | 3,000 tokens | 94% |
| Documentation | 30,000 tokens | 2,500 tokens | 92% |
| E-commerce page | 80,000 tokens | 5,000 tokens | 94% |
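
The Reduction column follows from the two token counts: reduction = 1 - (SOM tokens / raw tokens), rounded to a whole percent. The sketch below recomputes it from the figures in the table and also shows the equivalent compression ratio. The function names are hypothetical and used only for illustration.

```typescript
// Percentage of tokens removed, rounded to the nearest whole percent.
function reductionPercent(rawTokens: number, somTokens: number): number {
  return Math.round((1 - somTokens / rawTokens) * 100);
}

// Compression ratio rounded to one decimal place (raw tokens per SOM token).
function compressionRatio(rawTokens: number, somTokens: number): number {
  return Math.round((rawTokens / somTokens) * 10) / 10;
}

// Token counts copied from the comparison table.
const rows = [
  { source: "News article", raw: 50000, som: 3000 },
  { source: "Documentation", raw: 30000, som: 2500 },
  { source: "E-commerce page", raw: 80000, som: 5000 },
];

for (const row of rows) {
  const reduction = reductionPercent(row.raw, row.som);
  const ratio = compressionRatio(row.raw, row.som);
  console.log(`${row.source}: ${reduction}% reduction, ${ratio}x smaller`);
}
```

For these three rows the ratios come out at roughly 12x to 17x, toward the lower end of the 10-100x range quoted for Plasmate.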

Troubleshooting

"Plasmate not found"

Ensure Plasmate is installed and in your PATH:

```
which plasmate
plasmate --version
```

Alternatively, set the full path to the binary in the node's Plasmate Path setting.
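
The same check can be run from Node.js, which is useful when Flowise runs in a container with a different PATH from your shell. This is a minimal sketch: `isBinaryAvailable` is a hypothetical helper, the 5-second limit is an arbitrary assumption, and the only Plasmate invocation used is the `plasmate --version` command shown above.

```typescript
import { spawnSync } from "node:child_process";

// Returns true when `<binaryPath> --version` can be spawned and exits with status 0.
// The 5000 ms limit is an arbitrary choice for this sketch.
function isBinaryAvailable(binaryPath: string = "plasmate"): boolean {
  const result = spawnSync(binaryPath, ["--version"], { encoding: "utf8", timeout: 5000 });
  // `result.error` is set when the process could not be started (for example ENOENT).
  return !result.error && result.status === 0;
}

console.log(isBinaryAvailable() ? "plasmate found" : "plasmate not found");
```

Pass a full path such as `/usr/local/bin/plasmate` to test a specific install location.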

Timeout Errors

Increase the Timeout value for slow-loading pages, or use a CSS selector to limit extraction to the content you need.

Empty Output

Some sites block automated requests and return an empty page. Try adding a User-Agent header under Custom Headers:

```
User-Agent: Mozilla/5.0 (compatible; Flowise/1.0)
```

License

MIT