Plasmate Components for Langflow

April 12, 2026

Token-efficient web browsing components for Langflow. Plasmate converts HTML into a Semantic Object Model (SOM), cutting token counts by 10-100x and making web content cheaper and easier for LLMs to process.

What It Does

Plasmate is a browser engine designed for AI agents. Instead of sending raw HTML to your LLM (which wastes tokens on navigation, scripts, and boilerplate), Plasmate extracts the semantic content and structure.

Before (raw HTML): ~50,000 tokens for a typical web page

After (Plasmate SOM): ~500-5,000 tokens for the same content
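
As a quick check on these figures, the snippet below computes the compression ratio implied by each end of the SOM range. It is plain arithmetic on the numbers quoted above; the function name compression_ratio is illustrative and not part of Plasmate.

```python
def compression_ratio(raw_tokens: int, som_tokens: int) -> float:
    """Return how many times smaller the SOM output is than the raw HTML."""
    if som_tokens <= 0:
        raise ValueError("som_tokens must be positive")
    return raw_tokens / som_tokens


raw = 50_000  # approximate raw HTML token count quoted above
for som in (500, 5_000):  # low and high ends of the quoted SOM range
    print(f"{raw} -> {som} tokens: {compression_ratio(raw, som):.0f}x")
# prints:
# 50000 -> 500 tokens: 100x
# 50000 -> 5000 tokens: 10x
```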

Components

Plasmate Web Browser

Full-featured web browser component with multiple output formats:

  • SOM - Structured JSON optimized for AI comprehension
  • Text - Clean, readable text extraction
  • Markdown - Formatted markdown output

Inputs:

  • url (required) - The URL to fetch
  • output_format - som, text, or markdown
  • selector - Optional CSS selector for specific content
  • timeout - Request timeout in seconds
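
To make the input contract concrete, here is a minimal sketch of how these four inputs could be validated before a fetch. The BrowserInputs class is hypothetical and is not the component's actual code. The default output_format of "som" is an assumption; the 30-second timeout default is the one stated under Configuration Options.

```python
from dataclasses import dataclass
from typing import Optional
from urllib.parse import urlparse

ALLOWED_FORMATS = ("som", "text", "markdown")


@dataclass
class BrowserInputs:
    """Hypothetical container mirroring the inputs listed above."""

    url: str
    output_format: str = "som"      # assumed default, not confirmed by the source
    selector: Optional[str] = None  # optional CSS selector
    timeout: int = 30               # seconds

    def __post_init__(self) -> None:
        parsed = urlparse(self.url)
        if parsed.scheme not in ("http", "https") or not parsed.netloc:
            raise ValueError(f"url must be an absolute http(s) URL, got {self.url!r}")
        if self.output_format not in ALLOWED_FORMATS:
            raise ValueError(f"output_format must be one of {ALLOWED_FORMATS}")
        if self.timeout <= 0:
            raise ValueError("timeout must be a positive number of seconds")
```

Validating early like this surfaces a mistyped format or a relative URL before any page is requested.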

Plasmate Text Extractor

Simplified component for extracting clean text from any URL. Perfect for:

  • RAG pipelines
  • Document processing
  • Content summarization

A third component extracts all links from a web page, with their text and URLs. Ideal for:

  • Web crawling flows
  • Research pipelines
  • Site mapping

Installation

Option 1: Copy to Custom Components Folder

  1. Locate your Langflow custom components directory:

    # Default location
    ~/.langflow/components/
    
  2. Copy all files to the components directory:

    mkdir -p ~/.langflow/components/plasmate
    cp -r langflow-plasmate/* ~/.langflow/components/plasmate/
    
  3. Restart Langflow

Option 2: Use LANGFLOW_COMPONENTS_PATH

  1. Set the environment variable:

    export LANGFLOW_COMPONENTS_PATH=/path/to/langflow-plasmate
    
  2. Start Langflow:

    langflow run
    

Install Plasmate

The components require the Plasmate CLI to be installed:

# macOS
brew install plasmate/tap/plasmate

# Linux
curl -fsSL https://plasmate.app/install.sh | sh

# Or build from source
git clone https://github.com/aspect-build/plasmate
cd plasmate && cargo build --release

Screenshots

[Screenshot: Plasmate Web Browser component in Langflow]

[Screenshot: Example flow with Plasmate + LLM]

Example Flows

See the example_flows/ directory for ready-to-use Langflow flows:

Web Research Flow

example_flows/web_research_flow.json

A flow that:

  1. Takes a URL as input
  2. Extracts content with Plasmate
  3. Sends to LLM for summarization
  4. Returns structured insights

RAG Pipeline Flow

example_flows/rag_pipeline_flow.json

A RAG flow that:

  1. Fetches multiple URLs with Plasmate
  2. Chunks the content
  3. Stores in a vector database
  4. Enables semantic search
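
The chunking step (step 2) can be sketched as a simple fixed-size splitter with overlap. This is an illustration only: the example flow may use a different Langflow text splitter, and the chunk_text function and its default sizes are assumptions.

```python
from typing import List


def chunk_text(text: str, chunk_size: int = 1000, overlap: int = 200) -> List[str]:
    """Split text into character chunks that share `overlap` characters."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be in the range [0, chunk_size)")

    step = chunk_size - overlap
    chunks: List[str] = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        # Stop once this chunk reaches the end, so no redundant tail is emitted.
        if start + chunk_size >= len(text):
            break
        start += step
    return chunks


print(chunk_text("0123456789", chunk_size=4, overlap=1))
# prints: ['0123', '3456', '6789']
```

The overlap keeps a sentence that straddles a chunk boundary retrievable from at least one chunk, at the cost of storing some text twice.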

Configuration Options

Plasmate Binary Path

If Plasmate is not in your PATH, configure the binary location:

  1. In the component, expand "Advanced" options
  2. Set "Plasmate Binary Path" to the full path (e.g., /usr/local/bin/plasmate)

Timeout

Adjust the timeout for slow-loading pages under the "Advanced" options (default: 30 seconds).

CSS Selectors

Use CSS selectors to extract specific content:

  • main - Main content area
  • article - Article content
  • .content - Elements with class "content"
  • #main-text - Element with ID "main-text"

Troubleshooting

"Plasmate binary not found"

Ensure Plasmate is installed and in your PATH:

which plasmate
plasmate --version

Or set the full path in the component's advanced options.
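
The same lookup can be done from Python, which helps when Langflow runs under a different PATH than your shell. This is a minimal sketch using the standard library; resolve_plasmate is a hypothetical helper, not a function shipped with the components.

```python
import shutil
from pathlib import Path
from typing import Optional


def resolve_plasmate(binary_path: Optional[str] = None) -> str:
    """Return the path to the Plasmate executable or raise FileNotFoundError."""
    if binary_path:
        # An explicit path (as in "Plasmate Binary Path") takes priority.
        candidate = Path(binary_path).expanduser()
        if candidate.is_file():
            return str(candidate)
        raise FileNotFoundError(f"No Plasmate binary at {candidate}")

    # Otherwise fall back to a PATH search, like `which plasmate`.
    found = shutil.which("plasmate")
    if found is None:
        raise FileNotFoundError("plasmate is not on PATH; set the binary path explicitly")
    return found
```

Running this inside the same environment that launches Langflow shows whether the component will see the binary.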

"Timeout" errors

Increase the timeout value in advanced options, or check if the URL is accessible:

plasmate fetch https://example.com
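
To reproduce a timeout outside Langflow, the same command can be wrapped in Python with an explicit limit. The sketch below uses only the `plasmate fetch <url>` invocation shown above and assumes it writes the page content to standard output. The fetch function and its error messages are illustrative, not the component's implementation.

```python
import subprocess


def fetch(url: str, binary: str = "plasmate", timeout: float = 30.0) -> str:
    """Run `<binary> fetch <url>` and return its standard output."""
    try:
        result = subprocess.run(
            [binary, "fetch", url],
            capture_output=True,
            text=True,
            timeout=timeout,  # seconds; raises TimeoutExpired when exceeded
            check=True,       # raises CalledProcessError on a non-zero exit
        )
    except FileNotFoundError as exc:
        raise RuntimeError(f"Plasmate binary not found: {binary}") from exc
    except subprocess.TimeoutExpired as exc:
        raise RuntimeError(f"Timed out after {timeout}s fetching {url}") from exc
    except subprocess.CalledProcessError as exc:
        detail = (exc.stderr or "").strip()
        raise RuntimeError(f"plasmate exited with code {exc.returncode}: {detail}") from exc
    return result.stdout


# Example (requires Plasmate to be installed):
# print(fetch("https://example.com", timeout=60))
```

Separating the three failure modes makes it clear whether to fix the install, raise the timeout, or look at the site itself.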

Empty output

Some sites block automated requests. Try:

  • A different URL
  • Adding custom headers (if supported)
  • Using the Plasmate Cloud API instead of the CLI

License

MIT