Plasmate Plugin for Dify
April 11, 2026
Plasmate is an agent-native browser engine that converts HTML to a Semantic Object Model (SOM): a structured JSON representation that is 10-100x smaller than raw HTML while preserving semantic meaning.
This plugin integrates Plasmate with Dify, enabling your LLM applications to browse the web efficiently.
Features
- fetch_page - Fetch any web page as structured semantic content (JSON, text, markdown, or links)
- extract_text - Extract readable text content from a page (ideal for summarization)
- extract_links - Extract all URLs from a page (perfect for crawling and research)
Why Plasmate?
Traditional web scraping returns raw HTML, which is:
- Bloated with markup, styles, and scripts
- Expensive to process with LLMs (high token count)
- Difficult for models to parse accurately
Plasmate's Semantic Object Model (SOM) solves this by:
- Reducing content size by 10-100x
- Preserving semantic structure (navigation, main content, forms)
- Providing element IDs for interaction
- Optimizing output for LLM consumption
Installation
Option 1: Install from Dify Marketplace
- Go to Plugins in your Dify workspace
- Search for "Plasmate"
- Click Install
Option 2: Manual Installation
- Clone this repository:
    git clone https://github.com/plasmate-labs/dify-plasmate.git
    cd dify-plasmate
- Package the plugin:
    dify plugin package ./
- Upload to Dify:
  - Go to Plugins in your Dify workspace
  - Click Upload Plugin
  - Select the generated .difypkg file
Configuration
The plugin supports two modes of operation:
Mode 1: Plasmate Cloud API (Recommended)
- Get an API key from plasmate.app/dashboard
- In Dify, go to Plugins > Plasmate > Settings
- Enter your API key
Mode 2: Local CLI
If you have Plasmate installed locally:
- Install Plasmate CLI:
    # macOS
    brew install plasmate-labs/tap/plasmate
    # Or download from https://plasmate.app
- Leave the API key field empty - the plugin will automatically use the local CLI
- (Optional) Specify a custom CLI path in settings if plasmate is not in your PATH
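The choice between the two modes can be sketched in a few lines of Python. This is only an illustration of the rule described above, not the plugin's actual code: `choose_mode` and `resolve_cli` are hypothetical names, and the only behavior taken from this README is "API key set means Cloud API, empty key means local CLI, with an optional custom CLI path".

```python
import shutil
from typing import Optional


def choose_mode(api_key: Optional[str]) -> str:
    """Return "cloud" when an API key is configured, otherwise "local".

    Hypothetical helper: mirrors the rule that an empty API key field
    makes the plugin fall back to the local CLI.
    """
    if api_key and api_key.strip():
        return "cloud"
    return "local"


def resolve_cli(custom_path: Optional[str] = None) -> Optional[str]:
    """Return the CLI location to use in local mode.

    A custom path from settings wins; otherwise look for `plasmate`
    on PATH. Returns None when nothing is found.
    """
    if custom_path:
        return custom_path
    return shutil.which("plasmate")
```

With this sketch, `choose_mode("")` and `choose_mode(None)` both select the local CLI, and `resolve_cli()` returns `None` when `plasmate` is not on your PATH, which is the case where the custom path setting is needed.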
Usage
In Dify Workflows
- Add a Tool node to your workflow
- Select Plasmate > Fetch Page (or Extract Text / Extract Links)
- Configure the parameters:
  - URL: The web page to fetch
  - Format: json, text, markdown, or links
  - Selector: Optional filter (main, nav, #element-id)
Example: Research Agent
# Workflow: Research a topic
1. User Input: "Research the latest AI developments"
2. Tool (Plasmate Extract Text):
   - URL: https://news.ycombinator.com
   - Selector: main
3. LLM: Summarize the top stories
4. Output: Summary to user
Example: Web Crawler
# Workflow: Crawl documentation
1. User Input: "Index the Dify docs"
2. Tool (Plasmate Extract Links):
   - URL: https://docs.dify.ai
   - Selector: nav
3. Loop: For each link
   - Tool (Plasmate Fetch Page)
   - Store in knowledge base
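Outside the Dify workflow editor, the same crawl loop can be sketched in Python. The two tools are passed in as plain callables because this README does not document a Python client. `extract_links` and `fetch_page` here are stand-ins for however you invoke the tools, and `max_pages` is an assumed safety cap, not a plugin setting.

```python
from typing import Callable, Dict, List


def crawl(
    start_url: str,
    extract_links: Callable[[str, str], List[str]],
    fetch_page: Callable[[str], str],
    max_pages: int = 20,
) -> Dict[str, str]:
    """Fetch every link found in the `nav` region of start_url.

    Returns a mapping of URL -> fetched content, ready to be stored
    in a knowledge base. Duplicate links are fetched only once.
    """
    pages: Dict[str, str] = {}
    for link in extract_links(start_url, "nav"):
        if link in pages:
            continue  # already fetched this URL
        if len(pages) >= max_pages:
            break  # assumed cap to keep the crawl bounded
        pages[link] = fetch_page(link)
    return pages
```

Injecting the callables keeps the loop easy to test with stubs and leaves the choice of Cloud API or local CLI to the caller.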
Tool Reference
fetch_page
Fetch a web page and convert it to structured content.
| Parameter | Type | Required | Description |
|---|---|---|---|
| url | string | Yes | URL to fetch |
| format | select | No | Output format: json (default), text, markdown, links |
| selector | string | No | Filter to region: main, nav, header, footer, #id |
| timeout | number | No | Timeout in ms (default: 30000) |
extract_text
Extract readable text from a web page.
| Parameter | Type | Required | Description |
|---|---|---|---|
| url | string | Yes | URL to fetch |
| selector | string | No | Filter to region: main, nav, header, footer, #id |
| timeout | number | No | Timeout in ms (default: 30000) |
extract_links
Extract all links from a web page.
| Parameter | Type | Required | Description |
|---|---|---|---|
| url | string | Yes | URL to fetch |
| selector | string | No | Filter to region: main, nav, header, footer, #id |
| timeout | number | No | Timeout in ms (default: 30000) |
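The parameter tables above can be turned into a small validation helper. This is a minimal sketch, assuming you assemble tool inputs in Python before passing them to Dify. `build_fetch_page_params` is a hypothetical name, and the restriction to absolute http(s) URLs is an assumption rather than documented plugin behavior. The defaults (`json`, 30000 ms) come from the tables.

```python
from typing import Any, Dict, Optional
from urllib.parse import urlparse

ALLOWED_FORMATS = ("json", "text", "markdown", "links")
DEFAULT_TIMEOUT_MS = 30000


def build_fetch_page_params(
    url: str,
    output_format: str = "json",
    selector: Optional[str] = None,
    timeout: int = DEFAULT_TIMEOUT_MS,
) -> Dict[str, Any]:
    """Validate and assemble fetch_page parameters as a dict."""
    parsed = urlparse(url)
    # Assumption: only absolute http(s) URLs are accepted.
    if parsed.scheme not in ("http", "https") or not parsed.netloc:
        raise ValueError(f"not an absolute http(s) URL: {url!r}")
    if output_format not in ALLOWED_FORMATS:
        raise ValueError(f"format must be one of {ALLOWED_FORMATS}")
    if timeout <= 0:
        raise ValueError("timeout must be a positive number of milliseconds")

    params: Dict[str, Any] = {
        "url": url,
        "format": output_format,
        "timeout": timeout,
    }
    if selector:
        params["selector"] = selector  # optional, omitted when unset
    return params
```

For `extract_text` and `extract_links`, the same helper applies with the `format` key dropped, since those tools take only `url`, `selector`, and `timeout`.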
Semantic Selectors
Plasmate supports semantic region filtering:
| Selector | Description |
|---|---|
| main | Main content area |
| nav / navigation | Navigation links |
| header | Page header |
| footer | Page footer |
| aside | Sidebar content |
| form | Form elements |
| dialog | Modal dialogs |
| #element-id | Specific HTML element by ID |
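To check a selector before sending it to a tool, the table can be encoded as a small classifier. This is a sketch under stated assumptions: `classify_selector` is a hypothetical helper, and rejecting anything outside the table is a choice made here, not documented Plasmate behavior. Treating `navigation` as an alias of `nav` follows the table row above.

```python
from typing import Tuple

REGION_SELECTORS = {
    "main", "nav", "navigation", "header", "footer", "aside", "form", "dialog",
}


def classify_selector(selector: str) -> Tuple[str, str]:
    """Return ("region", name) or ("id", element_id) for a selector."""
    value = selector.strip()
    if value.startswith("#") and len(value) > 1:
        return ("id", value[1:])
    if value == "navigation":
        return ("region", "nav")  # alias listed in the table
    if value in REGION_SELECTORS:
        return ("region", value)
    raise ValueError(f"unsupported selector: {selector!r}")
```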
Token Savings
Example token counts for common pages:
| Page | Raw HTML | Plasmate SOM | Savings |
|---|---|---|---|
| News article | ~50,000 | ~3,000 | 94% |
| Documentation | ~30,000 | ~2,000 | 93% |
| E-commerce | ~80,000 | ~5,000 | 94% |
| Search results | ~40,000 | ~2,500 | 94% |
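The Savings column follows directly from the two token counts. The sketch below recomputes it from the approximate figures in the table. The function name is illustrative, and the counts are the table's rounded examples, not new measurements.

```python
def savings_percent(raw_tokens: int, som_tokens: int) -> int:
    """Percentage of tokens saved, rounded to the nearest whole number."""
    return round(100 * (1 - som_tokens / raw_tokens))


# Approximate token counts copied from the table above.
EXAMPLES = {
    "News article": (50_000, 3_000),
    "Documentation": (30_000, 2_000),
    "E-commerce": (80_000, 5_000),
    "Search results": (40_000, 2_500),
}

for page, (raw, som) in EXAMPLES.items():
    print(f"{page}: {savings_percent(raw, som)}% saved, {raw / som:.1f}x smaller")
```

For these examples the reduction factor is between 15x and 17x, which falls inside the 10-100x range quoted earlier.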
Support
- Documentation: plasmate.app/docs
- Issues: GitHub Issues
- Discord: plasmate.app/discord
License
Apache 2.0 - see LICENSE