Activepieces Plasmate Piece

April 11, 2026

An Activepieces piece for Plasmate - the browser engine for AI agents. It converts HTML to a Semantic Object Model (SOM) that uses, on average, 16x fewer tokens than raw HTML.

Features

Fetch Page: Convert any web page to structured SOM JSON
Extract Text: Get clean, readable text from web pages
Extract Links: Extract and categorize all links from a page

Installation

From Activepieces Marketplace

Go to your Activepieces instance
Navigate to Settings > Pieces
Search for "Plasmate"
Click Install

Manual Installation (Self-Hosted)

Clone this repository into your Activepieces pieces directory:

cd /path/to/activepieces/packages/pieces/community
git clone https://github.com/plasmate-labs/activepieces-plasmate piece-plasmate

Install dependencies and build:

cd piece-plasmate
npm install
npm run build

Add the piece to your Activepieces configuration.

Development

# Install dependencies
npm install

# Build
npm run build

# Watch mode for development
npm run dev

Configuration

Authentication

The Plasmate piece supports two modes:

Plasmate Cloud (Recommended): Get an API key from plasmate.app/dashboard
Local CLI: Leave the API key empty to use a locally installed Plasmate CLI (self-hosted Activepieces only)
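The choice between the two modes can be pictured as a simple check on the API key. The sketch below is illustrative only: `PlasmateMode` and `resolveMode` are hypothetical names, not taken from the piece's source, and treating a whitespace-only key as empty is an assumption.

```typescript
// Hypothetical helper: illustrates the "empty key means local CLI" rule.
// These names are NOT part of the piece's actual source code.
type PlasmateMode =
  | { kind: "cloud"; apiKey: string }
  | { kind: "local-cli" };

function resolveMode(apiKey: string | null | undefined): PlasmateMode {
  // Assumption: a key made only of whitespace counts as empty.
  const trimmed = (apiKey ?? "").trim();
  return trimmed.length > 0
    ? { kind: "cloud", apiKey: trimmed }
    : { kind: "local-cli" };
}
```

Modeling the mode as a discriminated union means any code that reads `apiKey` has to check `kind === "cloud"` first, so the local path cannot accidentally send an empty key.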

Actions

Fetch Page

Fetch a web page and convert it to Semantic Object Model (SOM).

Inputs:

URL (required): The URL to fetch
Output Format: SOM, Plain Text, or JSON
CSS Selector (optional): Extract a specific portion of the page

Output:

{
  "success": true,
  "url": "https://example.com",
  "format": "som",
  "data": {
    "regions": {
      "main": { ... },
      "navigation": { ... }
    },
    "elements": [ ... ]
  }
}
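When consuming this output in a later flow step (for example, a code step), it can help to give it a type. The following is a minimal sketch based only on the sample above: `FetchPageResult` and `regionNames` are hypothetical names, the contents of each region and element are left as `unknown` because the sample elides them, and the `data` shape is assumed to apply to the SOM output format only.

```typescript
// Shape inferred from the sample output above; region and element contents
// are elided in the README, so they are typed as unknown here.
interface FetchPageResult {
  success: boolean;
  url: string;
  format: string;
  data: {
    regions: Record<string, unknown>;
    elements: unknown[];
  };
}

// Hypothetical helper: list the region names present in a SOM result.
function regionNames(result: FetchPageResult): string[] {
  if (!result.success) return [];
  return Object.keys(result.data.regions);
}

const sample: FetchPageResult = {
  success: true,
  url: "https://example.com",
  format: "som",
  data: { regions: { main: {}, navigation: {} }, elements: [] },
};

console.log(regionNames(sample)); // [ 'main', 'navigation' ]
```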

Extract Text

Extract clean, readable text from a web page.

Inputs:

URL (required): The URL to extract text from
CSS Selector (optional): Extract text from a specific portion

Output:

{
  "success": true,
  "url": "https://example.com",
  "text": "The extracted text content...",
  "lines": ["Line 1", "Line 2"],
  "stats": {
    "lineCount": 42,
    "wordCount": 350,
    "charCount": 2100
  }
}
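To show how the `lines` and `stats` fields relate to `text`, here is a small sketch that derives them from a string. The counting rules are assumptions (non-empty trimmed lines, whitespace-separated words, raw character length); the piece's actual rules are not documented here and may differ. `computeTextStats` is a hypothetical name.

```typescript
interface TextStats {
  lineCount: number;
  wordCount: number;
  charCount: number;
}

// Hypothetical helper: derive lines and stats from extracted text.
// Assumed rules: blank lines are dropped, words are split on whitespace,
// and charCount is the raw string length.
function computeTextStats(text: string): { lines: string[]; stats: TextStats } {
  const lines = text
    .split(/\r?\n/)
    .map((line) => line.trim())
    .filter((line) => line.length > 0);
  const words = text.split(/\s+/).filter((word) => word.length > 0);
  return {
    lines,
    stats: {
      lineCount: lines.length,
      wordCount: words.length,
      charCount: text.length,
    },
  };
}

console.log(computeTextStats("Line 1\nLine 2").stats);
// { lineCount: 2, wordCount: 4, charCount: 13 }
```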

Extract Links

Extract all links from a web page with filtering options.

Inputs:

URL (required): The URL to extract links from
CSS Selector (optional): Extract links from a specific portion
URL Filter Pattern (optional): Regex pattern to filter links
Include External Links: Toggle external link inclusion
Unique Links Only: Remove duplicate URLs

Output:

{
  "success": true,
  "url": "https://example.com",
  "links": [
    { "text": "About Us", "href": "https://example.com/about" },
    { "text": "Contact", "href": "https://example.com/contact" }
  ],
  "stats": {
    "total": 25,
    "internal": 20,
    "external": 5
  },
  "categorized": {
    "internal": [ ... ],
    "external": [ ... ]
  }
}
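The filtering inputs above can be sketched as a small post-processing function. This is not the piece's implementation: `processLinks` and `LinkOptions` are hypothetical names, and the rules are assumptions (a link is internal when its hostname equals the page's hostname, the regex is tested against the resolved absolute URL, and duplicates are detected by exact URL match).

```typescript
interface Link {
  text: string;
  href: string;
}

// Hypothetical options mirroring the inputs listed above.
interface LinkOptions {
  filterPattern?: string;    // regex source tested against the absolute URL
  includeExternal?: boolean; // assumed to default to true
  uniqueOnly?: boolean;      // assumed to default to false
}

function processLinks(pageUrl: string, links: Link[], opts: LinkOptions = {}) {
  const pageHost = new URL(pageUrl).hostname;
  const pattern = opts.filterPattern ? new RegExp(opts.filterPattern) : null;
  const seen = new Set<string>();
  const all: Link[] = [];
  const internal: Link[] = [];
  const external: Link[] = [];

  for (const link of links) {
    let absolute: URL;
    try {
      absolute = new URL(link.href, pageUrl); // resolve relative hrefs
    } catch (_err) {
      continue; // skip hrefs that cannot be parsed
    }
    const href = absolute.href;
    const isInternal = absolute.hostname === pageHost;

    if (pattern && !pattern.test(href)) continue;
    if (!isInternal && opts.includeExternal === false) continue;
    if (opts.uniqueOnly && seen.has(href)) continue;
    seen.add(href);

    const resolved: Link = { text: link.text, href };
    all.push(resolved);
    (isInternal ? internal : external).push(resolved);
  }

  return {
    links: all,
    stats: { total: all.length, internal: internal.length, external: external.length },
    categorized: { internal, external },
  };
}
```

Resolving every `href` against the page URL before comparing means relative links such as `/about` and their absolute forms are treated as the same URL when Unique Links Only is enabled.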

Use Cases

Content Monitoring: Track changes on web pages
Data Extraction: Extract structured data from websites
SEO Analysis: Analyze page content and link structure
Research Automation: Gather information from multiple sources
AI Workflows: Feed web content to AI models with minimal tokens

Why Plasmate?

16x fewer tokens than raw HTML on average
50x faster than headless browser solutions
30MB memory footprint vs 300MB+ for Chrome
Structured output with semantic roles, not tag soup

License

MIT License - see LICENSE for details.