FastGPT Plugin System Design

June 27, 2026 · View on GitHub

Introduction

FastGPT-Plugin v1.0.0 systematically refactors the plugin project so plugin installation, version management, runtime isolation, and operations configuration share one model. The main changes are:

Abstract the plugin package protocol and use the unified .pkg format to manage different plugin types, with extension points reserved for future plugin types.
Introduce the local process-pool runtime to isolate plugin execution in separate processes, improving stability, performance, and security boundaries.
Reserve extension points for a Serverless runtime that can later host user-uploaded custom plugins.
Decouple middleware dependencies of the plugin service from the FastGPT main service, reducing deployment and operations coupling.

Terms

Plugin: an independent, reusable capability module that contains specific business logic. Plugins can have different types, such as tools, model presets, and dataset sources.

Plugin package: the packaged .pkg file for a plugin. All plugin types complete installation, updates, and management through plugin packages.

Tool: a plugin type that usually wraps third-party services, internal APIs, or local computation and can be called by workflows and Agents.

Plugin Marketplace: the centralized platform for managing plugins, where users can search, download, and install plugins.

Runtime: the backend implementation responsible for executing plugin code. The current default production runtime is the local process pool. The Connection Gateway debug runtime is used for remote debugging. The Serverless runtime is reserved for future extension.

Pod / worker node: a single plugin child process in the local process pool. One plugin service can own multiple Pods, and each Pod can process concurrent requests according to configuration.

Debug channel: a remote debugging entrypoint enabled by FastGPT for one tmbId. Plugin Server generates a long-lived connectionKey; CLI exchanges it for a short-lived WebSocket connectToken and connects local plugins through Connection Gateway.

Source: plugin source identifier. System plugins normally use regular sources. Remote debugging uses debug:tmbId:{tmbId} to route plugin listing, detail, and invocation to the corresponding Gateway session.

FastGPT Plugin System Architecture

flowchart TB
    subgraph Marketplace["FastGPT Plugin Marketplace"]
        Official["Official Plugins"]
        Community["Community Plugins"]
        Community --> |Curated inclusion| Official
    end

    CommunityUser["Community User"]
    Maintainer["Official Maintainer"]
    Agent["Agent"]

    CommunityUser -->|Provides plugins| Community
    Agent --> |Provides plugins| Community
    Maintainer --> |Reviews and manages| Community
    Maintainer -->|Maintains plugins| Official

    subgraph FastGPT["FastGPT Main Service"]
        Workflow["Workflow / Agent"]
    end

    subgraph Plugin["FastGPT Plugin Service"]
        PluginServer["Plugin Service"]
        DebugOverlay["DebugPluginRepoOverlay"]
        RuntimeRouter["Composite Runtime Manager"]
        subgraph ProcessPool["Local Process Pool"]
            PluginProcess["Plugin Pod"]
            PluginProcess2["Plugin Pod"]
            PluginProcess3["Plugin Pod"]
        end
    end

    subgraph Gateway["Connection Gateway"]
        GatewayHTTP["Internal HTTP API"]
        GatewayWS["WebSocket Debug Channel"]
    end

    LocalCLI["fastgpt-plugin dev"]

    subgraph Serverless["Serverless Runtime"]
        PluginServerless["Plugin Serverless Instance"]
        PluginServerless2["Plugin Serverless Instance"]
        PluginServerless3["Plugin Serverless Instance"]
    end

    subgraph External["External Services"]
        Google["Google Search API"]
        OpenAI["OpenAI API"]
    end

    Workflow -->|Calls plugin| Plugin
    FastGPT -->|Gets plugins| Marketplace

    PluginServer --> DebugOverlay
    PluginServer --> RuntimeRouter
    RuntimeRouter -->|local-pool| ProcessPool
    RuntimeRouter -->|debug source| GatewayHTTP
    LocalCLI -->|WSS bind / metadata / stream| GatewayWS
    GatewayHTTP --> GatewayWS
    RuntimeRouter -->|reserved| Serverless
    PluginProcess --> Google
    PluginProcess2 --> OpenAI

FastGPT-Plugin Service

The FastGPT-Plugin service is responsible for plugin package management, runtime registration, plugin call forwarding, and system-level configuration. The FastGPT main service calls plugins through the plugin runtime interface, and the plugin service dispatches each call to the corresponding runtime.

FastGPT-Plugin supports multiple runtimes:

local-pool: the default production runtime. Plugins run in local child process pools.
connection-gateway-debug: the remote debug runtime. It is active only under debug sources and forwards invocations to local CLI through Connection Gateway.
Serverless: a reserved runtime with extension points in interfaces and data structures.

CompositePluginRuntimeManager selects the runtime. Normal system plugin invocations enter local-pool. Invocations with debug source context enter ConnectionGatewayDebugRuntimeManager. If the debug path is disconnected or the session does not exist, the call fails directly and does not fall back to the production runtime.

System plugins can be installed in two ways:

System-level installation: an administrator uploads a plugin or installs it from the Plugin Marketplace on the plugin management page. The plugin is visible to the whole system.
Team-level installation, not yet implemented: a team administrator or a member with plugin management permission uploads a plugin, and the plugin is visible only to that team.

After a plugin is installed, the service saves the plugin package file, parses plugin metadata, and registers the plugin with the runtime when it is enabled. Runtime configuration is saved per plugin. When no configuration record exists, defaults from environment variables are used.

Remote Debug Design

Remote debugging lets developers run plugins locally and temporarily attach them to a FastGPT test environment. It is a debug connection layer, not a production plugin runtime.

Current flow:

FastGPT authenticates the current user and resolves tmbId.
FastGPT calls Plugin Server POST /plugin/debug-sessions to enable the debug channel.
Plugin Server creates source debug:tmbId:{tmbId} and generates a long-lived connectionKey. The server persists only connectionKeyHash; plaintext key is returned only on creation or refresh.
Local CLI runs fastgpt-plugin dev and calls POST /plugin/debug-sessions/connection-key:exchange with the connectionKey.
Plugin Server validates the connection key, signs a short-lived WebSocket connectToken, and returns gatewayUrl, source, connectToken, and expiresAt.
CLI connects to Connection Gateway and sends bind plus local plugin metadata.
Plugin Server reads metadata through Gateway status and temporarily merges local plugins into plugin and tool lists under the debug source.
When FastGPT invokes a debug plugin, Plugin Server publishes a request envelope through Gateway internal API. CLI executes the local plugin and streams results back.

sequenceDiagram
    participant FastGPT as FastGPT
    participant Plugin as Plugin Server
    participant Gateway as Connection Gateway
    participant CLI as fastgpt-plugin dev

    FastGPT->>Plugin: Enable debug channel with tmbId
    Plugin-->>FastGPT: source, keyId, connectionKey
    CLI->>Plugin: Exchange connectionKey
    Plugin-->>CLI: gatewayUrl, source, connectToken
    CLI->>Gateway: WSS bind + metadata
    FastGPT->>Plugin: List / invoke debug plugin
    Plugin->>Gateway: status / request stream
    Gateway->>CLI: request envelope
    CLI-->>Gateway: response stream
    Gateway-->>Plugin: NDJSON stream
    Plugin-->>FastGPT: tool stream

Debug APIs:

API	Description
`POST /plugin/debug-sessions`	Enable a `tmbId`-scoped debug channel.
`POST /plugin/debug-sessions/key:refresh`	Refresh the long-lived connection key and close the old Gateway session.
`POST /plugin/debug-sessions/connection-key:exchange`	Exchange the long-lived key for short-lived WebSocket connection info.
`GET /plugin/debug-sessions/:tmbId`	Get debug channel status and mounted local plugins.
`POST /plugin/debug-sessions/:tmbId/revoke`	Close the current debug session; key semantics are controlled by server-side state.

Debug source plugin queries are handled by DebugPluginRepoOverlay. It splits requested sources into debug sources and regular sources. Debug sources read local plugin metadata from Gateway session metadata; regular sources continue to read from the persistent plugin repository; results are merged afterwards.

Debug invocation is handled by ConnectionGatewayDebugRuntimeManager. It requires a connected Gateway session with ownerAlive=true, then sends a plugin-debug.run envelope. CLI receives only a short-lived connect token and does not need CONNECTION_GATEWAY_AUTH_TOKEN or JWT_SECRET.

For Connection Gateway long-connection protocol, sessions, mailbox, owner leases, and limits, see Connection Gateway Design.

System-Level Plugin Management

System administrators, or root users, can manage system-level plugin status, secrets, and runtime parameters.

Plugin Status Configuration

Plugins can have three statuses:

Normal: the plugin is available for normal use.
Pending offline: the plugin is marked for offline. Existing workflows continue to run, but the plugin can no longer be added to new workflows.
Offline: the plugin cannot be used.

System Secret Configuration

System-level plugins can configure "system secrets" that other users in the system can reuse when invoking the plugin. Secrets are hosted by the plugin service. Callers reference them through plugin configuration and never access plaintext secrets directly.

Local Process Pool Parameters

Each tool plugin can configure four runtime parameters.

config

Minimum worker nodes: default 0. If set above 0, worker nodes are warmed up when the plugin is registered or its configuration is updated, and the service tries to maintain at least this many Pods. This fits high-IO and cold-start-sensitive plugins.
Maximum worker nodes: default 5. When no Pod is available and the current Pod count has not reached the limit, the scheduler scales out. CPU-heavy plugins can raise this value to use multiple cores, while also considering host memory and POOL_MAX_TOTAL_PODS.
Node timeout: default 120000ms. This is the execution timeout for one plugin call inside a Pod and can be raised for long-running plugins.
Maximum concurrent requests per node: default 10. This is the maximum number of concurrent requests one Pod can process. High-IO and low-CPU plugins can raise it; CPU-heavy plugins should keep it lower.

Local Process Pool Scheduling

local-pool manages Pods and request queues per plugin service. After a call enters a service, scheduling proceeds as follows:

Prefer an existing available Pod and dispatch the request immediately.
If no Pod is available and pods + pendingPods < maxPods, create a new Pod first and dispatch the current request after startup succeeds.
If maxPods has been reached, startup backoff is active, or a Pod cannot be created temporarily, the request enters a bounded queue.
When a Pod is released, startup succeeds, configuration is updated, or a crash is recovered, the queue continues to drain.
When queue length reaches maxQueueSize, new requests are rejected. Requests also fail after waiting longer than queueTimeout.

The queue is the backpressure mechanism after capacity is exhausted. Scale-out does not depend on the queue being full. pendingPods count toward capacity to prevent concurrent cold starts from exceeding maxPods.

Environment Variables

Environment variables provide default runtime parameters and global limits:

Environment variable	Description
`POOL_HEALTH_CHECK_INTERVAL`	Health check interval in milliseconds. The process pool checks registered plugin services at this interval.
`POOL_MAX_TOTAL_PODS`	Total limit for all plugin Pods in the current server process. This quota is checked during plugin registration and configuration updates.
`POOL_SERVICE_MIN_PODS`	Default minimum worker nodes for one plugin.
`POOL_SERVICE_MAX_PODS`	Default maximum worker nodes for one plugin.
`POOL_SERVICE_IDLE_TIMEOUT`	Pod idle recycle time in milliseconds.
`POOL_SERVICE_POD_TIMEOUT`	Execution timeout for one plugin call in milliseconds.
`POOL_SERVICE_MAX_CONCURRENT_REQUESTS_PER_POD`	Default maximum concurrent requests for one Pod.
`POOL_SERVICE_MAX_REQUESTS_PER_POD`	Maximum requests one Pod can process before replacement; this reduces memory leak risk from long-running processes.
`POOL_SERVICE_MAX_QUEUE_SIZE`	Maximum request queue capacity for one plugin service. New requests are rejected after this limit.
`POOL_SERVICE_QUEUE_TIMEOUT`	Maximum time a request can wait in queue for an available Pod, in milliseconds.
`POOL_SERVICE_STARTUP_RETRY_BASE_DELAY`	Base delay for exponential backoff after Pod startup timeout, in milliseconds.
`POOL_SERVICE_STARTUP_RETRY_MAX_DELAY`	Maximum delay for exponential backoff after Pod startup timeout, in milliseconds.
`CONNECTION_GATEWAY_BASE_URL`	Gateway internal HTTP base URL for Plugin Server.
`CONNECTION_GATEWAY_PUBLIC_URL`	WebSocket URL returned to CLI.
`CONNECTION_GATEWAY_AUTH_TOKEN`	Bearer token used by Plugin Server for Gateway internal API.
`CONNECTION_GATEWAY_DEBUG_REQUEST_TIMEOUT_MS`	Timeout while waiting for CLI responses during remote debug invocation.

Pod startup errors are recorded and classified. Consecutive non-timeout startup failures trigger startup circuit breaking after the threshold is reached, preventing more Pods from being created. Startup timeouts are treated as resource pressure, enter exponential backoff, and retry later. For detailed scheduling, recycling, and metrics design, see Process Pool Design.