README.md

May 20, 2026

Octopus

Desktop AI Agent Framework · Multi-tentacle Collaboration


🌟 Like an octopus, handle multiple things at once 🌟

🎬 Demo Videos

https://github.com/user-attachments/assets/29a64b38-3f98-4cbc-99d4-662f55cbec74

Text-to-Speech Demo

https://github.com/user-attachments/assets/ef0af274-e988-436f-a7da-a007e1a814ee

WeChat Channel Demo

https://github.com/user-attachments/assets/1de4e3d3-3397-46f8-a6b5-8f9dfef2b580


✨ Core Features

🚀 One-Click Deploy: No server, no YAML

  • ⚡ Double-click to install
  • 🐍 Embedded Python env
  • 💾 Portable USB mode
  • 🔒 Data stays local

💰 Cost Transparency: Know what you spend

  • 📊 Real-time token counter
  • 📈 Visual cost charts
  • ⚠️ Budget alerts
  • 🔄 Model cost comparison

🧩 Markdown Skills: Extend without coding

  • 📝 Write SKILL.md
  • 🔗 MCP protocol support
  • 📦 Install extensions from Git
  • ♻️ Hot-reload enabled

🔄 Visual Workflow: Build AI pipelines

  • 🎨 Drag-and-drop editor
  • 🧩 24 node types
  • 📋 Version management
  • 🔍 Run trace & debug

🤖 Visual SubAgent: Create AI workers

  • 🎨 GUI agent creator
  • 📁 Isolated workspaces
  • 🎯 Auto task dispatch
  • 🧠 Own config & memory

📚 Knowledge Base: Your second brain

  • 📄 Multi-format documents
  • 📝 Markdown notes
  • 🕸️ Knowledge graph
  • 🧠 AI-powered distillation

📡 Multi-Channel: Chat everywhere

  • 💬 Desktop / WeChat
  • 🐦 Slack / Discord
  • ✈️ Telegram / DingTalk
  • 📧 Email / Webhook

⏰ Smart Tasks: Actually run tasks

  • ▶️ SubAgent execution
  • 📅 Cron/interval/once
  • 💪 Survive restarts
  • 💬 Access context

🗂️ Project Isolation: Separate workspaces

  • ⚙️ Per-project config
  • 🔄 Switch instantly
  • 👥 Export for team
  • 💬 Never lose history

🔊 Text-to-Speech: Voice your AI

  • 🗣️ Multiple TTS engines
  • 🎵 Natural voice output
  • ⚙️ Customizable settings
  • 📱 Real-time playback

🧠 Observation & Memory: Learn from experience

  • 🔍 9 observation types
  • 📝 Auto-extract insights
  • 💾 Promote to memory
  • 👤 User profile tracking

📄 Multi-format Files: Read any document

  • 📑 PDF / DOCX / XLSX
  • 📊 PPTX preview
  • 🖼️ Image understanding
  • 🗜️ Context compression


🔄 Visual Workflow

Build complex AI pipelines with a drag-and-drop editor powered by ReactFlow:

Node Types (24 kinds)

| Category | Nodes |
| --- | --- |
| Flow | Workflow Start, Answer, Workflow End |
| AI | LLM, Question Classifier, Content Extractor |
| Tool | HTTP Request, Code Execution, Read Files, JSON Serialize/Deserialize, Text Editor |
| Logic | Condition Branch, Variable Update, Loop, Parallel Execution |
| Interaction | User Select, Form Input, Input, Plugin Output |
| Agent | Agent Node, Sub-Workflow |

Key Capabilities

  • Visual Editor: Drag-and-drop canvas with auto-layout
  • Node Testing: Test individual nodes in isolation before running the full workflow
  • Version Management: Save, compare, and restore workflow versions
  • Run Tracing: Step-by-step execution trace with variable inspection
  • Loop Support: Nested loop nodes with dedicated inner canvas
  • Templates: Pre-built templates for common patterns (simple chat, conditional branch)
  • Auto-save: 5-second debounce with dirty state indicator

📚 Knowledge Base

A complete knowledge management system with AI-powered capabilities:

Documents

  • Multi-format upload: PDF, DOCX, XLSX, PPTX, images, and more
  • Chunked upload: Files larger than 2 MB are automatically split into 2 MB chunks (max file size 500 MB)
  • AI Distillation: Extract key insights from documents using AI
  • Batch operations: Batch distill, move, and manage documents
  • Preview: In-app preview for all supported formats
  • Import/Export: ZIP-based import/export, Obsidian vault import
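
The chunked-upload rule above can be sketched in a few lines of Python. This is only an illustration of the arithmetic, not the project's upload code: the names `plan_chunks` and `Chunk` are hypothetical, and the sketch assumes "MB" means 1024 × 1024 bytes.

```python
from dataclasses import dataclass

# Assumption: 1 MB = 1024 * 1024 bytes; the README only states "2MB" and "500MB".
CHUNK_SIZE = 2 * 1024 * 1024
MAX_FILE_SIZE = 500 * 1024 * 1024


@dataclass(frozen=True)
class Chunk:
    index: int   # position of the chunk within the file
    offset: int  # byte offset where the chunk starts
    length: int  # number of bytes in the chunk


def plan_chunks(file_size: int) -> list[Chunk]:
    """Return the chunks a file of `file_size` bytes would be split into."""
    if file_size < 0:
        raise ValueError("file size must not be negative")
    if file_size > MAX_FILE_SIZE:
        raise ValueError("file exceeds the 500 MB limit")
    if file_size <= CHUNK_SIZE:
        # Small files are sent in one piece.
        return [Chunk(0, 0, file_size)]
    chunks: list[Chunk] = []
    offset = 0
    while offset < file_size:
        length = min(CHUNK_SIZE, file_size - offset)
        chunks.append(Chunk(len(chunks), offset, length))
        offset += length
    return chunks
```

Under these assumptions a 5 MB file yields three chunks: two full 2 MB chunks and a final 1 MB chunk.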

Notes

  • Markdown editor: Full-featured editor with wiki-link navigation
  • Vault system: Create and manage multiple knowledge vaults
  • Obsidian compatible: Import existing Obsidian vaults

Knowledge Graph

  • Visual exploration: WebGL-powered graph visualization (PixiJS)
  • Force-directed layout: Interactive node positioning
  • Relationship mapping: Discover connections between knowledge nodes

📡 Multi-Channel Support

Connect Octopus to your favorite platforms:

| Channel | Features |
| --- | --- |
| 🖥️ Desktop | Full-featured native app with real-time WebSocket communication |
| 💬 WeChat | QR code login, send/receive, auto-reply |
| 🐦 Slack | Bolt SDK integration, channel & DM support |
| 🎮 Discord | Bot integration, server & channel messaging |
| ✈️ Telegram | Bot API, chat & group support |
| 📱 DingTalk | Stream protocol, conversation messaging |
| 📧 Email | SMTP/IMAP integration |
| 🔗 Webhook | Generic HTTP webhook for custom integrations |
| 🐦 Feishu | Lark SDK, event subscription |

🔌 Extension Ecosystem

Skill Extensions (Just Markdown)

Write a SKILL.md file to teach AI new capabilities:

---
name: "Code Review"
emoji: "🔍"
---

When reviewing code, check for:
1. Security issues (SQL injection, XSS)
2. Performance bottlenecks
3. Naming conventions

Drop it into workspace/extensions/my-skill/SKILL.md and restart to activate.
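
To make the SKILL.md layout concrete, here is a minimal Python sketch that splits such a file into its front-matter fields and its instruction body. It is not the loader Octopus ships: `parse_skill` is a hypothetical helper, and it only handles flat `key: "value"` lines like those in the example above, not full YAML.

```python
def parse_skill(text: str) -> tuple[dict[str, str], str]:
    """Split SKILL.md text into (front-matter fields, instruction body)."""
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        # No front matter: the whole file is the instruction body.
        return {}, text.strip()
    meta: dict[str, str] = {}
    for i, line in enumerate(lines[1:], start=1):
        if line.strip() == "---":
            # Closing delimiter: everything after it is the body.
            return meta, "\n".join(lines[i + 1:]).strip()
        key, sep, value = line.partition(":")
        if sep:
            # Drop surrounding whitespace and quotes from the value.
            meta[key.strip()] = value.strip().strip("\"'")
    raise ValueError("front matter is not closed with '---'")
```

Calling `parse_skill` on the example file would return the `name` and `emoji` fields as a dictionary and the numbered checklist as the body.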

Extension Marketplace

  • Browse and install community extensions
  • Three extension types: Skill, Plugin, Worker
  • Search, filter by type, sort by popularity
  • One-click install with environment variable configuration

MCP Protocol Support

  • Connect to any MCP server (stdio / HTTP SSE)
  • Auto-discover tools, no manual configuration needed
  • Visual permission management with enable/disable per tool
  • Real-time connection status monitoring

🛠️ Built-in Tools

| Category | Tools | Description |
| --- | --- | --- |
| 📁 Filesystem | read, write, edit, list | File read/write operations |
| 🖥️ System | shell, spawn | Command execution |
| 🌐 Network | web_fetch | Web content fetching |
| 🖥️ Browser | browser_navigate, browser_click, browser_screenshot, ... | Playwright browser automation |
| 🖼️ Image | image_understand, image_generate | AI image processing |
| ⏰ Schedule | cron_add, cron_list, cron_remove | Task scheduling |
| 💬 Message | send_message | Multi-channel messaging |
| 🧠 Memory | memory_read, memory_write | Agent memory operations |
| 📚 Knowledge | knowledge_search, knowledge_query | Knowledge base retrieval |
| ⚡ Action | action | Execute extension actions |

🧠 Observation & Memory

Octopus automatically extracts insights from conversations:

Observation Types

| Type | Description |
| --- | --- |
| 🎯 Gotcha | Key findings and aha moments |
| 🔧 Problem-Solution | Problem-solution pairs |
| ⚙️ How-it-works | Explanations of how something works |
| 📝 What-changed | Change records |
| 🔍 Discovery | New discoveries |
| ❓ Why-it-exists | Rationale and reasons |
| 📋 Decision | Design decisions |
| ⚖️ Trade-off | Trade-off analysis |
| 💡 General | General observations |

Memory Features

  • Auto-extraction: AI identifies and extracts observations from conversations
  • Promote to memory: Elevate important observations to long-term memory
  • User profiles: Track user preferences and patterns
  • Contextual depth: View observations with surrounding conversation context

⚙️ Visual Configuration

Every setting has a graphical interface; no YAML is required:

| Config Item | Description |
| --- | --- |
| Model Providers | Add OpenAI, Anthropic, or DeepSeek; switch between multiple providers |
| Agent Settings | Model, max tokens, temperature, max iterations, compression |
| Channel Config | WeChat QR login, Telegram bot, Slack app, DingTalk, and more |
| Tool Toggles | Enable/disable tools with one click, set timeout |
| Workspace | Isolated workspaces with separate config and memory |
| Budget Limit | Set monthly token limit with over-budget alerts |
| Multimodal | Image understanding, TTS, and other multimodal settings |

💰 Token Usage Visualization

Monitor the cost of every conversation in real time:

  • 📊 Real-time Stats: Input/output tokens, cache hits, completion tokens, sub-agent usage
  • 📈 Historical Trends: View consumption by day (7/14/30 days)
  • 📋 Breakdown Tables: Per-provider and per-model cost analysis
  • ⚠️ Budget Alerts: Set limits with automatic warnings
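
A budget alert of this kind reduces to comparing usage against the configured limit. The sketch below shows one way to express that in Python; `budget_status` and the 80% warning threshold are assumptions made for illustration, not values taken from Octopus.

```python
def budget_status(used_tokens: int, monthly_limit: int, warn_ratio: float = 0.8) -> str:
    """Classify token usage against a monthly limit.

    `warn_ratio` is a hypothetical early-warning threshold (80% by default).
    """
    if monthly_limit <= 0:
        raise ValueError("monthly limit must be positive")
    ratio = used_tokens / monthly_limit
    if ratio >= 1.0:
        return "over_budget"   # limit reached or exceeded
    if ratio >= warn_ratio:
        return "warning"       # approaching the limit
    return "ok"
```

With a limit of 1,000,000 tokens, 500,000 used tokens would report `ok`, 900,000 would report `warning`, and 1,000,000 or more would report `over_budget`.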

⏰ Smart Scheduled Tasks

Not just notifications, but actual work:

  • SubAgent Execution: Tasks run in isolated agents, performing real operations
  • Flexible Scheduling: Supports ISO timestamps, intervals in seconds, and cron expressions
  • Context Inheritance: Tasks can access session memory from creation time
  • Persistent Storage: Tasks are saved in SQLite and survive restarts
  • Channel Delivery: Send task results to specific channels
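
The three schedule forms listed above (a one-off ISO time, an interval in seconds, a cron expression) can be told apart with a small amount of parsing. The Python sketch below is an assumption about how such input might be classified; `classify_schedule` is a hypothetical helper, it does not validate the individual cron fields, and the real scheduler may accept different formats.

```python
from datetime import datetime


def classify_schedule(spec: str) -> tuple[str, object]:
    """Return ("interval", seconds), ("cron", expression) or ("once", datetime)."""
    spec = spec.strip()
    if spec.isascii() and spec.isdigit():
        seconds = int(spec)
        if seconds <= 0:
            raise ValueError("interval must be a positive number of seconds")
        return "interval", seconds
    if len(spec.split()) == 5:
        # Five whitespace-separated fields: treat as a standard cron expression.
        return "cron", spec
    try:
        return "once", datetime.fromisoformat(spec)
    except ValueError as exc:
        raise ValueError(f"unrecognised schedule: {spec!r}") from exc
```

For example, `"300"` would be read as a 300-second interval, `"0 9 * * 1-5"` as a cron expression, and `"2026-05-20T09:00:00"` as a single run at that time.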

🗂️ Workspace Management

Each project has its own isolated workspace:

workspace/
├── project-a/          # Project A
│   ├── extensions/     # Exclusive extensions
│   ├── memory/         # Long-term memory
│   └── history/        # Chat history
├── project-b/          # Project B
│   └── ...

  • Switching workspaces switches the complete config and memory
  • Workspaces can be exported and imported
  • Team sharing: export a workspace so colleagues can import and use it
  • Built-in file browser with Monaco Editor
  • Multi-format preview: PDF, DOCX, XLSX, PPTX, images, Markdown

💬 Chat History

  • All conversations are saved in local SQLite
  • 3-level organization: Channel → Session → Instance
  • Filter by message type, search across history
  • Return to any historical session at any time
  • Supports multiple parallel sessions

🤖 Visual SubAgent

Create and manage specialized agents through the UI:

  • Visual Editing: Modify SOUL.md to configure role, tools, model
  • One-click Creation: Fill in name to auto-generate template config
  • Isolated Workspace: Each SubAgent has its own config and memory
  • Master-Slave Dispatch: Main agent automatically calls appropriate SubAgent
  • Tool & Extension Binding: Assign specific tools and extensions per agent

🚀 Quick Start

Requirements

  • Node.js >= 18
  • Python >= 3.10

Install & Run

# 1. Clone repository
git clone <repository-url>
cd octopus

# 2. Install dependencies
npm install

# 3. Start development mode
npm run dev

💡 npm run dev starts all of the following:

  • Frontend dev server (http://localhost:3000)
  • Electron desktop window
  • Python backend (auto-started by Electron)

📦 Build & Release

Development Commands

| Command | Description |
| --- | --- |
| npm run dev | Dev mode (frontend + Electron) |
| npm run dev:frontend | Frontend dev server only |
| npm run dev:electron | Electron only |

Build Commands

| Command | Description |
| --- | --- |
| npm run build:frontend | Build React frontend |
| npm run build:python | Package Python backend |
| npm run build | Full build (frontend + Electron) |

Package & Release

| Command | Description | Output |
| --- | --- | --- |
| npm run dist | Package current platform | Auto-select by platform |
| npm run dist:mac | macOS package | DMG + ZIP (universal: x64/arm64) |
| npm run dist:win | Windows package | NSIS installer + portable |

📂 Output: dist-electron/

📖 Detailed guide: README_BUILD.md


🏗️ Project Architecture

octopus/
├── agents/                 🧠 AI Agent workspace
│   ├── code-reviewer/      Code review agent
│   ├── common/             Common agent templates
│   └── system/             System agent config
│       └── avatars/        Agent avatar assets
├── backend/                ⚡ Python backend
│   ├── agent/              Agent core logic
│   │   ├── processors/     Streaming / non-streaming / longtask processors
│   │   ├── compressor.py   Context compression
│   │   ├── subagent.py     SubAgent dispatch
│   │   └── observation_*.py Observation extraction & management
│   ├── api/                FastAPI service interface
│   ├── channels/           Multi-channel support
│   │   ├── desktop/        Desktop channel (WebSocket)
│   │   ├── wechat/         WeChat channel
│   │   ├── feishu/         Feishu/Lark channel
│   │   ├── dingtalk/       DingTalk channel
│   │   ├── slack/          Slack channel
│   │   ├── discord/        Discord channel
│   │   ├── telegram/       Telegram channel
│   │   ├── email/          Email channel
│   │   └── webhook/        Webhook channel
│   ├── core/               Core modules
│   │   ├── config/         Configuration & schema
│   │   ├── events/         Event bus system
│   │   ├── longtask/       Long-running task management
│   │   ├── models/         Data models
│   │   └── providers/      LLM provider adapters (OpenAI/Anthropic)
│   ├── data/               Data storage (SQLite)
│   │   ├── migrations/     Database migrations (11 migrations)
│   │   └── schema/         Data schemas (agent/session/token/workflow/...)
│   ├── extensions/         Plugin system
│   │   ├── builtin/        Built-in extensions (cron, etc.)
│   │   └── loader.py       Dynamic extension loader
│   ├── mcp/                MCP protocol integration
│   │   ├── server/         MCP server connection & tool registry
│   │   └── llm_bridge.py   LLM-MCP bridge
│   ├── services/           Service layer
│   │   ├── cron/           Scheduled task service
│   │   ├── tts/            Text-to-speech (OpenAI/MiMo engines)
│   │   ├── workflow/       Workflow engine & executor
│   │   ├── knowledge_*.py  Knowledge base services
│   │   ├── image_service.py Image generation service
│   │   └── llm_service.py  LLM invocation service
│   ├── tools/              Built-in tools
│   │   ├── filesystem.py   Filesystem tools
│   │   ├── shell.py        Shell tools
│   │   ├── web_fetch.py    Web fetch tools
│   │   ├── browser/        Playwright browser automation
│   │   ├── image.py        Image processing tools
│   │   ├── cron.py         Cron task tools
│   │   ├── message.py      Message tools
│   │   ├── memory.py       Memory read tools
│   │   ├── memory_write.py Memory write tools
│   │   ├── knowledge.py    Knowledge base tools
│   │   ├── action.py       Extension action tools
│   │   └── spawn.py        Process spawn tools
│   └── utils/              Utility functions
├── electron/               🖥️ Electron main process
│   ├── main.js             Main entry (Python lifecycle, window management)
│   └── preload.js          Preload script (IPC bridge)
├── frontend/               🎨 React frontend
│   ├── src/
│   │   ├── pages/          Page components
│   │   │   ├── Chat/       Chat interface with streaming & tool display
│   │   │   ├── Config/     Settings (providers/agent/channels/multimodal)
│   │   │   ├── Workflow/   Visual workflow editor (ReactFlow)
│   │   │   ├── Knowledge/  Knowledge base (documents/notes/graph)
│   │   │   ├── Agents/     SubAgent management
│   │   │   ├── MCP/        MCP server & tool management
│   │   │   ├── Extensions/ Extension marketplace
│   │   │   ├── Cron/       Scheduled tasks
│   │   │   ├── Tokens/     Token usage dashboard
│   │   │   ├── History/    Chat history browser
│   │   │   ├── Memory/     Observation & memory viewer
│   │   │   └── Workspace/  File browser & editor
│   │   ├── components/     Shared components
│   │   │   ├── MessageList/ Message rendering with iteration folds
│   │   │   ├── TTSPlayer/  Audio playback
│   │   │   ├── TaskIndicator/ Task status indicator
│   │   │   └── MermaidDiagram/ Mermaid chart rendering
│   │   ├── workflow/       Workflow engine
│   │   │   ├── components/ Node components (13 registered types)
│   │   │   ├── hooks/      Zustand workflow store
│   │   │   ├── types/      Type definitions
│   │   │   └── templates/  Workflow templates
│   │   ├── contexts/       React contexts (WebSocket, DistillTask)
│   │   ├── hooks/          Custom hooks (useChatState, useMermaid)
│   │   └── utils/          Utilities
│   └── package.json
├── build/                  🔧 Build resources (icons, etc.)
├── workspace/              📂 Workspace data (runtime-generated, git-ignored)
├── build_python.py         🐍 Python packaging script (PyInstaller)
├── package.json            📋 Project config & scripts
└── README.md               📖 Project documentation

Tech Stack

| Layer | Technology | Description |
| --- | --- | --- |
| Frontend | React 18 + Vite 5 | Modern UI framework |
| | Ant Design 6 | Component library |
| | ReactFlow | Visual workflow editor |
| | Monaco Editor | Code editor |
| | ECharts 6 | Data visualization |
| | PixiJS | Knowledge graph WebGL rendering |
| | Zustand | Workflow state management |
| Backend | Python 3.10+ + FastAPI | High-performance async web service |
| | SQLite + SQLAlchemy | Local lightweight database |
| | Playwright | Browser automation |
| | APScheduler | Task scheduling |
| Desktop | Electron 28 | Cross-platform desktop framework |
| | electron-builder | App packaging tool |

Runtime Architecture

┌──────────────────────────────────────────────┐
│                Electron Main Process         │
│  ┌──────────────────┐  ┌──────────────────┐  │
│  │  BrowserWindow   │  │ Python Process   │  │
│  │  (React SPA)     │  │ (octopus-server) │  │
│  │                  │  │                  │  │
│  │  electronAPI ────┼──┼──▶ FastAPI       │  │
│  │  (preload bridge)│  │    (WebSocket)   │  │
│  └──────────────────┘  └──────────────────┘  │
└──────────────────────────────────────────────┘

  • Communication: Full WebSocket between frontend and backend
  • Request-Response: request_id based correlation with timeout
  • Event Subscription: Pub/Sub pattern for real-time events
  • Python Lifecycle: Managed by Electron (auto-start/stop)

🔧 Model Configuration

Add API keys in the app settings panel:

Supported Providers

| Provider | Representative Models |
| --- | --- |
| OpenAI | GPT-4o, GPT-4 Turbo, GPT-3.5 Turbo, o1 |
| Anthropic | Claude 3 Opus, Claude 3 Sonnet, Claude 3 Haiku |
| Google | Gemini Pro, Gemini Ultra |
| DeepSeek | DeepSeek Chat, DeepSeek Coder |
| Alibaba | Tongyi Qianwen series |
| Baidu | Wenxin Yiyan series |
| Custom | Any OpenAI-compatible API endpoint |

Configuration Steps

  1. Open app → Settings → Model Providers
  2. Add provider (select or custom)
  3. Enter API Key & Base URL
  4. Select model to use
  5. Save and start

🔌 MCP Protocol

Octopus fully supports the Model Context Protocol (MCP):

  • 🔗 Connect to any MCP server
  • 🛠️ Use tools provided by MCP
  • 🔐 Secure permission management
  • 🔄 Real-time connection monitoring
  • 📋 Visual server management (add/edit/delete/reconnect)
  • 🔍 Auto-discovered tools with per-tool enable/disable

Supported Transports

  • stdio: Local process communication
  • HTTP SSE: Server-Sent Events over HTTP

🤖 Agent Workspace

The agent system supports persistent memory and personalization:

Configuration Files

| File | Purpose |
| --- | --- |
| SOUL.md | Agent soul: core principles and personality |
| IDENTITY.md | Agent identity: self-introduction |
| AGENTS.md | Workspace guide: usage instructions |
| MEMORY.md | Long-term memory: persistence of important information |
| memory/YYYY-MM-DD.md | Daily notes: daily event records |

Creating Custom Agents

To create a custom agent, create a new folder in the agents/ directory and add the configuration files to it.
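
As a rough illustration, the folder for a new agent could be scaffolded with a short Python script. This is a sketch under stated assumptions: `scaffold_agent` is a hypothetical helper, the placeholder text is invented, and the README does not say which of the files are mandatory.

```python
from pathlib import Path

# Placeholder contents are illustrative only; replace them with real instructions.
AGENT_FILES = {
    "SOUL.md": "# Soul\n\nCore principles and personality.\n",
    "IDENTITY.md": "# Identity\n\nSelf-introduction.\n",
    "AGENTS.md": "# Workspace guide\n\nUsage instructions.\n",
    "MEMORY.md": "# Long-term memory\n",
}


def scaffold_agent(agents_dir: Path, name: str) -> Path:
    """Create agents_dir/name with placeholder config files and a memory/ folder."""
    agent_dir = agents_dir / name
    # memory/ holds the daily notes (memory/YYYY-MM-DD.md).
    (agent_dir / "memory").mkdir(parents=True, exist_ok=True)
    for filename, placeholder in AGENT_FILES.items():
        path = agent_dir / filename
        if not path.exists():
            # Never overwrite files the user has already edited.
            path.write_text(placeholder, encoding="utf-8")
    return agent_dir
```

For example, `scaffold_agent(Path("agents"), "my-agent")` would create `agents/my-agent/` with the four Markdown files and an empty `memory/` folder, leaving any existing files untouched.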


📂 Project Structure

octopus/
├── backend/              # Python backend (FastAPI)
├── frontend/             # React frontend (Vite)
├── electron/             # Electron main process
├── build/                # Build resources (icons, etc.)
├── build_python.py       # Python packaging script
├── workspace/            # ⚠️ Runtime-generated directory (git-ignored)
│   ├── agents/           #   - User-created agent configurations
│   ├── extensions/       #   - Installed extensions
│   ├── files/            #   - Workspace files
│   ├── images/           #   - Generated images
│   └── ...               #   - Other runtime data
└── scripts/              # Helper scripts

Note: The workspace/ directory is created at runtime and contains user data, agent configs, and generated files. It's excluded from version control by .gitignore.


📖 Documentation


🤝 Contributing

Issues and pull requests are welcome:

  • 🐛 Bug reports
  • ✨ New features
  • 📝 Documentation improvements
  • 🎨 UI/UX optimizations

📋 Changelog

2026-05

| Date | Version | Changes |
| --- | --- | --- |
| 2026-05-17 | v1.0.0 | 🔄 New: Visual Workflow editor with 24 node types |
| 2026-05-17 | v1.0.0 | 📚 New: Knowledge Base with documents, notes, graph |
| 2026-05-17 | v1.0.0 | 📡 New: Multi-channel support (Slack/Discord/Telegram/...) |
| 2026-05-17 | v1.0.0 | 🌐 New: Playwright browser automation tools |
| 2026-05-17 | v1.0.0 | 🧠 New: Observation & memory system |

2026-03

| Date | Version | Changes |
| --- | --- | --- |
| 2026-03-29 | v1.0.0 | 🔊 New: Text-to-Speech (TTS) feature support |
| 2026-03-29 | v1.0.0 | 🤖 New: SubAgent management and UI improvements |
| 2026-03-28 | v1.0.0 | 🗜️ New: Context compression and LLM retry optimization |
| 2026-03-25 | v1.0.0 | 📄 New: PDF, DOCX, and Excel file support |
| 2026-03-24 | v1.0.0 | 💬 New: WeChat channel with QR login and messaging |
| 2026-03-22 | v1.0.0 | 🖼️ New: Frameless window support |
| 2026-03-20 | v1.0.0 | 🎉 Release: Project renamed to Octopus |

🐙 Octopus makes your work more efficient 🐙

Built with ❤️ and 🐙 tentacles