EditDeck

April 15, 2026 · View on GitHub

An end-to-end pipeline from requirement text to slide images, standard PPTX, and editable PPTX.

English Case: Image-only Deck -> Fully Editable Deck -> Editing View

1) Image-only deck	2) Fully editable deck	3) Editing view

Why · Highlights · Public Site · Showcase · Quick Start · Usage · Configuration · FAQ · 中文文档

Why

Making slides usually means bouncing between too many disconnected steps: shaping the story, drafting page content, generating visuals, exporting a deck, and then rebuilding everything again if you need a truly editable presentation.

EditDeck turns that fragmented process into one continuous workflow:

Start from a plain-language requirement and turn it into a structured PPT outline with page-level content
Render every slide into polished visual images and package them into a standard pptx
Continue from the generated run directory or from existing slide images to rebuild an editable pptx
Coordinate text models, image models, editable deck generation, and MinerU parsing through one unified YAML config
Expose the same workflow through Web UI, CLI, and HTTP API so it fits both hands-on usage and system integration

If your ideal workflow is "get the visual draft fast, then keep going until the deck becomes editable and usable," this project is built for exactly that.

Highlights

Single config entry point: the project reads only config/app.yaml by default
Dual workflow support: generate from scratch or re-generate an editable PPT from existing images
Complete editable pipeline: image parsing, element extraction, placeholder matching, and browser export are all wired up
Better cross-platform support: the browser path can be left empty — at runtime it auto-detects from explicit arguments, environment variables, and the system PATH
Straightforward overrides: both CLI arguments and Web/API request parameters can override the config file at runtime

Public Site

We also run a public-interest site at editdeck.top so more people can try the open-source project for free. If the main domain is temporarily unavailable, you can also use the fallback address http://118.195.198.236/.

After starring the project, each user gets 5 free generations
Daily check-in adds 1 extra free generation
Because of server cost limits, you may occasionally run into unexpected errors
If you do, feedback is very welcome

Showcase

Following the side-by-side comparison style used by PPTAgent, each case is shown in two columns: the left side is the image-only result, and the right side is the fully editable reconstruction.

Case 1 · AI Productivity Playbook for Knowledge Teams

Prompt

Create a professional presentation in English titled "AI Productivity Playbook for Knowledge Teams". Include practical workflows, tool stacks, ROI metrics, and a 90-day rollout plan. Keep all slide text strictly in English.

Image-only Deck	Fully Editable Deck

Case 2

Prompt

做一份《AI 客服知识库升级方案》PPT，面向企业数字化与客服平台主管，围绕现状痛点、升级目标、知识中台架构、问答流程优化、运营指标与实施计划展开，适合六页呈现。

Style

PPT 整体呈现 16:9 宽屏深色科技风，主色以深海军蓝、冷青蓝、电光蓝为核心，辅以少量高亮青色作为数据强调色，背景以深蓝黑渐变、细密网格、弱发光线条和玻璃质感面板构成。字体采用思源黑体 / 苹方 / 微软雅黑体系，标题更厚重，正文更克制，层级鲜明。页面强调中轴对齐与模块化网格系统，常用大标题横向锚点、分栏数据卡、发光描边图表、半透明信息面板与线性科技图标。整体气质冷静、专业、偏未来感，但必须保持高可读性，避免赛博朋克式杂乱，避免大面积紫色，强调企业级 AI 产品汇报的理性秩序。

Image-only Deck	Fully Editable Deck

Case 3

Prompt

做一份《企业 Copilot 落地的双引擎实施路线图》PPT，面向集团管理层汇报，围绕数据治理引擎与业务应用引擎两条主线，讲清建设背景、核心痛点、总体架构、分阶段路线图、试点场景、投入产出与风险控制，适合六页呈现。

Style

PPT 整体呈现16:9 宽屏学术商务风，以微软红蓝为主色调，搭配浅灰与蓝灰底色，采用思源黑体 / 苹方 / 微软雅黑体系，层级清晰、信息密度适中。页面采用顶部标题栏 + 左侧纵向时间轴 + 右侧斜向分区双引擎的差异化构图，所有卡片统一 18–22px 圆角与轻量阴影，模块标题使用胶囊色块，重要容器配以红蓝虚线外框，搭配统一线宽线性图标强化视觉规范。

Image-only Deck	Fully Editable Deck

Quick Start

1. Install Dependencies

pip install -r requirements.txt

2. Edit Configuration

Edit config/app.yaml.

The api_key fields are left empty by default in the project template
You can keep using the base_url addresses already in the file
See config/README.md for a more complete field reference

3. Choose How to Run

Start the web server:

uvicorn webapp.main:app --host 0.0.0.0 --port 8000 --reload

Open in your browser:

http://127.0.0.1:8000/

Or use the CLI directly:

python -m app.cli generate "Create an AI office productivity improvement plan"

Usage

Web

The web entry point is provided by webapp/main.py, ideal for filling in requirements, styles, and runtime parameters directly on the page.

Style input supports two modes: style_description (text) or style_template (reference image), choose one
Upload reference files via source_files (multiple files): .txt / .md / .pdf / .docx
In replica mode, upload source slide images via replica_images (multiple files): .png / .jpg / .jpeg / .webp

CLI

Generate images and a standard PPT only:

python -m app.cli generate "Create an AI office productivity improvement plan" \
  --slide-count auto \
  --export-mode both

Generate a PPT with requirements and supplementary reference documents (supports pdf, docx, txt, md, etc.):

python -m app.cli generate "基于我的文档中的国内外研究现状，做一份现状分析ppt" \
  --slide-count auto \
  --export-mode both \
  --source-file data.docx

Generate with style and uploaded reference files:

python -m app.cli generate "Create an AI customer service knowledge base upgrade deck" \
  --style-description "16:9 dark tech style, blue-cyan palette, high readability" \
  --source-file ./docs/brief.md \
  --source-file ./docs/customer_faq.pdf \
  --export-mode both

Generate a standard PPT and then continue to produce an editable PPT:

python -m app.cli generate "Create an AI office productivity improvement plan" \
  --editable-ppt \
  -edit

Continue generating an editable PPT from an existing run directory:

python -m app.cli editable \
  --run-dir ./generated/<run_id> \
  --output-dir ./generated/<run_id>/editable_deck \
  -edit

Generate an editable PPT directly from existing images:

python -m app.cli editable \
  --image ./generated/run_xxx/slide_01.png \
  --image ./generated/run_xxx/slide_02.png \
  --output-dir ./generated/run_xxx/editable_deck \
  -edit

Common parameters:

--config-file: specify a config file; defaults to config/app.yaml
--style-description: specify style via text
--style-template: specify style via image
--source-file: upload reference files (repeatable)
--editable-ppt: continue generating an editable PPT after image generation
-edit / --edit: enable the currently available editable asset matching backend
--mineru-api-key: override mineru.api_key as needed
--force-reextract-assets: force re-extraction of elements
--disable-asset-reuse: prevent a single asset from being reused across multiple placeholders

Notes:

--style-description and --style-template are mutually exclusive
CLI arguments take priority over YAML configuration

HTTP API

Main endpoints:

GET /api/health: health check
POST /api/generate: synchronous generation
POST /api/generate/start: asynchronous generation
POST /api/editable/start: start an editable PPT task from an existing run_id
GET /api/generate/status/{job_id}: query async task status

To produce an editable PPT directly during the generation phase, include the following in your request:

generate_editable_ppt=true
asset_backend=edit

MinerU is required for editable PPT asset extraction, and for parsing image-only (scanned) PDF reference files. When config/app.yaml does not have a usable mineru.api_key, pass mineru_api_key explicitly in the request.

Configuration

The project keeps a single main config file:

config/app.yaml

Config sections:

app: output directory and default slide count
models.text: models for outline and copy generation
models.editable: the editable PPT generation pipeline
models.image: image generation model
mineru: page element parsing and asset extraction

Useful fallback rules:

If models.image.api_key is empty, it falls back to models.text.api_key
If models.editable.base_url is empty, it falls back to models.text.base_url
If models.editable.api_key is empty, it falls back to models.text.api_key
If mineru.api_key is empty, it further falls back to models.editable.api_key and then models.text.api_key
models.editable.browser_path can be left empty — at runtime it tries explicit arguments, environment variables, and the system PATH

For a complete example and field reference, see config/README.md.

FAQ

Editable PPT reports a missing key

Check the following in order:

mineru.api_key in config/app.yaml
--mineru-api-key in CLI arguments
mineru_api_key in Web / API requests

Browser execution or download fails

Troubleshoot in this order:

First, leave models.editable.browser_path empty
If you need to specify a browser explicitly, pass --editable-browser-path
Or set one of these environment variables: EDITABLE_PPT_BROWSER_PATH, CHROME_PATH, GOOGLE_CHROME_BIN, CHROMIUM_PATH, BROWSER_PATH
If no browser is available on the system, run playwright install chromium

Placeholder replacement looks off

Try:

Increase mineru.max_refine_depth
Enable --force-reextract-assets
Enable --disable-asset-reuse

Just want to reuse existing assets

Use --assets-json, though this is currently best suited for directly specifying an existing assets.json in single-image mode.

Let's Connect

We welcome conversations with researchers, developers, and students who are interested in editable presentation generation, document intelligence, and practical deployment workflows.

If you would like to discuss technical ideas, benchmarking results, implementation details, or related research and engineering experience, feel free to reach out. We would be glad to exchange thoughts and learn from one another.

WeChat QR code