May 18, 2026

PDF Monster

AI-agent PDF analysis skill and Codex plugin

PDFs in, model-readable evidence out.

Features • Install • CLI Usage • Output • Agent Skill • License

PDF Monster turns PDFs into model-readable evidence: extracted text, optional OCR text, rendered page images, and embedded image files. It is built for agents that need to inspect PDFs without dumping generated folders into the user's project.

What It Does

Extracts per-page text from PDFs
Renders pages to PNG when layout or visual inspection matters
Runs optional OCR through Tesseract
Extracts embedded images for figures and screenshots
Emits a structured JSON manifest for agents to read
Avoids creating output/, pages/, or similar folders unless explicitly requested

Install

Install As A Codex Plugin

Add this repository to Codex:

codex plugin marketplace add jbaehova/pdf-monster

Then install or enable PDF Monster from Codex's Plugins UI.

For local development, add this checkout directly:

codex plugin marketplace add /absolute/path/to/pdf-monster

This repository is the installable Codex plugin package. Its plugin files follow the same root-level layout used by simple Codex plugins:

.codex-plugin/plugin.json
.claude-plugin/marketplace.json
plugin -> .
assets/pdf-monster.svg
skills/pdf-monster/SKILL.md
skills/pdf-monster/scripts/analyze_pdf.py

plugin is a compatibility symlink to the repository root. It keeps the installable package at the root while giving Codex a non-empty marketplace source path.

After these files are pushed to GitHub, users can add the plugin with codex plugin marketplace add jbaehova/pdf-monster.

Install As A Standalone Skill

Clone this repository, then copy or reference the skill package at skills/pdf-monster:

git clone https://github.com/jbaehova/pdf-monster.git pdf-monster

Python 3 is required. On first use, the skill instructs the agent to check for PyMuPDF and, if it is missing and pip and network installs are allowed, to install the recommended Python dependency:

python3 -m pip install -r /absolute/path/to/pdf-monster/skills/pdf-monster/requirements.txt
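The PyMuPDF check described above can be approximated by hand. This is a minimal sketch, not the skill's own logic: has_pymupdf is a hypothetical helper name, and it only tests whether a module named pymupdf or fitz (the import names PyMuPDF ships under) can be found. An unrelated package that also installs a fitz module would produce a false positive.

```python
import importlib.util


def has_pymupdf() -> bool:
    """Return True if a PyMuPDF import name resolves in this interpreter.

    Hypothetical helper: it looks for the module spec without importing it.
    """
    return any(
        importlib.util.find_spec(name) is not None
        for name in ("pymupdf", "fitz")
    )


if __name__ == "__main__":
    if has_pymupdf():
        print("PyMuPDF is available")
    else:
        print("PyMuPDF is missing; install the requirements file shown above")
```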

If you run the CLI yourself, install the Python dependencies once from the repo root:

python3 -m pip install -r skills/pdf-monster/requirements.txt

Optional system tools are not installed automatically:

Poppler: pdfinfo, pdftotext, pdftoppm, pdfimages
Tesseract: tesseract plus language data such as eng or kor
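Because these tools are not installed for you, it can help to see which ones are on PATH before relying on the Poppler fallback or OCR. A small sketch using only the standard library; find_tools and missing_tools are hypothetical helper names, and the check does not verify Tesseract language data such as eng or kor.

```python
import shutil

# Executable names taken from the list above.
POPPLER_TOOLS = ("pdfinfo", "pdftotext", "pdftoppm", "pdfimages")
OCR_TOOLS = ("tesseract",)


def find_tools(names):
    """Map each executable name to its resolved path, or None if not on PATH."""
    return {name: shutil.which(name) for name in names}


def missing_tools(names):
    """Return the names that could not be found on PATH."""
    return [name for name, path in find_tools(names).items() if path is None]


if __name__ == "__main__":
    for name, path in find_tools(POPPLER_TOOLS + OCR_TOOLS).items():
        print(f"{name}: {path or 'not found'}")
```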

Use As An Agent Skill

Install the skill folder, not just SKILL.md, because the skill uses scripts/analyze_pdf.py.

Common locations:

Claude Code:   ~/.claude/skills/pdf-monster
Codex:         ~/.codex/skills/pdf-monster
Pi:            ~/.pi/agent/skills/pdf-monster or ~/.agents/skills/pdf-monster
OpenClaw:      ~/.openclaw/skills/pdf-monster or <workspace>/skills/pdf-monster
Hermes:        ~/.hermes/skills/pdf-monster

For agents without native skill discovery, point custom instructions at the absolute path to the nested SKILL.md and tell the agent to run:

python3 /absolute/path/to/pdf-monster/skills/pdf-monster/scripts/analyze_pdf.py <file.pdf> --json

CLI Usage

Basic analysis:

python3 skills/pdf-monster/scripts/analyze_pdf.py file.pdf --json

Visual or scanned PDFs:

python3 skills/pdf-monster/scripts/analyze_pdf.py file.pdf --render-pages all --ocr auto --json

Slide decks or PDFs with repeated logos/icons:

python3 skills/pdf-monster/scripts/analyze_pdf.py file.pdf --render-pages all --min-image-area 10000 --dedupe-images --json

Korean and English OCR:

python3 skills/pdf-monster/scripts/analyze_pdf.py file.pdf --render-pages all --ocr auto --ocr-lang kor+eng --json

Text-only mode:

python3 skills/pdf-monster/scripts/analyze_pdf.py file.pdf --render-pages none --no-extract-images --ocr never --json

Selected pages:

python3 skills/pdf-monster/scripts/analyze_pdf.py file.pdf --pages 1,3-5 --render-pages all --json
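To make the 1,3-5 syntax concrete, here is one way such a page spec could be expanded. This is an illustrative sketch, not the script's actual parser: parse_page_spec is a hypothetical function, and it assumes page numbers are 1-based and ranges are inclusive.

```python
def parse_page_spec(spec: str) -> list[int]:
    """Expand a spec such as "1,3-5" into sorted, unique page numbers.

    Assumes 1-based pages and inclusive ranges (an assumption, not confirmed
    against analyze_pdf.py).
    """
    pages = set()
    for part in spec.split(","):
        part = part.strip()
        if not part:
            continue  # tolerate stray commas
        if "-" in part:
            start_text, end_text = part.split("-", 1)
            start, end = int(start_text), int(end_text)
            if start > end:
                raise ValueError(f"descending range: {part!r}")
            pages.update(range(start, end + 1))
        else:
            pages.add(int(part))
    if any(page < 1 for page in pages):
        raise ValueError("page numbers start at 1")
    return sorted(pages)


print(parse_page_spec("1,3-5"))  # [1, 3, 4, 5]
```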

Persist artifacts only when you want the files kept:

python3 skills/pdf-monster/scripts/analyze_pdf.py file.pdf --save-to ./pdf-monster-artifacts --json
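The commands above can also be driven from Python. The wrapper below is a sketch under stated assumptions: build_command and analyze are hypothetical names, only flags shown in this section are used, the script path is relative to the repo root, and stdout is assumed to contain nothing but the JSON manifest when --json is passed.

```python
import json
import subprocess
import sys

# Path relative to the repository root, as in the examples above.
SCRIPT = "skills/pdf-monster/scripts/analyze_pdf.py"


def build_command(pdf_path, *, pages=None, render_pages=None,
                  ocr=None, ocr_lang=None, save_to=None):
    """Assemble an argument list from the flags documented in this section."""
    cmd = [sys.executable, SCRIPT, str(pdf_path)]
    options = (
        ("--pages", pages),
        ("--render-pages", render_pages),
        ("--ocr", ocr),
        ("--ocr-lang", ocr_lang),
        ("--save-to", save_to),
    )
    for flag, value in options:
        if value is not None:
            cmd += [flag, str(value)]
    cmd.append("--json")
    return cmd


def analyze(pdf_path, **options):
    """Run the script and parse its stdout as JSON (assumed to be JSON only)."""
    result = subprocess.run(
        build_command(pdf_path, **options),
        capture_output=True, text=True, check=True,
    )
    return json.loads(result.stdout)
```

For example, analyze("file.pdf", pages="1,3-5", render_pages="all") would run the "Selected pages" command and return the parsed manifest as a dict.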

Output

The script prints JSON with fields such as:

page_count
selected_pages
backend
artifact_root
cleanup_command
pages_needing_visual_review
pages[].text
pages[].ocr_text
pages[].render_path
pages[].embedded_images
pages[].needs_visual_review
pages[].visual_review_reasons
pages[].warnings

If temporary artifacts are created, the JSON includes a cleanup_command. Run it only after the image paths are no longer needed.
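As a rough illustration of consuming the manifest, the sketch below collects the page entries flagged for visual review. The field names come from the list above, but their exact types are assumptions, pages_to_review is a hypothetical helper, and every value in SAMPLE (paths, reason strings) is made up for the example.

```python
def pages_to_review(manifest):
    """Return (index, reasons) for each page entry flagged for visual review.

    The index is the entry's position in manifest["pages"], which may not
    equal the PDF page number when only some pages were selected.
    """
    flagged = []
    for index, page in enumerate(manifest.get("pages", [])):
        if page.get("needs_visual_review"):
            flagged.append((index, page.get("visual_review_reasons", [])))
    return flagged


# Made-up manifest fragment; real output will differ.
SAMPLE = {
    "page_count": 2,
    "pages": [
        {"text": "Intro", "needs_visual_review": False,
         "visual_review_reasons": []},
        {"text": "", "render_path": "/tmp/example/page-2.png",
         "needs_visual_review": True,
         "visual_review_reasons": ["little extractable text"]},
    ],
}

print(pages_to_review(SAMPLE))  # [(1, ['little extractable text'])]
```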

PyMuPDF is the preferred backend. If it is unavailable, PDF Monster falls back to Poppler CLI tools where possible. OCR is optional; when Tesseract is missing, the script reports a warning and continues with text extraction and page rendering.
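The fallback order described here can be summarized as a small decision function. This is only a sketch of the stated behavior: choose_backend and ocr_warnings are hypothetical names, the "pymupdf" and "poppler" labels are placeholders rather than the values the script reports in its backend field, and the warning text is invented.

```python
def choose_backend(has_pymupdf: bool, has_poppler: bool):
    """Prefer PyMuPDF, fall back to Poppler, else report no backend."""
    if has_pymupdf:
        return "pymupdf"   # placeholder label
    if has_poppler:
        return "poppler"   # placeholder label
    return None


def ocr_warnings(ocr_requested: bool, has_tesseract: bool):
    """Missing Tesseract yields a warning instead of stopping the run."""
    if ocr_requested and not has_tesseract:
        return ["tesseract not found; OCR skipped"]  # invented wording
    return []


print(choose_backend(False, True))  # poppler
print(ocr_warnings(True, False))    # ['tesseract not found; OCR skipped']
```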

License

MIT. See LICENSE.