May 18, 2026
PDF Monster
AI-agent PDF analysis skill and Codex plugin
PDFs in, model-readable evidence out.
Features • Install • CLI Usage • Output • Agent Skill • License
PDF Monster turns PDFs into model-readable evidence: extracted text, optional OCR text, rendered page images, and embedded image files. It is built for agents that need to inspect PDFs without dumping generated folders into the user's project.
What It Does
- Extracts per-page text from PDFs
- Renders pages to PNG when layout or visual inspection matters
- Runs optional OCR through Tesseract
- Extracts embedded images for figures and screenshots
- Emits a structured JSON manifest for agents to read
- Avoids creating output/, pages/, or similar folders unless explicitly requested
Install
Install As A Codex Plugin
Add this repository to Codex:
codex plugin marketplace add jbaehova/pdf-monster
Then install or enable PDF Monster from Codex's Plugins UI.
For local development, add this checkout directly:
codex plugin marketplace add /absolute/path/to/pdf-monster
This repository is the installable Codex plugin package. Its plugin files follow the same root-level layout used by simple Codex plugins:
.codex-plugin/plugin.json
.claude-plugin/marketplace.json
plugin -> .
assets/pdf-monster.svg
skills/pdf-monster/SKILL.md
skills/pdf-monster/scripts/analyze_pdf.py
plugin is a compatibility symlink to the repository root. It keeps the installable package at the root while giving Codex a non-empty marketplace source path.
After these files are pushed to GitHub, users can add the plugin with codex plugin marketplace add jbaehova/pdf-monster.
Install As A Standalone Skill
Clone this repository, then copy or reference the skill package at skills/pdf-monster:
git clone https://github.com/jbaehova/pdf-monster.git pdf-monster
Python 3 is required. On first use, the skill tells the agent to check for PyMuPDF and install the recommended Python dependency when it is missing and pip/network installs are allowed:
python3 -m pip install -r /absolute/path/to/pdf-monster/skills/pdf-monster/requirements.txt
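The dependency check described above could be sketched as follows. This is only an illustration: the helper name `module_available` is hypothetical and not part of PDF Monster, and it assumes PyMuPDF is importable as `fitz` (its long-standing import name) or as `pymupdf` (exposed by newer releases).

```python
import importlib.util


def module_available(name: str) -> bool:
    """Return True if a top-level module can be found, without importing it."""
    return importlib.util.find_spec(name) is not None


# Assumption: PyMuPDF is importable as `pymupdf` (newer releases) or `fitz`.
if not any(module_available(name) for name in ("pymupdf", "fitz")):
    print("PyMuPDF not found; run the pip command above if installs are allowed.")
```

Using `find_spec` avoids importing the package just to test for it, which keeps the check cheap and free of side effects.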
If you run the CLI yourself, install the Python dependencies once from the repo root:
python3 -m pip install -r skills/pdf-monster/requirements.txt
Optional system tools are not installed automatically:
- Poppler: pdfinfo, pdftotext, pdftoppm, pdfimages
- Tesseract: tesseract plus language data such as eng or kor
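Because these tools are not installed automatically, it can help to check which ones are on PATH before relying on them. A minimal sketch, assuming the executables use the names listed above; `find_tools` is a hypothetical helper, not something the skill ships:

```python
import shutil

POPPLER_TOOLS = ("pdfinfo", "pdftotext", "pdftoppm", "pdfimages")


def find_tools(names):
    """Map each tool name to its resolved path on PATH, or None if it is missing."""
    return {name: shutil.which(name) for name in names}


status = find_tools(POPPLER_TOOLS + ("tesseract",))
missing = [name for name, path in status.items() if path is None]
print("missing optional tools:", ", ".join(missing) or "none")
```

This only confirms that the binaries exist; it does not check that Tesseract language data such as eng or kor is installed.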
Use As An Agent Skill
Install the skill folder, not just SKILL.md, because the skill uses scripts/analyze_pdf.py.
Common locations:
Claude Code: ~/.claude/skills/pdf-monster
Codex: ~/.codex/skills/pdf-monster
Pi: ~/.pi/agent/skills/pdf-monster or ~/.agents/skills/pdf-monster
OpenClaw: ~/.openclaw/skills/pdf-monster or <workspace>/skills/pdf-monster
Hermes: ~/.hermes/skills/pdf-monster
For agents without native skill discovery, point custom instructions at the absolute path to the nested SKILL.md and tell the agent to run:
python3 /absolute/path/to/pdf-monster/skills/pdf-monster/scripts/analyze_pdf.py <file.pdf> --json
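An agent or wrapper script might invoke the command above and parse the result roughly like this. The sketch assumes that with `--json` the script writes a single JSON document to stdout and exits non-zero on failure; `build_command` and `run_analysis` are hypothetical names, and `sys.executable` stands in for `python3`.

```python
import json
import subprocess
import sys


def build_command(script_path, pdf_path, extra_args=()):
    """Assemble the argument list for analyze_pdf.py with JSON output enabled."""
    return [sys.executable, script_path, pdf_path, *extra_args, "--json"]


def run_analysis(script_path, pdf_path, extra_args=()):
    """Run the script and parse stdout as JSON (assumed to be one JSON document)."""
    result = subprocess.run(
        build_command(script_path, pdf_path, extra_args),
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError on a non-zero exit code
    )
    return json.loads(result.stdout)
```

Passing the arguments as a list rather than a shell string avoids quoting problems with PDF paths that contain spaces.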
CLI Usage
Basic analysis:
python3 skills/pdf-monster/scripts/analyze_pdf.py file.pdf --json
Visual or scanned PDFs:
python3 skills/pdf-monster/scripts/analyze_pdf.py file.pdf --render-pages all --ocr auto --json
Slide decks or PDFs with repeated logos/icons:
python3 skills/pdf-monster/scripts/analyze_pdf.py file.pdf --render-pages all --min-image-area 10000 --dedupe-images --json
Korean and English OCR:
python3 skills/pdf-monster/scripts/analyze_pdf.py file.pdf --render-pages all --ocr auto --ocr-lang kor+eng --json
Text-only mode:
python3 skills/pdf-monster/scripts/analyze_pdf.py file.pdf --render-pages none --no-extract-images --ocr never --json
Selected pages:
python3 skills/pdf-monster/scripts/analyze_pdf.py file.pdf --pages 1,3-5 --render-pages all --json
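To make the `--pages 1,3-5` syntax concrete, the sketch below expands such a spec into individual page numbers. It assumes 1-based page numbers and inclusive ranges; `expand_page_spec` is a hypothetical helper written for illustration, and the script's own parser may accept or reject forms that this one does not.

```python
def expand_page_spec(spec: str) -> list[int]:
    """Expand a spec such as "1,3-5" into sorted, de-duplicated page numbers."""
    pages = set()
    for part in spec.split(","):
        part = part.strip()
        if not part:
            continue  # tolerate stray commas
        if "-" in part:
            start_text, end_text = part.split("-", 1)
            start, end = int(start_text), int(end_text)
            if start > end:
                raise ValueError(f"descending range: {part!r}")
            pages.update(range(start, end + 1))  # ranges are treated as inclusive
        else:
            pages.add(int(part))
    return sorted(pages)


print(expand_page_spec("1,3-5"))  # [1, 3, 4, 5]
```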
Persist artifacts when you actually want files kept:
python3 skills/pdf-monster/scripts/analyze_pdf.py file.pdf --save-to ./pdf-monster-artifacts --json
Output
The script prints JSON with fields such as:
- page_count
- selected_pages
- backend
- artifact_root
- cleanup_command
- pages_needing_visual_review
- pages[].text
- pages[].ocr_text
- pages[].render_path
- pages[].embedded_images
- pages[].needs_visual_review
- pages[].visual_review_reasons
- pages[].warnings
If temporary artifacts are created, the JSON includes a cleanup_command. Run it only after the image paths are no longer needed.
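A consumer of this JSON might pick out the pages that need a closer look as sketched below. The field names come from the list above, but the sample values and the value types (a list of page objects, boolean flags, string paths) are assumptions made for illustration; `pages_to_inspect` and `best_text` are hypothetical helpers.

```python
import json

# Hypothetical manifest: real output may use different types or extra fields.
SAMPLE = """
{
  "page_count": 2,
  "pages_needing_visual_review": [2],
  "pages": [
    {"text": "Intro", "ocr_text": null, "render_path": null,
     "needs_visual_review": false, "warnings": []},
    {"text": "", "ocr_text": "Scanned table", "render_path": "/tmp/example/page-2.png",
     "needs_visual_review": true, "warnings": []}
  ]
}
"""


def pages_to_inspect(manifest):
    """Return (position in pages list, render_path) for pages flagged for review."""
    flagged = []
    for position, page in enumerate(manifest.get("pages", []), start=1):
        if page.get("needs_visual_review"):
            flagged.append((position, page.get("render_path")))
    return flagged


def best_text(page):
    """Prefer extracted text; fall back to OCR text when extraction is empty."""
    return page.get("text") or page.get("ocr_text") or ""


manifest = json.loads(SAMPLE)
print(pages_to_inspect(manifest))
```

The position reported here is the index within the `pages` list, which may differ from the PDF page number when only selected pages were analyzed.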
Notes
PyMuPDF is the preferred backend. If it is unavailable, PDF Monster falls back to Poppler CLI tools where possible. OCR is optional; when Tesseract is missing, the script reports a warning and continues with text extraction and page rendering.
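The fallback behaviour described above can be pictured as a small decision function. This is a sketch of the stated policy, not the script's actual code: the backend labels, the warning wording, and the error raised when neither backend is available are all assumptions.

```python
def plan_run(pymupdf_ok: bool, poppler_ok: bool, tesseract_ok: bool, ocr_requested: bool):
    """Pick a backend label and collect warnings; labels are illustrative only."""
    warnings = []
    if pymupdf_ok:
        backend = "pymupdf"  # preferred backend
    elif poppler_ok:
        backend = "poppler"  # fallback to the Poppler CLI tools
        warnings.append("PyMuPDF unavailable; using Poppler CLI tools where possible")
    else:
        # Assumption: with no backend at all, nothing can be extracted.
        raise RuntimeError("no PDF backend available")

    run_ocr = ocr_requested and tesseract_ok
    if ocr_requested and not tesseract_ok:
        warnings.append("Tesseract missing; continuing without OCR")
    return backend, run_ocr, warnings


print(plan_run(pymupdf_ok=False, poppler_ok=True, tesseract_ok=False, ocr_requested=True))
```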
License
MIT. See LICENSE.