pdf-modifier-mcp

Find and replace text in PDFs — from your terminal or your AI assistant. Font styles preserved, layout untouched.

The problem

PDF editing tools are either bloated GUIs or brittle scripts that destroy formatting. Replacing text in a PDF should not require Adobe Acrobat, a SaaS subscription, or a prayer that your fonts survive.

The solution

pdf-modifier-mcp replaces text at the span level: it redacts the original, then reinserts the replacement with matched font family, weight, size, and color — using Base 14 font mapping (Helvetica, Times, Courier and their bold variants).

Two interfaces, same engine:

CLI (pdf-mod) — for batch jobs, scripting, and CI pipelines.
MCP Server (pdf-modifier-mcp) — so Claude Desktop, Cursor, or any MCP client can edit PDFs autonomously.

Install in one line:

pip install pdf-modifier-mcp

Features

MCP Server

Three tools for AI agents: read_pdf_structure, inspect_pdf_fonts, modify_pdf_content. Works with Claude Desktop, Cursor, and any MCP-compatible client. Structured JSON responses with typed error codes.
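As a rough illustration of client setup, the sketch below builds the kind of `mcpServers` entry that Claude Desktop reads from its `claude_desktop_config.json`. It assumes the server starts via a `pdf-modifier-mcp` command on your PATH (inferred from the interface name above) and uses `pdf-modifier` as a purely illustrative server label; check the setup guide for the exact command and arguments.

```python
import json


def mcp_client_config(command: str = "pdf-modifier-mcp",
                      server_name: str = "pdf-modifier") -> dict:
    """Build an MCP client config entry.

    Assumptions: `command` is the installed server entry point and
    `server_name` is an arbitrary label chosen for this example.
    """
    return {"mcpServers": {server_name: {"command": command, "args": []}}}


# Print the JSON fragment to merge into the client's config file.
print(json.dumps(mcp_client_config(), indent=2))
```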

Command-Line Interface

Typer + Rich powered. Simple replacements (-r "old=new"), regex patterns (--regex), hyperlink creation, font inspection tables, and JSON output for scripting.

Style Preservation

Maps PDF font names to Base 14 families with bold detection. Color, size, and position are carried over from original spans. Layout stays intact — no reflow, no broken pages.
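The font mapping described above can be sketched roughly as follows. This is a minimal illustration, not the project's actual mapping: the function name `to_base14` and the keyword heuristics are assumptions, and it covers only the regular and bold variants of Helvetica, Times, and Courier.

```python
import re


def to_base14(font_name: str) -> str:
    """Map an embedded PDF font name to a Base 14 family (hypothetical helper)."""
    # Subset fonts carry a six-letter prefix such as "ABCDEF+"; drop it first.
    name = re.sub(r"^[A-Z]{6}\+", "", font_name).lower()
    bold = "bold" in name

    if "courier" in name or "mono" in name:
        return "Courier-Bold" if bold else "Courier"
    if "times" in name:
        return "Times-Bold" if bold else "Times-Roman"
    # Anything unrecognized falls back to the sans-serif family.
    return "Helvetica-Bold" if bold else "Helvetica"
```

For example, `to_base14("ABCDEF+Arial-BoldMT")` yields `"Helvetica-Bold"`. A fallback like this keeps replacements legible, though glyph widths may differ slightly from the original font.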

Regex and Hyperlinks

Pattern-based bulk replacements for dates, invoice IDs, prices, and structured text. Create clickable links or neutralize existing ones with text|URL syntax.

Quick start

# Install from PyPI
pip install pdf-modifier-mcp

# Replace text in a PDF (single quotes stop the shell from expanding $9 and $1)
pdf-mod modify input.pdf output.pdf -r "Draft=Final" -r '$99.99=$149.99'

# Use regex for bulk changes
pdf-mod modify input.pdf output.pdf -r "Order #\d+=Order #REDACTED" --regex

# Analyze PDF structure
pdf-mod analyze invoice.pdf --json
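To make the `old=new` convention concrete, here is a small sketch of how such specs could be parsed and applied to plain text. It assumes the spec splits on the first `=` and that regex mode hands the left side to Python's `re` module; `parse_spec` and `apply_specs` are illustrative names, not part of the tool.

```python
import re
from typing import Iterable, Tuple


def parse_spec(spec: str) -> Tuple[str, str]:
    """Split 'old=new' on the first '=' (assumed convention)."""
    old, sep, new = spec.partition("=")
    if not sep or not old:
        raise ValueError(f"expected 'old=new', got {spec!r}")
    return old, new


def apply_specs(text: str, specs: Iterable[str], regex: bool = False) -> str:
    """Apply each replacement in order, literally or as a regex pattern."""
    for spec in specs:
        old, new = parse_spec(spec)
        text = re.sub(old, new, text) if regex else text.replace(old, new)
    return text
```

Splitting on the first `=` means a pattern that itself contains `=` would be ambiguous under this sketch.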

MCP tools at a glance

Tool	What it does
`read_pdf_structure`	Returns every text span with position, font, size, and color as JSON
`inspect_pdf_fonts`	Searches for terms and reports font properties — useful before replacements
`modify_pdf_content`	Find and replace with style preservation, regex, and hyperlink support
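The span listing in the table can be pictured with a short sketch. It flattens a dictionary shaped like the output of PyMuPDF's `page.get_text("dict")` (blocks, then lines, then spans) into JSON records. The output field names here are assumptions chosen for illustration; the real `read_pdf_structure` response may be organized differently.

```python
import json


def spans_to_json(page_dict: dict, page_number: int) -> str:
    """Flatten a PyMuPDF-style text dict into a JSON list of span records."""
    records = []
    for block in page_dict.get("blocks", []):
        # Image blocks carry no "lines" key, so they are skipped here.
        for line in block.get("lines", []):
            for span in line.get("spans", []):
                records.append({
                    "page": page_number,
                    "text": span["text"],
                    "font": span["font"],
                    "size": span["size"],
                    # PyMuPDF reports color as an sRGB integer; show it as hex.
                    "color": "#%06x" % span["color"],
                    "bbox": list(span["bbox"]),
                })
    return json.dumps(records)
```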

Architecture

Entry Points               Core Layer                  Engine
+-----------------------+   +-----------------------+   +----------------+
| CLI (Typer + Rich)    |-->| PDFModifier           |-->| PyMuPDF (fitz) |
| MCP Server (FastMCP)  |   | PDFAnalyzer           |   +----------------+
+-----------------------+   | Pydantic v2 models    |
                            +-----------------------+

All interfaces share the same core. The modifier uses a batch-redact strategy — collecting all matches per page, applying redactions once, then inserting styled replacements. No intermediate saves, no re-parsing.
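The batch-redact strategy can be sketched for a single page as follows. This is a simplified illustration rather than the project's `PDFModifier`: `batch_replace` is a hypothetical helper, the font and size are fixed instead of matched per span, and the baseline offset is an approximation. It expects an object that behaves like a PyMuPDF page (`search_for`, `add_redact_annot`, `apply_redactions`, `insert_text`).

```python
from typing import Dict, List, Tuple


def batch_replace(page, replacements: Dict[str, str],
                  fontname: str = "helv", fontsize: float = 11) -> int:
    """Collect matches, redact once, then reinsert text. Returns match count."""
    pending: List[Tuple[object, str]] = []

    # 1. Collect every match and mark it for redaction; nothing is removed yet.
    for old, new in replacements.items():
        for rect in page.search_for(old):
            page.add_redact_annot(rect)
            pending.append((rect, new))

    if not pending:
        return 0

    # 2. Apply all pending redactions in a single pass for this page.
    page.apply_redactions()

    # 3. Insert each replacement near the original baseline (approximated as
    #    the bottom-left corner of the match rectangle, nudged up 2 points).
    for rect, new in pending:
        page.insert_text((rect.x0, rect.y1 - 2), new,
                         fontname=fontname, fontsize=fontsize)
    return len(pending)
```

Applying redactions once per page, rather than once per match, avoids the situation where removing one match invalidates the rectangles already found for the others.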

Getting Started: installation, CLI usage, MCP server setup, and integration guide.

Source Code: browse the source, open issues, or contribute on GitHub.