MCP Code Execution - Enhanced Edition
November 21, 2025
99.6% Token Reduction through CLI-based scripts and progressive tool discovery for Model Context Protocol (MCP) servers.
Note: This project is optimized for Claude Code with native Skills support. The core runtime works with any AI agent. Scripts with CLI arguments achieve 99.6% token reduction.
What This Is
An enhanced implementation of Anthropic's Code Execution with MCP pattern, optimized for Claude Code, combining the best ideas from the MCP community and adding significant improvements:
- Scripts with CLI Args: Reusable Python workflows with command-line parameters (99.6% token reduction)
- Multi-Transport: Full support for stdio, SSE, and HTTP MCP servers
- Container Sandboxing: Optional rootless isolation with security controls
- Type Safety: Pydantic models throughout with full validation
- Production-Ready: 129 passing tests, comprehensive error handling
Claude Code Integration
Native Skills Support: This project includes proper Claude Code Skills integration:
- .claude/skills/ - Skills in Claude Code's native format (SKILL.md + workflow.py)
- Auto-discovery - Claude Code automatically finds and validates Skills
- 2 Generic Examples - simple-fetch, multi-tool-pipeline (templates for custom workflows)
- Format Compliant - YAML frontmatter, validation rules, progressive disclosure
Dual-layer architecture:
- Layer 1: Claude Code Skills (.claude/skills/) - Native discovery and format
- Layer 2: Scripts (./scripts/) - CLI-based Python workflows with argparse
Token efficiency:
- Core runtime: 98.7% reduction (Anthropic's filesystem pattern)
- Scripts with CLI args: 99.6% reduction (no file editing needed)
Note: Scripts work with any AI agent. Claude Code Skills provide native auto-discovery for Claude Code users.
Acknowledgments
This project builds upon and merges ideas from:
- ipdelete/mcp-code-execution - Original implementation of Anthropic's PRIMARY pattern
  - Filesystem-based progressive disclosure
  - Type-safe Pydantic wrappers
  - Schema discovery system
  - Lazy server connections
- elusznik/mcp-server-code-execution-mode - Production security patterns
  - Container sandboxing architecture
  - Comprehensive security controls
  - Production deployment patterns
Our contribution: Merged the best of both, added CLI-based scripts pattern, implemented multi-transport support, and refined the architecture for maximum efficiency.
Key Enhancements
1. Claude Code Skills Integration (NEW)
Native Skills format in .claude/skills/ directory:
.claude/skills/
├── simple-fetch/
│   ├── SKILL.md        # YAML frontmatter + markdown instructions
│   └── workflow.py     # → symlink to ../../scripts/simple_fetch.py
└── multi-tool-pipeline/
    ├── SKILL.md        # Multi-tool orchestration example
    └── workflow.py     # → symlink to ../../scripts/multi_tool_pipeline.py
How it works:
- Claude Code auto-discovers Skills in .claude/skills/
- Reads SKILL.md (follows Claude Code's format spec)
- Executes workflow.py (which is a script) with CLI arguments
- Returns results
Benefits:
- Native Claude Code discovery
- Standard SKILL.md format (YAML + markdown)
- Validation compliant (name, description rules)
- Progressive disclosure compatible
- Generic examples as templates
Documentation: See .claude/skills/README.md for details
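As a rough illustration of what "validation compliant" can mean, the sketch below checks the frontmatter of a SKILL.md file for a name and a description. The limits used (lowercase letters, digits, and hyphens, at most 64 characters for the name; at most 1024 characters for the description) are assumptions based on Claude Code's published Skills rules and should be checked against the current spec. `parse_frontmatter` and `validate_skill` are hypothetical helpers, not part of this project.

```python
import re

NAME_RE = re.compile(r"^[a-z0-9-]{1,64}$")  # assumed naming rule
MAX_DESCRIPTION = 1024                       # assumed length limit


def parse_frontmatter(text: str) -> dict[str, str]:
    """Parse simple `key: value` pairs between the leading '---' fences."""
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}
    fields: dict[str, str] = {}
    for line in lines[1:]:
        if line.strip() == "---":
            break
        key, sep, value = line.partition(":")
        if sep:
            fields[key.strip()] = value.strip()
    return fields


def validate_skill(text: str) -> list[str]:
    """Return a list of problems; an empty list means the checks passed."""
    fields = parse_frontmatter(text)
    problems = []
    if not NAME_RE.match(fields.get("name", "")):
        problems.append("name must be 1-64 chars of [a-z0-9-]")
    description = fields.get("description", "")
    if not description or len(description) > MAX_DESCRIPTION:
        problems.append("description must be 1-1024 chars")
    return problems
```

This only handles flat `key: value` frontmatter; a real check would likely use a YAML parser.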
2. Scripts with CLI Arguments (99.6% Token Reduction)
CLI-based Python workflows that agents execute with parameters:
# Simple example (generic template)
uv run python -m runtime.harness scripts/simple_fetch.py \
--url "https://example.com"
# Pipeline example (generic template)
uv run python -m runtime.harness scripts/multi_tool_pipeline.py \
--repo-path "." \
--max-commits 5
Benefits over writing scripts from scratch:
- 18x fewer tokens: ~110 vs ~2,000
- 24x faster: 5 seconds vs 2 minutes
- Immutable templates: No file editing
- Reusable workflows: Same logic, different parameters
What's included:
- 2 generic template scripts (simple_fetch.py, multi_tool_pipeline.py)
- Complete pattern documentation
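To make the "same logic, different parameters" idea concrete, here is a minimal sketch of how an agent-side helper might assemble the harness command line for any script. `build_harness_command` is a hypothetical name used for illustration; only the `uv run python -m runtime.harness <script> --flag value` shape comes from the examples above.

```python
import shlex


def build_harness_command(script: str, **params: object) -> list[str]:
    """Build the argv for running a script through the harness.

    Keyword names are converted to CLI flags: max_commits -> --max-commits.
    """
    command = ["uv", "run", "python", "-m", "runtime.harness", script]
    for name, value in params.items():
        command += [f"--{name.replace('_', '-')}", str(value)]
    return command


cmd = build_harness_command(
    "scripts/multi_tool_pipeline.py", repo_path=".", max_commits=5
)
print(shlex.join(cmd))  # shell-quoted form of the command
```

The script file itself never changes; only the argument list does, which is why the agent spends no tokens on editing code.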
3. Multi-Transport Support (NEW)
Full support for all MCP transport types:
{
  "mcpServers": {
    "local-tool": {
      "type": "stdio",
      "command": "uvx",
      "args": ["mcp-server-git"]
    },
    "jina": {
      "type": "sse",
      "url": "https://mcp.jina.ai/sse",
      "headers": {"Authorization": "Bearer YOUR_KEY"}
    },
    "exa": {
      "type": "http",
      "url": "https://mcp.exa.ai/mcp",
      "headers": {"x-api-key": "YOUR_KEY"}
    }
  }
}
4. Container Sandboxing (Enhanced)
Optional rootless container execution with comprehensive security:
# Sandbox mode with security controls
uv run python -m runtime.harness workspace/script.py --sandbox
Security features:
- Rootless execution (UID 65534:65534)
- Network isolation (--network none)
- Read-only root filesystem
- Memory/CPU/PID limits
- Capability dropping (--cap-drop ALL)
- Timeout enforcement
Installation
System Requirements
- Python 3.11 or 3.12 (3.14 not recommended due to anyio compatibility issues)
- uv package manager (v0.5.0+)
- Claude Code (optional, for Skills auto-discovery)
- Git (for cloning repository)
- Docker or Podman (optional, for sandbox mode)
Step 1: Install uv
If you don't have uv installed:
# macOS/Linux
curl -LsSf https://astral.sh/uv/install.sh | sh
# Windows (PowerShell)
powershell -ExecutionPolicy ByPass -c "irm https://astral.sh/uv/install.ps1 | iex"
# Verify installation
uv --version
Step 2: Clone and Install
# Clone repository
git clone https://github.com/yourusername/mcp-code-execution-enhanced.git
cd mcp-code-execution-enhanced
# Install dependencies (creates .venv automatically)
uv sync
# Verify installation
uv run python -c "from runtime.mcp_client import get_mcp_client_manager; print('Installation successful')"
Step 3: Create MCP Configuration
Important for Claude Code Users: This project uses its own mcp_config.json for MCP server configuration, separate from Claude Code's global configuration (~/.claude.json). To avoid conflicts, use different servers in each configuration or disable overlapping servers in ~/.claude.json while using this project.
Create mcp_config.json from the example:
# Copy example config (includes git + fetch for examples)
cp mcp_config.example.json mcp_config.json
This config works out of the box:
{
  "mcpServers": {
    "git": {
      "type": "stdio",
      "command": "uvx",
      "args": ["mcp-server-git", "--repository", "."]
    },
    "fetch": {
      "type": "stdio",
      "command": "uvx",
      "args": ["mcp-server-fetch"]
    }
  },
  "sandbox": {
    "enabled": false
  }
}
To add more servers: Edit mcp_config.json and add your own MCP servers. See docs/TRANSPORTS.md for examples of stdio, SSE, and HTTP transports.
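The conflict warning at the start of this step can be checked mechanically. The sketch below lists server names that are enabled in both configurations. It assumes that ~/.claude.json keeps its servers under a top-level "mcpServers" key in the same shape as mcp_config.json, which may not hold for every Claude Code version; the helper names are hypothetical.

```python
import json
from pathlib import Path


def enabled_servers(config: dict) -> set[str]:
    """Names under "mcpServers" that are not marked "disabled": true."""
    servers = config.get("mcpServers", {})
    return {name for name, spec in servers.items() if not spec.get("disabled", False)}


def overlapping_servers(project_config: dict, global_config: dict) -> set[str]:
    """Server names enabled in both configurations."""
    return enabled_servers(project_config) & enabled_servers(global_config)


def load_json(path: Path) -> dict:
    """Load a JSON file, treating a missing file as an empty config."""
    return json.loads(path.read_text()) if path.exists() else {}


if __name__ == "__main__":
    project = load_json(Path("mcp_config.json"))
    global_cfg = load_json(Path.home() / ".claude.json")  # assumed layout
    print(sorted(overlapping_servers(project, global_cfg)))
```

An empty list means no server is enabled in both places.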
Step 4: Generate Tool Wrappers
# Auto-generate typed Python wrappers from your MCP servers
uv run mcp-generate
# This creates ./servers/<server_name>/<tool>.py files
# Example: servers/git/git_log.py, servers/fetch/fetch.py
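The wrapper paths above and the "git__git_log" identifier used with call_mcp_tool later in this README suggest a "<server>__<tool>" naming convention. The sketch below maps one form to the other under that assumption; the convention is inferred from the examples rather than documented, and `wrapper_path` is a hypothetical helper.

```python
from pathlib import PurePosixPath


def wrapper_path(tool_id: str, root: str = "servers") -> PurePosixPath:
    """Map "<server>__<tool>" to "<root>/<server>/<tool>.py"."""
    server, sep, tool = tool_id.partition("__")
    if not sep or not server or not tool:
        raise ValueError(f"expected '<server>__<tool>', got {tool_id!r}")
    return PurePosixPath(root) / server / f"{tool}.py"


print(wrapper_path("git__git_log"))  # servers/git/git_log.py
```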
Step 5: Test the Installation
# Test with a simple script
uv run python -m runtime.harness scripts/simple_fetch.py --url "https://example.com"
# If you configured a git server, test the pipeline
uv run python -m runtime.harness scripts/multi_tool_pipeline.py --repo-path "." --max-commits 5
Step 6 (Optional): Setup Sandbox Mode
If you want to use container sandboxing:
# Install Podman (recommended, rootless)
sudo apt-get install -y podman # Ubuntu/Debian
brew install podman # macOS
# OR install Docker
curl -fsSL https://get.docker.com | sh
# Verify
podman --version # or docker --version
# Test sandbox mode
uv run python -m runtime.harness scripts/simple_fetch.py --url "https://example.com" --sandbox
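The sandbox configuration accepts "runtime": "auto". One plausible reading, sketched below, is to pick the first container runtime found on PATH, trying Podman before Docker because Podman is the recommended rootless option. This is an assumption about the behavior, not the project's actual detection code, and `detect_runtime` is a hypothetical name.

```python
import shutil
from typing import Callable, Optional


def detect_runtime(
    preferred: str = "auto",
    which: Callable[[str], Optional[str]] = shutil.which,
) -> Optional[str]:
    """Return the container runtime to use, or None if none is installed.

    With "auto", Podman is tried first, then Docker. An explicit name is
    only checked for availability, never replaced by another runtime.
    """
    candidates = ["podman", "docker"] if preferred == "auto" else [preferred]
    for name in candidates:
        if which(name):
            return name
    return None


print(detect_runtime())  # depends on what is installed locally
```

The `which` parameter exists only so the lookup can be replaced in tests.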
Step 7 (Optional): Claude Code Skills Setup
If using Claude Code, the Skills are already configured in .claude/skills/ and will be auto-discovered. No additional setup needed!
To use:
- Claude Code will automatically find Skills in .claude/skills/
- Just ask Claude to use them naturally
- Example: "Fetch https://example.com" → Claude discovers and uses the simple-fetch Skill
How It Works
PREFERRED: Scripts with CLI Args (99.6% reduction)
For multi-step workflows (research, data processing, synthesis):
- Discover scripts: ls ./scripts/ → see available script templates
- Read documentation: cat ./scripts/simple_fetch.py → see CLI args and pattern
- Execute with parameters: uv run python -m runtime.harness scripts/simple_fetch.py --url "https://example.com"
Generic template scripts (scripts/):
- simple_fetch.py - Basic single-tool execution pattern
- multi_tool_pipeline.py - Multi-tool chaining pattern
Note: These are templates - use them as examples to create workflows for your specific MCP servers and use cases.
ALTERNATIVE: Direct Script Writing (98.7% reduction)
For simple tasks or novel workflows:
- Explore tools: ls ./servers/ → discover available MCP tools
- Write script: Create Python script using tool imports
- Execute: uv run python -m runtime.harness workspace/script.py
Example script:
import asyncio

from runtime.mcp_client import call_mcp_tool


async def main():
    # Call the git server's git_log tool for the 10 most recent commits
    result = await call_mcp_tool(
        "git__git_log",
        {"repo_path": ".", "max_count": 10}
    )
    print(f"Fetched {len(result)} commits")
    return result


if __name__ == "__main__":
    asyncio.run(main())
Architecture
Progressive Disclosure Pattern
Traditional Approach (High Token Usage):
Agent → MCP Server → [Full Tool Schemas 27,300 tokens] → Agent
Scripts with CLI Args (99.6% Reduction - PREFERRED):
Agent → Discovers scripts → Reads script docs → Executes with CLI args
Script → Multi-server orchestration → Returns results
Tokens: ~110 (script discovery + documentation)
Time: ~5 seconds
Script Writing (98.7% Reduction - ALTERNATIVE):
Agent → Discovers tools → Writes script
Script → MCP Server → Returns data
Agent → Processes/summarizes
Tokens: ~2,000 (tool discovery + script writing)
Time: ~2 minutes
Key Components
- runtime/mcp_client.py: Lazy-loading MCP client manager with multi-transport support
- runtime/harness.py: Dual-mode script execution (direct/sandbox)
- runtime/generate_wrappers.py: Auto-generate typed wrappers from MCP schemas
- runtime/sandbox/: Container sandboxing with security controls
- scripts/: CLI-based workflow templates with 2 generic examples
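The lazy-loading idea behind the client manager can be sketched in a few lines: a connection is opened the first time a server is needed and reused afterwards. This is an illustration of the pattern only; `LazyClientManager`, `session_for`, and the fake connector are hypothetical and do not mirror the real runtime/mcp_client.py API.

```python
import asyncio
from typing import Any, Awaitable, Callable

Connector = Callable[[str], Awaitable[Any]]


class LazyClientManager:
    """Open a server connection on first use and reuse it afterwards."""

    def __init__(self, connect: Connector) -> None:
        self._connect = connect
        self._sessions: dict[str, Any] = {}
        self._lock = asyncio.Lock()

    async def session_for(self, server: str) -> Any:
        async with self._lock:  # avoid connecting twice under concurrency
            if server not in self._sessions:
                self._sessions[server] = await self._connect(server)
            return self._sessions[server]


async def demo() -> list[str]:
    opened: list[str] = []

    async def fake_connect(server: str) -> str:
        opened.append(server)  # record each real connection attempt
        return f"session:{server}"

    manager = LazyClientManager(fake_connect)
    await manager.session_for("git")
    await manager.session_for("git")  # reused, no second connection
    return opened


print(asyncio.run(demo()))  # ['git']
```

Servers that a script never touches are never started, which keeps startup cost proportional to the tools actually used.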
Scripts System
Philosophy
DON'T: Write scripts from scratch each time
DO: Use pre-written scripts with CLI arguments
Creating Custom Scripts
"""
SCRIPT: Your Script Name
DESCRIPTION: What it does
CLI ARGUMENTS:
    --query    Research query (required)
    --limit    Max results (default: 10)
USAGE:
    uv run python -m runtime.harness scripts/your_script.py \
        --query "your question" \
        --limit 5
"""
import argparse
import asyncio
import sys


def parse_args():
    parser = argparse.ArgumentParser()
    parser.add_argument("--query", required=True)
    parser.add_argument("--limit", type=int, default=10)
    # Filter script path from args
    args_to_parse = [arg for arg in sys.argv[1:] if not arg.endswith(".py")]
    return parser.parse_args(args_to_parse)


async def main():
    args = parse_args()
    # Your workflow logic here; this placeholder just echoes the arguments
    result = {"query": args.query, "limit": args.limit}
    return result


if __name__ == "__main__":
    asyncio.run(main())
See scripts/README.md for complete documentation.
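One caveat with the template's argument filter: dropping every argument that ends in ".py" also discards legitimate values such as --query setup.py. The variant below removes only a leading script path. It assumes the harness leaves the script path as the first element of the argument list, which matches the invocation shown above but is not confirmed by the harness source; `parse_args` here takes the list explicitly so it can be exercised without the harness.

```python
import argparse


def parse_args(argv: list[str]) -> argparse.Namespace:
    parser = argparse.ArgumentParser()
    parser.add_argument("--query", required=True)
    parser.add_argument("--limit", type=int, default=10)
    # Drop only a leading script path, so values such as "--query setup.py"
    # are kept (a blanket endswith(".py") filter would discard them).
    if argv and argv[0].endswith(".py"):
        argv = argv[1:]
    return parser.parse_args(argv)


args = parse_args(["scripts/your_script.py", "--query", "setup.py", "--limit", "5"])
print(args.query, args.limit)  # setup.py 5
```

In a real script the call would be `parse_args(sys.argv[1:])`.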
Multi-Transport Support
stdio (Subprocess-based)
{
  "type": "stdio",
  "command": "uvx",
  "args": ["mcp-server-name"],
  "env": {"API_KEY": "your-key"}
}
SSE (Server-Sent Events)
{
  "type": "sse",
  "url": "https://mcp.example.com/sse",
  "headers": {"Authorization": "Bearer YOUR_KEY"}
}
HTTP (Streamable HTTP)
{
  "type": "http",
  "url": "https://mcp.example.com/mcp",
  "headers": {"x-api-key": "YOUR_KEY"}
}
See docs/TRANSPORTS.md for detailed information.
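The minimal configuration later in this README omits "type", which suggests the transport can be inferred from the other fields. The sketch below shows one way that inference could work. The rules are assumptions: an explicit "type" wins, a "command" implies stdio, and the URL suffix heuristic for telling SSE from HTTP is a guess rather than documented behavior. `detect_transport` is a hypothetical helper.

```python
def detect_transport(server: dict) -> str:
    """Return "stdio", "sse" or "http" for one "mcpServers" entry."""
    explicit = server.get("type")
    if explicit is not None:
        if explicit not in {"stdio", "sse", "http"}:
            raise ValueError(f"unknown transport type: {explicit!r}")
        return explicit
    if "command" in server:
        return "stdio"  # subprocess-based servers are launched by command
    if "url" in server:
        # Assumption: URLs ending in /sse are SSE endpoints, others HTTP.
        return "sse" if server["url"].rstrip("/").endswith("/sse") else "http"
    raise ValueError("server entry needs a 'type', 'command' or 'url'")


print(detect_transport({"command": "uvx", "args": ["mcp-server-git"]}))  # stdio
```

Setting "type" explicitly, as the examples above do, avoids relying on any heuristic.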
Sandbox Mode
Configuration
{
  "sandbox": {
    "enabled": true,
    "runtime": "auto",
    "image": "python:3.11-slim",
    "memory_limit": "512m",
    "timeout": 30
  }
}
Security Controls
- Rootless execution: UID 65534:65534 (nobody)
- Network isolation: --network none
- Filesystem: Read-only root, writable tmpfs
- Resource limits: Memory, CPU, PID constraints
- Capabilities: All dropped (--cap-drop ALL)
- Security: no-new-privileges, SELinux labels
See SECURITY.md for complete security documentation.
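To show how the controls listed above translate into container flags, here is a sketch that assembles a `podman run` or `docker run` command from the sandbox config. The flags are standard Podman/Docker options; the PID limit value, the mount path /workspace/script.py, and the function name are assumptions, and the real harness may mount files and reach MCP servers differently.

```python
import os


def build_sandbox_command(runtime: str, sandbox: dict, script: str) -> list[str]:
    """Assemble a `podman run` / `docker run` argv from the sandbox config."""
    host_path = os.path.abspath(script)  # bind mounts need an absolute path
    return [
        runtime, "run", "--rm",
        "--user", "65534:65534",                # unprivileged UID:GID (nobody)
        "--network", "none",                    # no network access
        "--read-only",                          # read-only root filesystem
        "--tmpfs", "/tmp",                      # writable scratch space
        "--cap-drop", "ALL",                    # drop all capabilities
        "--security-opt", "no-new-privileges",
        "--memory", sandbox.get("memory_limit", "512m"),
        "--cpus", str(sandbox.get("cpu_limit", "1.0")),
        "--pids-limit", "128",                  # assumed value
        "--volume", f"{host_path}:/workspace/script.py:ro",  # hypothetical mount
        sandbox.get("image", "python:3.11-slim"),
        "python", "/workspace/script.py",
    ]


cmd = build_sandbox_command("podman", {"memory_limit": "256m"}, "workspace/script.py")
print(" ".join(cmd))
```

The configured timeout could then be enforced from the host side, for example with `subprocess.run(cmd, timeout=sandbox.get("timeout", 30))`.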
Testing
# Run all tests (129 total)
uv run pytest
# Unit tests only
uv run pytest tests/unit/
# Integration tests (requires Docker/Podman for sandbox tests)
uv run pytest tests/integration/
# With coverage
uv run pytest --cov=src/runtime
Documentation
- README.md (this file) - Overview and quick start
- CLAUDE.md - Quick reference for Claude Code
- AGENTS.md.template - Template for adapting to other AI frameworks
- scripts/README.md - Scripts system guide
- scripts/SKILLS.md - Complete scripts documentation
- docs/USAGE.md - Comprehensive user guide
- docs/ARCHITECTURE.md - Technical architecture
- docs/CONFIGURATION.md - MCP server configuration management (Claude Code vs project)
- docs/TRANSPORTS.md - Transport-specific details
- SECURITY.md - Security architecture and best practices
Development
Code Quality
# Type checking
uv run mypy src/
# Formatting
uv run black src/ tests/
# Linting
uv run ruff check src/ tests/
Project Scripts
# Generate wrappers from tool definitions
uv run mcp-generate
# (Optional) Generate discovery config with LLM parameter generation
uv run mcp-generate-discovery
# (Optional) Execute safe tools and infer schemas
uv run mcp-discover
# Execute a script with MCP tools available
uv run mcp-exec workspace/script.py
# Execute in sandbox mode
uv run mcp-exec workspace/script.py --sandbox
Efficiency Comparison
| Approach | Tokens | Time | Use Case |
|---|---|---|---|
| Traditional | 27,300 | N/A | All tool schemas loaded upfront |
| Scripts with CLI Args | 110 | 5 sec | Multi-step workflows (PREFERRED) |
| Script Writing | 2,000 | 2 min | Novel workflows (ALTERNATIVE) |
Scripts with CLI args achieve a 99.6% reduction against the 27,300-token baseline, above the 98.7% figure Anthropic reports for its code-execution pattern.
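The percentages follow from a simple ratio, which the short calculation below reproduces from the token counts in the table. One thing it makes visible: 2,000 tokens against the 27,300-token baseline works out to roughly 92.7%, so the 98.7% figure presumably refers to a different baseline. It matches 2,000 tokens against 150,000, which is the baseline in Anthropic's write-up as far as I recall; treat that last number as an assumption.

```python
def reduction_percent(baseline_tokens: int, tokens: int) -> float:
    """Percentage of tokens saved relative to the baseline."""
    return (1 - tokens / baseline_tokens) * 100


print(f"{reduction_percent(27_300, 110):.1f}%")     # 99.6%
print(f"{reduction_percent(27_300, 2_000):.1f}%")   # 92.7%
print(f"{reduction_percent(150_000, 2_000):.1f}%")  # 98.7% (assumed baseline)
```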
What Makes This Enhanced
Beyond Original Projects
From ipdelete/mcp-code-execution:
- Filesystem-based progressive disclosure
- Type-safe Pydantic wrappers
- Lazy server connections
- Schema discovery system
From elusznik/mcp-server-code-execution-mode:
- Container sandboxing architecture
- Security controls and policies
- Production deployment patterns
Enhanced in this project:
- CLI-based scripts: Immutable templates executed with CLI arguments (99.6% reduction)
- Multi-transport: stdio + SSE + HTTP support (all three MCP transport types)
- Dual-mode execution: Direct (fast) + Sandbox (secure)
- Python 3.11 stable: Avoiding 3.14 anyio compatibility issues
- Comprehensive testing: 129 tests covering all features
- Enhanced documentation: Complete guides for all features
Architecture Innovations
Scripts with CLI Arguments:
- Scripts are immutable templates executed with CLI arguments
- No file editing required (parameters via --query, --num-urls, etc.)
- Reusable across different queries and contexts
- Pre-tested and documented workflows
Multi-Transport:
- Single codebase supports all transport types
- Automatic transport detection
- Unified configuration format
- Seamless server connections
Dual-Mode Execution:
- Direct mode: Fast, full access (development)
- Sandbox mode: Secure, isolated (production)
- Same code, different security postures
- Runtime selection via flag or config
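"Runtime selection via flag or config" implies a precedence between the two. A minimal sketch, assuming the --sandbox flag takes priority and the "sandbox.enabled" setting in mcp_config.json acts as the default; the actual harness may resolve this differently, and `resolve_mode` is a hypothetical name.

```python
def resolve_mode(cli_args: list[str], config: dict) -> str:
    """Return "sandbox" or "direct" for one harness invocation."""
    if "--sandbox" in cli_args:
        return "sandbox"  # explicit flag wins
    if config.get("sandbox", {}).get("enabled", False):
        return "sandbox"  # otherwise fall back to mcp_config.json
    return "direct"


print(resolve_mode(["workspace/script.py", "--sandbox"], {}))   # sandbox
print(resolve_mode(["workspace/script.py"], {"sandbox": {"enabled": False}}))  # direct
```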
Configuration Reference
Minimal Configuration
{
  "mcpServers": {
    "git": {
      "command": "uvx",
      "args": ["mcp-server-git", "--repository", "."]
    }
  }
}
Complete Configuration
{
  "mcpServers": {
    "local-stdio": {
      "type": "stdio",
      "command": "uvx",
      "args": ["mcp-server-name"],
      "env": {"API_KEY": "key"},
      "disabled": false
    },
    "remote-sse": {
      "type": "sse",
      "url": "https://mcp.example.com/sse",
      "headers": {"Authorization": "Bearer KEY"},
      "disabled": false
    },
    "remote-http": {
      "type": "http",
      "url": "https://mcp.example.com/mcp",
      "headers": {"x-api-key": "KEY"},
      "disabled": false
    }
  },
  "sandbox": {
    "enabled": false,
    "runtime": "auto",
    "image": "python:3.11-slim",
    "memory_limit": "512m",
    "cpu_limit": "1.0",
    "timeout": 30,
    "max_timeout": 120
  }
}
Features
Core Features
- Lazy Loading: Servers connect only when tools are called
- Type Safety: Pydantic models for all tool inputs/outputs
- Defensive Coding: Handles variable MCP response structures
- Auto-generated Wrappers: Typed Python functions from MCP schemas
- Field Normalization: Handles inconsistent API casing
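Field normalization, the last item above, can be illustrated with a small recursive helper that rewrites keys to snake_case regardless of how a server cases them. This is a generic sketch of the technique, not the project's implementation; `to_snake_case` and `normalize_keys` are hypothetical names.

```python
import re
from typing import Any

# Zero-width boundary between a lowercase letter or digit and an uppercase letter
_BOUNDARY = re.compile(r"(?<=[a-z0-9])(?=[A-Z])")


def to_snake_case(name: str) -> str:
    """Convert camelCase, PascalCase, or kebab-case to snake_case."""
    return _BOUNDARY.sub("_", name).replace("-", "_").lower()


def normalize_keys(value: Any) -> Any:
    """Recursively rename dict keys; lists and scalars pass through."""
    if isinstance(value, dict):
        return {to_snake_case(k): normalize_keys(v) for k, v in value.items()}
    if isinstance(value, list):
        return [normalize_keys(item) for item in value]
    return value


print(normalize_keys({"maxCount": 1, "Items": [{"repoPath": "."}]}))
# {'max_count': 1, 'items': [{'repo_path': '.'}]}
```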
Enhanced Features
- Scripts Pattern: CLI-based reusable workflows
- Multi-Transport: stdio, SSE, and HTTP support
- Container Sandboxing: Optional rootless isolation
- Comprehensive Testing: 129 tests covering all features
- Complete Documentation: Guides for every feature
Examples
See the examples/ directory for:
- example_progressive_disclosure.py - Classic token reduction pattern
- example_tool_chaining.py - LLM orchestration pattern
- example_sandbox_usage.py - Container sandboxing demo
- example_sandbox_simple.py - Basic sandbox usage
See the scripts/ directory for production-ready workflows.
Troubleshooting
Common Issues
"MCP server not configured"
- Check that the server names in mcp_config.json match your calls
"Connection closed"
- Verify the server command: which <command>
- Check server logs for startup errors
"Module not found"
- Run uv run mcp-generate to regenerate wrappers
- Ensure src/ is in PYTHONPATH (harness handles this)
Import errors in scripts
- Scripts must be run via the harness (it sets PYTHONPATH)
- Incorrect: python scripts/script.py
- Correct: uv run python -m runtime.harness scripts/script.py
Python Version Issues
Python 3.14 compatibility:
- Not recommended due to anyio <4.9.0 breaking changes
- Use Python 3.11 or 3.12 for stability
- See issue tracker for updates
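A script or harness could surface the version recommendation at startup. The check below is a sketch: it treats only 3.11 and 3.12 as recommended, as this README does, and therefore also warns on 3.13, which the README does not discuss. `is_supported` is a hypothetical helper.

```python
import sys

SUPPORTED = {(3, 11), (3, 12)}  # versions this README recommends


def is_supported(version: tuple = tuple(sys.version_info[:2])) -> bool:
    """True when the interpreter's (major, minor) is a recommended version."""
    return tuple(version[:2]) in SUPPORTED


if not is_supported():
    print(
        f"Warning: Python {sys.version_info.major}.{sys.version_info.minor} "
        "is outside the recommended 3.11/3.12 range",
        file=sys.stderr,
    )
```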
Contributing
We welcome contributions! Areas of interest:
- New skills: Add more workflow templates
- MCP server support: Test with different servers
- Documentation: Improve guides and examples
- Testing: Expand test coverage
- Performance: Optimize token usage further
Development Setup
# Install with dev dependencies
uv sync --all-extras
# Run quality checks
uv run black src/ tests/
uv run mypy src/
uv run ruff check src/ tests/
uv run pytest
License
MIT License - see LICENSE file for details
References
Original Projects
- ipdelete/mcp-code-execution - Anthropic's PRIMARY pattern
- elusznik/mcp-server-code-execution-mode - Production security
Features Comparison
| Feature | Original (ipdelete) | Bridge (elusznik) | Enhanced (this) |
|---|---|---|---|
| Progressive Disclosure | PRIMARY | ALTERNATIVE | PRIMARY |
| Token Reduction | 98.7% | ~95% | 99.6% |
| Type Safety | Pydantic | Basic | Enhanced |
| Sandboxing | None | Required | Optional |
| Multi-Transport | stdio only | stdio only | stdio/SSE/HTTP |
| Scripts Pattern | None | None | Yes + examples |
| CLI Execution | None | None | Immutable |
| Test Coverage | Partial | Partial | Comprehensive |
| Python 3.11 | Yes | 3.12+ | Stable |
Use Cases
Perfect For
- AI agents needing to orchestrate multiple MCP tools
- Research workflows (web search → read → synthesize)
- Data processing pipelines (fetch → transform → output)
- Code discovery (search → analyze → recommend)
- Production deployments requiring security isolation
- Teams needing reproducible research workflows
Not Ideal For
- Single tool calls (use MCP directly instead)
- Real-time interactive tools (better suited for direct integration)
- GUI applications (command-line focused)
Getting Started Checklist
- Install Python 3.11 or 3.12 and uv
- Clone repository
- Run uv sync
- Create mcp_config.json with your MCP servers
- Run uv run mcp-generate to create wrappers
- Try a script: uv run python -m runtime.harness scripts/simple_fetch.py --url "https://example.com"
- Read AGENTS.md for operational guide
- Explore scripts/ for available workflows
- Review docs/ for detailed documentation
FAQ
Q: Why use scripts with CLI args instead of writing scripts from scratch?
A: Scripts with CLI args achieve a 99.6% token reduction versus 98.7% for script writing, and run about 24x faster (5 sec vs 2 min). They're pre-tested, documented, and immutable.
Q: Can I use this without Claude Code?
A: Yes. The core runtime (script writing, 98.7% reduction) and the scripts with CLI args (99.6% reduction) work with any AI agent. Only Skills auto-discovery in .claude/skills/ is specific to Claude Code.
Q: Can I still write custom scripts?
A: Yes! Scripts with CLI args are PREFERRED for common workflows, but custom scripts are fully supported for novel use cases.
Q: What's the difference from the original projects?
A: We merged the best of both (progressive disclosure + security), added the CLI-based scripts pattern and multi-transport support, and refined the architecture.
Q: Why Python 3.11 instead of 3.14?
A: anyio <4.9.0 has compatibility issues with Python 3.14's asyncio changes. 3.11 is stable and well-tested.
Q: Is sandboxing required?
A: No, it's optional. Use direct mode for development (fast), sandbox mode for production (secure).
Q: How do I add my own MCP servers?
A: Add them to mcp_config.json, run uv run mcp-generate, and they're ready to use!
Next Steps
- Explore scripts: ls scripts/ and cat scripts/simple_fetch.py
- Try examples: Run the example scripts or create your own
- Read CLAUDE.md: Quick operational guide (for Claude Code users)
- Review docs/: Deep dive into architecture
- Create custom script: Follow the template for your use case