MCP Code Execution - Python Runtime
November 8, 2025 ยท View on GitHub
98.7% Token Reduction through progressive tool discovery for Model Context Protocol (MCP) servers.
Overview
This runtime enables AI agents to work with MCP tools through a progressive disclosure pattern:
- Agent explores
./servers/to discover available tools - Agent reads only needed tool definitions
- Agent writes Python script to fetch data via MCP tools
- Script returns results (raw or processed) - agent can then process/summarize in subsequent turns
Result: ~98.7% reduction in tokens sent to the agent.
Features
- ๐ฆฅ Lazy Loading: Servers connect only when tools are called
- ๐ Type Safety: Pydantic models for all tool inputs/outputs
- ๐ Defensive Coding: Handles variable MCP response structures
- ๐ฆ Auto-generated Wrappers: Typed Python functions from MCP schemas
- ๐ ๏ธ Field Normalization: Handles inconsistent API casing (e.g., ADO)
Installation
Prerequisites
- Python 3.11+
- uv package manager
- Node.js (for MCP servers)
Setup
# Install uv
curl -LsSf https://astral.sh/uv/install.sh | sh
# Clone repository
git clone https://github.com/ipdelete/mcp-code-execution.git
cd mcp-code-execution
# Install dependencies
uv sync --all-extras
# Install in editable mode
uv pip install -e ".[dev]"
Quick Start
1. Configure MCP Servers
Create mcp_config.json:
{
"mcpServers": {
"git": {
"command": "uvx",
"args": ["mcp-server-git", "--repository", "."]
},
"fetch": {
"command": "uvx",
"args": ["mcp-server-fetch"]
}
}
}
2. Generate Tool Wrappers and Discover Schemas (Optional)
# Generate wrappers from inputSchema
uv run mcp-generate
If your MCP servers are missing outputSchema definitions, automatically generate them:
# Step 1: Generate discovery config with LLM-powered test parameters
uv run mcp-generate-discovery
# Step 2: Review discovery_config.json and remove/modify as needed
# Step 3: Execute safe tools and infer schemas
uv run mcp-discover
This creates typed wrappers in servers/:
servers/
git/
__init__.py
git_status.py
git_log.py
git_diff.py
discovered_types.py # Generated output schemas
fetch/
__init__.py
fetch.py
discovered_types.py
3. How It Works
When you ask an AI agent to work with your data:
- Agent explores the available MCP tools via
./servers/ - Agent writes a script that uses
call_mcp_tool()to fetch data from MCP servers - Script returns data - either raw or pre-processed depending on the use case
- Agent processes results - can summarize, reshape, or use as input for subsequent tool calls
Key insight: Not all processing needs to happen in the script. The LLM can handle summarization and data transformation in follow-up interactions. Scripts focus on efficient data retrieval.
Example script the agent might write:
"""Analyze git repository commits."""
import asyncio
from runtime.mcp_client import call_mcp_tool
async def main():
# Fetch recent commits
result = await call_mcp_tool(
"git__git_log",
{"repo_path": ".", "max_count": 10}
)
# Return data for agent to process
# Agent can then summarize, analyze, or use as input to other tools
print(f"Fetched commit log")
return result
if __name__ == "__main__":
asyncio.run(main())
4. Agent Execution
When the agent needs to run a script, it uses the harness:
# The agent runs this automatically
uv run python -m runtime.harness workspace/my_script.py
# Or via the convenience alias
uv run mcp-exec workspace/my_script.py
The harness manages the MCP client lifecycle, connects to servers, and captures the script's output to send back to the agent.
Architecture
Progressive Disclosure Pattern
Traditional Approach (High Token Usage):
Agent โ MCP Server โ [Full Data 50KB] โ Agent processes all
Progressive Disclosure (98.7% Reduction):
Agent โ Discovers tools โ Writes script
Script โ MCP Server โ [Full Data 50KB] โ Returns to Agent
Agent โ Processes/summarizes โ Uses in follow-up calls
Key Components
runtime/mcp_client.py: Lazy-loading MCP client managerruntime/harness.py: Script execution environmentruntime/generate_wrappers.py: Auto-generate typed wrappersruntime/normalize_fields.py: Handle API field casingruntime/schema_utils.py: JSON Schema โ Pydantic conversion
Development
Running Tests
# All tests
uv run pytest
# Unit tests only
uv run pytest tests/unit/
# Integration tests
uv run pytest tests/integration/
# With coverage
uv run pytest --cov=src/runtime
Code Quality
# Type checking
uv run mypy src/
# Formatting
uv run black src/ tests/
# Linting
uv run ruff check src/ tests/
# Format check
uv run black --check src/ tests/
Project Scripts
# Generate wrappers from tool definitions
uv run mcp-generate
# (Optional) Generate discovery config with LLM parameter generation
uv run mcp-generate-discovery
# (Optional) Execute safe tools and infer schemas
uv run mcp-discover
# Execute a Python script with MCP tools available
uv run mcp-exec workspace/script.py
Python-Specific Features
Type Safety with Pydantic
All tool parameters and results are validated with Pydantic:
from runtime.servers.git import git_status, GitStatusParams
# Type-safe parameters
params = GitStatusParams(repo_path=".")
# Type-safe result
result = await git_status(params)
Async/Await Support
Built on asyncio for efficient I/O:
import asyncio
from runtime.mcp_client import call_mcp_tool
async def main():
# Concurrent tool calls
results = await asyncio.gather(
call_mcp_tool("git__git_status", {"repo_path": "."}),
call_mcp_tool("git__git_log", {"repo_path": ".", "max_count": 5}),
)
asyncio.run(main())
Field Normalization
Handles inconsistent API casing automatically:
from runtime.normalize_fields import normalize_field_names
# ADO returns lowercase, expects PascalCase
ado_response = {"system.title": "Task", "custom.priority": "High"}
normalized = normalize_field_names(ado_response, "ado")
# Result: {"System.title": "Task", "Custom.priority": "High"}
Handling Missing Output Schemas
Many MCP servers don't provide outputSchema in their tool definitions, which is optional in the MCP spec. This project provides automatic schema discovery using LLM-powered parameter generation:
How It Works
-
Generate Discovery Config (
mcp-generate-discovery)- Connects to all configured MCP servers
- Uses Claude to generate sensible test parameters from
inputSchema - Classifies tools as SAFE/DANGEROUS/UNKNOWN based on patterns
- Writes
discovery_config.jsonfor review
-
Review and Edit (Manual step)
- Review the generated config
- Add/remove tools as needed
- Modify test parameters if necessary
-
Discover Schemas (
mcp-discover)- Executes safe tools with test parameters
- Infers Pydantic models from actual responses
- Writes
servers/{server}/discovered_types.py
Tool Classification
Tools are automatically classified by safety:
- SAFE: Tools matching patterns like
get_*,list_*,read_*,fetch,search_*, etc. - DANGEROUS: Tools matching patterns like
delete_*,remove_*,update_*,write_*, etc. - UNKNOWN: Tools that don't match any pattern (require manual review)
Dangerous tools are excluded from auto-discovery by default.
Example
{
"servers": {
"github": {
"safeTools": {
"search_code": {"q": "language:python", "per_page": 1},
"list_repositories": {"sort": "stars", "per_page": 1}
}
}
},
"metadata": {
"generated": true,
"generated_count": 2,
"skipped_count": 3,
"tools_skipped": {
"dangerous": ["delete_repository"],
"unknown": ["analyze_code", "deploy_release", "configure_webhook"]
}
}
}
Examples
See workspace/example_progressive_disclosure.py for a complete example demonstrating the token reduction pattern.
Contributing
# Install dev dependencies
uv sync --all-extras
# Run quality checks before committing
uv run black src/ tests/
uv run mypy src/
uv run ruff check src/ tests/
uv run pytest
License
MIT