Getting Started

December 14, 2025 · View on GitHub

This guide walks you through creating tools and using the processor.

Table of Contents


Creating Tools

CHUK Tool Processor supports multiple patterns for defining tools.

Simple Class-Based Tools

The simplest way to create a tool:

from chuk_tool_processor import create_registry

class Calculator:
    async def execute(self, operation: str, a: float, b: float) -> dict:
        ops = {"add": a + b, "multiply": a * b, "subtract": a - b}
        return {"result": ops.get(operation, 0)}

# Register with dotted name (auto-parsed to namespace="math", name="calculator")
registry = create_registry()
await registry.register_tool(Calculator, name="math.calculator")

Dotted Names for Namespacing

Dotted names are auto-parsed into namespace and tool name:

# These are equivalent:
await registry.register_tool(MyTool, name="web.fetch_user")           # Auto-parsed
await registry.register_tool(MyTool, name="fetch_user", namespace="web")  # Explicit

Benefits:

  • Logical grouping: web.*, db.*, compute.*
  • Pattern-based limits: Configure bulkheads like "db.*": 3 for all database tools
  • Namespace isolation: Scoped registries per tenant or environment

The @tool decorator also supports dotted names:

@tool(name="web.fetch_user")  # Parsed to namespace="web", name="fetch_user"
class FetchUserTool:
    async def execute(self, user_id: str) -> dict:
        return {"user_id": user_id}

Function-Based Tools

Register a plain function as a tool:

from chuk_tool_processor import register_fn_tool
from datetime import datetime
from zoneinfo import ZoneInfo

def get_current_time(timezone: str = "UTC") -> str:
    """Get the current time in the specified timezone."""
    now = datetime.now(ZoneInfo(timezone))
    return now.strftime("%Y-%m-%d %H:%M:%S %Z")

# Register the function as a tool (sync — no await needed)
register_fn_tool(get_current_time, namespace="utilities")

ValidatedTool (Pydantic Type Safety)

For production tools, use Pydantic validation:

from chuk_tool_processor import tool
from chuk_tool_processor.models import ValidatedTool
from pydantic import BaseModel, Field

@tool(name="api.weather")  # Dotted name: namespace="api", name="weather"
class WeatherTool(ValidatedTool):
    class Arguments(BaseModel):
        location: str = Field(..., description="City name")
        units: str = Field("celsius", description="Temperature units")

    class Result(BaseModel):
        temperature: float
        conditions: str

    async def _execute(self, location: str, units: str) -> Result:
        # Your weather API logic here
        return self.Result(temperature=22.5, conditions="Sunny")

Benefits of ValidatedTool:

  • Automatic argument validation
  • Type coercion (string "5" → int 5)
  • Whitespace stripping
  • Extra fields ignored
  • Clear error messages
Alternative: Using @register_tool (still works)
from chuk_tool_processor import register_tool

@register_tool(name="weather")  # Longer form, but identical functionality
class WeatherTool(ValidatedTool):
    # ... same as above

StreamingTool (Real-time Results)

For long-running operations that produce incremental results:

from chuk_tool_processor import tool
from chuk_tool_processor.models import StreamingTool
from pydantic import BaseModel

@tool(name="io.file_processor")  # Dotted name: namespace="io", name="file_processor"
class FileProcessor(StreamingTool):
    class Arguments(BaseModel):
        file_path: str

    class Result(BaseModel):
        line: int
        content: str

    async def _stream_execute(self, file_path: str):
        with open(file_path) as f:
            for i, line in enumerate(f, 1):
                yield self.Result(line=i, content=line.strip())

Consuming streaming results:

import asyncio
from chuk_tool_processor import ToolProcessor, initialize

async def main():
    await initialize()
    processor = ToolProcessor()

    async for event in processor.astream(
        '<tool name="file_processor" args=\'{"file_path":"README.md"}\'/>'
    ):
        line = event.get("line") if isinstance(event, dict) else getattr(event, "line", None)
        content = event.get("content") if isinstance(event, dict) else getattr(event, "content", None)
        print(f"Line {line}: {content}")

        # Cancel after 100 lines
        if line and line > 100:
            break  # Cleanup happens automatically

asyncio.run(main())

Using the Processor

Basic Usage

Call await initialize() once at startup to load your registry. Use context managers for automatic cleanup:

import asyncio
from chuk_tool_processor import ToolProcessor, initialize

async def main():
    await initialize()

    # Context manager automatically handles cleanup
    async with ToolProcessor() as processor:
        # Discover available tools
        tools = await processor.list_tools()
        print(f"Available tools: {tools}")

        # Process LLM output
        llm_output = '<tool name="calculator" args=\'{"operation":"add","a":2,"b":3}\'/>'
        results = await processor.process(llm_output)

        for result in results:
            if result.error:
                print(f"Error: {result.error}")
            else:
                print(f"Success: {result.result}")

    # Processor automatically cleaned up here!

asyncio.run(main())

Production Configuration

from chuk_tool_processor import ToolProcessor, initialize
import asyncio

async def main():
    await initialize()

    async with ToolProcessor(
        # Execution settings
        default_timeout=30.0,
        max_concurrency=20,

        # Production features
        enable_caching=True,
        cache_ttl=600,
        enable_rate_limiting=True,
        global_rate_limit=100,
        enable_retries=True,
        max_retries=3
    ) as processor:
        results = await processor.process(llm_output)

    # Automatic cleanup on exit

asyncio.run(main())

Advanced Production Features

Circuit Breaker Pattern

Prevent cascading failures by automatically opening circuits for failing tools:

from chuk_tool_processor import ToolProcessor

processor = ToolProcessor(
    enable_circuit_breaker=True,
    circuit_breaker_threshold=5,      # Open after 5 failures
    circuit_breaker_timeout=60.0,     # Try recovery after 60s
)

Circuit states: CLOSED → OPEN → HALF_OPEN → CLOSED

StateBehavior
CLOSEDNormal operation
OPENBlocking requests (too many failures)
HALF_OPENTesting recovery with limited requests

How it works:

  1. Tool fails repeatedly (hits threshold)
  2. Circuit opens → requests blocked immediately
  3. After timeout, circuit enters HALF_OPEN
  4. If test requests succeed → circuit closes
  5. If test requests fail → back to OPEN

Idempotency Keys

Automatically deduplicate LLM tool calls using SHA256-based keys:

from chuk_tool_processor.models.tool_call import ToolCall

# Idempotency keys are auto-generated
call1 = ToolCall(tool="search", arguments={"query": "Python"})
call2 = ToolCall(tool="search", arguments={"query": "Python"})

# Same arguments = same idempotency key
assert call1.idempotency_key == call2.idempotency_key

# Used automatically by caching layer
processor = ToolProcessor(enable_caching=True)
results1 = await processor.process([call1])  # Executes
results2 = await processor.process([call2])  # Cache hit!

Benefits:

  • Prevents duplicate executions from LLM retries
  • Deterministic cache keys
  • No manual key management needed

Tool Schema Export

Export tool definitions to multiple formats for LLM prompting:

from chuk_tool_processor.models.tool_spec import ToolSpec
from chuk_tool_processor.models.validated_tool import ValidatedTool
from pydantic import BaseModel, Field

@register_tool(name="weather")
class WeatherTool(ValidatedTool):
    """Get current weather for a location."""

    class Arguments(BaseModel):
        location: str = Field(..., description="City name")

    class Result(BaseModel):
        temperature: float
        conditions: str

# Generate tool spec
spec = ToolSpec.from_validated_tool(WeatherTool)

# Export to different formats
openai_format = spec.to_openai()       # For OpenAI function calling
anthropic_format = spec.to_anthropic() # For Claude tools
mcp_format = spec.to_mcp()             # For MCP servers

Example OpenAI format:

{
  "type": "function",
  "function": {
    "name": "weather",
    "description": "Get current weather for a location.",
    "parameters": {
      "type": "object",
      "properties": {
        "location": {"type": "string", "description": "City name"}
      },
      "required": ["location"]
    }
  }
}

Machine-Readable Error Codes

Structured error handling with error codes for programmatic responses:

from chuk_tool_processor.core.exceptions import (
    ErrorCode,
    ToolNotFoundError,
    ToolTimeoutError,
    ToolCircuitOpenError,
)

try:
    results = await processor.process(llm_output)
except ToolNotFoundError as e:
    if e.code == ErrorCode.TOOL_NOT_FOUND:
        available = e.details.get("available_tools", [])
        print(f"Try one of: {available}")
except ToolTimeoutError as e:
    if e.code == ErrorCode.TOOL_TIMEOUT:
        timeout = e.details["timeout"]
        print(f"Tool timed out after {timeout}s")
except ToolCircuitOpenError as e:
    if e.code == ErrorCode.TOOL_CIRCUIT_OPEN:
        reset_time = e.details.get("reset_timeout")
        print(f"Service unavailable, retry in {reset_time}s")

# All errors include .to_dict() for logging
error_dict = e.to_dict()

Available error codes:

CodeDescription
TOOL_NOT_FOUNDTool doesn't exist in registry
TOOL_EXECUTION_FAILEDTool execution error
TOOL_TIMEOUTTool exceeded timeout
TOOL_CIRCUIT_OPENCircuit breaker is open
TOOL_RATE_LIMITEDRate limit exceeded
TOOL_VALIDATION_ERRORArgument validation failed
MCP_CONNECTION_FAILEDMCP server unreachable

See ERRORS.md for the complete error taxonomy.

LLM-Friendly Argument Coercion

ValidatedTool automatically coerces LLM outputs to correct types:

from chuk_tool_processor.models.validated_tool import ValidatedTool
from pydantic import BaseModel

class SearchTool(ValidatedTool):
    class Arguments(BaseModel):
        query: str
        limit: int = 10
        category: str = "all"

# LLM outputs often have quirks:
llm_output = {
    "query": "  Python tutorials  ",  # Extra whitespace
    "limit": "5",                      # String instead of int
    "unknown_field": "ignored"         # Extra field
}

# ValidatedTool automatically coerces and validates
tool = SearchTool()
result = await tool.execute(**llm_output)
# ✅ Works! Whitespace stripped, "5" → 5, extra field ignored

Coercion features:

  • str_strip_whitespace=True → Remove accidental whitespace
  • extra="ignore" → Ignore unknown fields
  • use_enum_values=True → Convert enums to values
  • Type coercion (string to int, etc.)

Next Steps