Selectools Implementation Documentation

June 13, 2026 · View on GitHub

Version: 0.16.6 Last Updated: March 2026

Welcome to the comprehensive technical documentation for selectools - a production-ready Python framework for building AI agents with tool-calling capabilities and RAG support.

📚 Documentation Structure

Getting Started

QUICKSTART.md - Start here. Build your first agent in 5 minutes, no API key needed.

Main Documentation

ARCHITECTURE.md - Complete system overview, architecture diagrams, data flows, and design principles.

Reference

KEYS.md - Environment variables and API keys for all providers
RELEASE_GUIDE.md - PyPI release process and checklist

Module Documentation

Detailed technical documentation for each module:

AGENT.md - Agent loop, structured output, traces, reasoning, batch, policy, observer, caching
STREAMING.md - E2E streaming, parallel execution, routing mode, AgentResult, context propagation
TOOLS.md - Tool definition, validation, registry, and streaming
DYNAMIC_TOOLS.md - ToolLoader, dynamic tool loading, hot-reload, plugin systems
PARSER.md - TOOL_CALL contract and JSON extraction strategies
PROMPT.md - System prompt generation and tool schema formatting
PROVIDERS.md - LLM provider adapters, message formatting, and FallbackProvider
MEMORY.md - Conversation memory, sliding windows, and tool-pair-aware trimming
USAGE.md - Usage tracking, cost calculation, and analytics
RAG.md - Complete RAG pipeline from documents to answers
HYBRID_SEARCH.md - BM25, hybrid search, fusion methods, and reranking
ADVANCED_CHUNKING.md - Semantic and contextual document chunking
EMBEDDINGS.md - Embedding providers and semantic search
VECTOR_STORES.md - Vector database implementations
MODELS.md - 115 models across 5 providers with June 2026 pricing
GUARDRAILS.md - Input/output validation pipeline, PII redaction, topic blocking
AUDIT.md - JSONL audit logging with privacy controls
SECURITY.md - Tool output screening and coherence checking
TOOLBOX.md - 56 pre-built tools across 19 categories (file, web, data, datetime, text, code, search, and more)
EXCEPTIONS.md - Error hierarchy, exception attributes, catch patterns
SESSIONS.md - Persistent session storage with 4 backends
ENTITY_MEMORY.md - Named entity extraction and tracking
KNOWLEDGE_GRAPH.md - Relationship triple extraction and graph memory
KNOWLEDGE.md - Cross-session knowledge with daily logs and persistent facts

By Role

For Developers:

Start: ARCHITECTURE.md
Building agents: AGENT.md
Creating tools: TOOLS.md
Pre-built toolbox: TOOLBOX.md
Dynamic tools & plugins: DYNAMIC_TOOLS.md
Adding RAG: RAG.md
Error handling: EXCEPTIONS.md

For Contributors:

Adding providers: PROVIDERS.md
Adding vector stores: VECTOR_STORES.md
Understanding parser: PARSER.md

For DevOps/Production:

Cost tracking: USAGE.md
Model selection: MODELS.md
Monitoring: AGENT.md (AgentObserver + AsyncAgentObserver), and result.trace.to_otel_spans() for OpenTelemetry
Guardrails & safety: GUARDRAILS.md
Audit logging: AUDIT.md
Prompt injection defence: SECURITY.md

By Feature

Tool Calling:

AGENT.md - Orchestration
TOOLS.md - Definition
DYNAMIC_TOOLS.md - Dynamic loading, plugins, hot-reload
PARSER.md - Parsing
PROMPT.md - Prompting

RAG System:

RAG.md - Overview
HYBRID_SEARCH.md - Hybrid search & reranking
ADVANCED_CHUNKING.md - Semantic & contextual chunking
EMBEDDINGS.md - Vector generation
VECTOR_STORES.md - Storage

Security & Compliance:

GUARDRAILS.md - Input/output validation pipeline
SECURITY.md - Tool output screening & coherence checking
AUDIT.md - JSONL audit trail with privacy controls

Memory & Persistence:

SESSIONS.md - Persistent session storage
ENTITY_MEMORY.md - Entity extraction and tracking
KNOWLEDGE_GRAPH.md - Relationship triple graph
KNOWLEDGE.md - Cross-session knowledge memory

Streaming & Performance:

STREAMING.md - E2E streaming, parallel execution, routing mode

Cost Management & Caching:

USAGE.md - Tracking
MODELS.md - Pricing
AGENT.md - Response caching

📊 Documentation Stats

Total files: 25 (1 main + 24 modules)
ASCII diagrams: 30+ diagrams
Code examples: 250+ examples

🔍 Understanding the Flow

Standard Tool-Calling Flow

1. User Query
   ↓
2. INPUT GUARDRAILS validate/redact user message (PII, topic, toxicity)
   ↓
3. AGENT loads history (MEMORY) and calls PROVIDER (or FALLBACK chain)
   ↓
4. CACHE checked (if configured) → hit? Return cached response
   ↓
5. PROVIDER formats prompt (PROMPT + STRUCTURED schema) and calls LLM → CACHE stores result
   ↓
6. OUTPUT GUARDRAILS validate LLM response (format, length, toxicity)
   ↓
7. PARSER extracts TOOL_CALL from response; REASONING extracted
   ↓
8. POLICY ENGINE evaluates tool call → allow/review/deny
   ↓
8b. If review: HUMAN-IN-THE-LOOP callback → approve/reject
   ↓
9. COHERENCE CHECK verifies tool call matches user intent
   ↓
10. TOOLS validates and executes (parallel if multiple)
   ↓
11. OUTPUT SCREENING checks tool results for prompt injection
   ↓
12. TRACE records each step; AUDIT LOGGER writes JSONL; USAGE tracks costs
   ↓
13. If response_format: STRUCTURED validates → retry on failure
   ↓
14. Loop continues or returns AgentResult (with .parsed, .trace, .reasoning)

Read:

AGENT.md - Main loop
PROVIDERS.md - LLM communication
PARSER.md - Response parsing
TOOLS.md - Tool execution

RAG Flow

1. Documents
   ↓
2. LOADERS read files/PDFs
   ↓
3. CHUNKING splits into pieces
   (TextSplitter → Recursive → Semantic → Contextual)
   ↓
4. EMBEDDINGS generate vectors
   ↓
5. VECTOR_STORES persist
   ↓
6. Query → Hybrid Search (Vector + BM25) → Fusion → Rerank → Answer

Read:

RAG.md - Complete pipeline
ADVANCED_CHUNKING.md - Semantic & contextual chunking
EMBEDDINGS.md - Vector generation
VECTOR_STORES.md - Storage
HYBRID_SEARCH.md - BM25, hybrid search & reranking

🎓 Learning Path

Beginner

Follow QUICKSTART.md - Build your first agent in 5 minutes
Read ARCHITECTURE.md - Get the big picture
Read AGENT.md - Understand the core loop
Read TOOLS.md - Learn to create tools

Intermediate

Read PARSER.md - Understand parsing
Read PROVIDERS.md - Switch providers
Read MEMORY.md - Add conversations
Read USAGE.md - Track costs

Advanced

Read RAG.md - Add document search
Read HYBRID_SEARCH.md - Hybrid search & reranking
Read ADVANCED_CHUNKING.md - Semantic & contextual chunking
Read STREAMING.md - Streaming, parallel execution, routing
Read DYNAMIC_TOOLS.md - Plugin systems & hot-reload

Production / Enterprise

Read GUARDRAILS.md - Input/output validation pipeline
Read AUDIT.md - Compliance logging
Read SECURITY.md - Prompt injection defence

Memory & Persistence

Read SESSIONS.md - Persistent sessions
Read ENTITY_MEMORY.md - Entity tracking
Read KNOWLEDGE_GRAPH.md - Knowledge graphs
Read KNOWLEDGE.md - Cross-session knowledge
Build production RAG and agent systems!

💡 Key Concepts

Design Principles

Provider Agnosticism - Switch LLMs without code changes
Library-First - Composable, no framework lock-in
Production Hardened - Retries, timeouts, validation
Developer Friendly - Type hints, decorators, clear errors
Observable - AgentObserver + AsyncAgentObserver protocol, OTel span export, usage tracking
Cost Aware - Automatic tracking and warnings
Performance Optimized - Parallel tool execution, response caching, async-first design
Enterprise Secure - Guardrails, PII redaction, prompt injection screening, coherence checking, audit logging

Core Patterns

Agent Loop - Iterative tool calling until completion
Structured Output - Pydantic/JSON Schema validation with auto-retry
Execution Traces - Structured timeline on every run (result.trace)
Reasoning Visibility - Why the agent chose a tool (result.reasoning)
Provider Fallback - Priority-ordered providers with circuit breaker
Batch Processing - Concurrent multi-prompt execution
Tool Policy Engine - Declarative allow/review/deny with HITL approval
Native Tool Calling - Provider-native function calling APIs
Parallel Execution - Concurrent tool calls via asyncio.gather
Tool Calling Contract - TOOL_CALL with JSON payload
Schema Generation - Automatic from type hints
Injected Parameters - Hide secrets from LLM
Streaming - Progressive results via generators
Response Caching - LRU+TTL caching for identical LLM requests
RAG Pipeline - Load → Chunk → Embed → Store → Search
Hybrid Search - BM25 + vector fusion with optional reranking
Semantic Chunking - Embedding-based topic-boundary splitting
Contextual Chunking - LLM-enriched chunks for better retrieval
Dynamic Tool Loading - Plugin discovery, hot-reload, runtime tool management
Routing Mode - Tool selection without execution for intent classification
Guardrails Engine - Input/output content validation with block/rewrite/warn actions
Audit Logging - JSONL audit trail with privacy controls and daily rotation
Tool Output Screening - Pattern-based prompt injection detection
Coherence Checking - LLM-based intent verification for tool calls

🔧 Common Tasks

Create an Agent

Define a Tool

Use Pre-Built Tools

See: TOOLBOX.md

Handle Errors

See: EXCEPTIONS.md

Look Up Model Pricing

See: MODELS.md — Programmatic Pricing API

Add RAG

Get Structured Output

See: AGENT.md — Structured Output

See Execution Traces

See: AGENT.md — Execution Traces

Add Provider Fallback

See: PROVIDERS.md — FallbackProvider

Batch Processing

See: AGENT.md — Batch Processing

Add Tool Policies

See: AGENT.md — Tool Policy

Add Guardrails

See: GUARDRAILS.md

Add Audit Logging

Screen Tool Outputs for Injection

See: SECURITY.md — Tool Output Screening

Enable Coherence Checking

See: SECURITY.md — Coherence Checking

Monitor with AgentObserver

See: AGENT.md — AgentObserver Protocol

Export Traces to OpenTelemetry

See: AGENT.md — AgentObserver Protocol (result.trace.to_otel_spans())

Switch Providers

See: PROVIDERS.md

Add Hybrid Search

See: HYBRID_SEARCH.md

Use Advanced Chunking

See: ADVANCED_CHUNKING.md

Stream Responses

See: STREAMING.md

Load Tools Dynamically

See: DYNAMIC_TOOLS.md

Track Costs

Choose a Model

🚀 Next Steps

Read the ARCHITECTURE.md for system overview
Explore module docs based on your needs
Check the main README for quick start examples
Review the Roadmap for upcoming features

📝 Contributing

Found an error or want to improve the docs?

Check the source code in src/selectools/
Submit issues or PRs on GitHub
Follow the patterns established in existing docs

Built with ❤️ for developers who want to understand their tools.