Production Notes
February 17, 2026 · View on GitHub
WARNING: This sample is NOT production-ready. This document explains what changes are needed before deploying to production.
TL;DR
- In-memory state loses data on restart
- No authentication or rate limiting
- No health check endpoints
- See deployment checklist below
Architecture Limitations
Figure 1: In-memory sample components vs production-ready replacements
| Component | Current State | Production Requirement |
|---|---|---|
| Session Storage | InMemorySessionService (lost on restart) | Redis or database |
| Task Storage | InMemoryTaskStore (lost on restart) | Persistent task store |
| Checkout Storage | RetailStore._checkouts dict (in-memory) | PostgreSQL or similar |
| Order Storage | RetailStore._orders dict (in-memory) | PostgreSQL or similar |
| Concurrency | No locking on _checkouts | Checkout-level locks |
| Authentication | None (trusts context_id) | JWT or API key validation |
| Secrets | Plaintext .env | Secret Manager (GCP/AWS) |
Key insight: The RetailStore class (in store.py) stores all business data in plain Python dictionaries:
self._checkouts = {}— All active checkout sessionsself._orders = {}— All completed ordersself._products— Product catalog (loaded from JSON)
These are lost on restart and have no concurrency protection.
Security Gaps
Figure 2: Four critical security vulnerabilities and their required fixes
1. No Authentication
Any client can claim any context_id. In production, validate user identity from JWTs or API keys.
# Current (insecure)
user_id = context.context_id # Trusts client-provided value
# Production
user_id = validate_jwt(context.headers.get("Authorization"))
2. Profile URL Not Validated
The UCP-Agent header can point to any URL. A malicious client could point to:
- Internal services (SSRF attack)
- Slow servers (DoS the agent startup)
# Production: Whitelist allowed domains
ALLOWED_DOMAINS = ["localhost", "trusted-partner.com"]
parsed = urlparse(client_profile_url)
if parsed.netloc not in ALLOWED_DOMAINS:
raise ValueError("Untrusted profile URL")
3. No Rate Limiting
Unlimited requests are accepted. Add rate limiting per user/IP.
4. Payment Data Unencrypted
Payment instruments are stored as plaintext in session state. If logs capture state, card data leaks.
# Production: Never log payment data
logger.info(f"Processing payment for checkout {checkout_id}")
# NOT: logger.info(f"Payment data: {payment_data}")
Concurrency Issues
Race Condition in Checkout
Between start_payment() and complete_checkout(), another request could:
- Add items (changing the total)
- Change the address (affecting tax)
- Call
start_payment()again
Fix: Add checkout-level locking.
# Production pattern
async def add_to_checkout(checkout_id: str, ...):
async with self.get_lock(checkout_id):
# Safe from concurrent modifications
checkout = self._checkouts[checkout_id]
checkout.line_items.append(...)
Duplicate Items on Retry
If a network error causes the client to retry add_to_checkout, items are duplicated.
Fix: Implement idempotency via client-provided keys.
def add_to_checkout(
tool_context: ToolContext,
product_id: str,
quantity: int,
idempotency_key: str | None = None # Client-provided
) -> dict:
if idempotency_key and idempotency_key in self.processed_requests:
return self.processed_requests[idempotency_key]
# ... proceed with add ...
Deployment Checklist
Required Before Production
Infrastructure:
- Create Dockerfile with non-root user
- Add
/healthendpoint (liveness probe) - Add
/readyendpoint (readiness probe) - Configure Kubernetes manifests or Cloud Run
State Management:
- Replace
InMemorySessionServicewith Redis-backed store - Replace
InMemoryTaskStorewith persistent store - Replace
RetailStore._checkoutsdict with database (PostgreSQL) - Replace
RetailStore._ordersdict with database - Add session TTL and cleanup
Security:
- Add request authentication (JWT/API key)
- Validate profile URLs against whitelist
- Move
GOOGLE_API_KEYto Secret Manager - Add rate limiting per user
Observability:
- Add structured JSON logging
- Add Prometheus metrics endpoint
- Add request tracing (OpenTelemetry)
Reliability:
- Add checkout-level locking
- Implement idempotency for state-changing operations
- Add circuit breaker for Gemini API calls
Scaling Considerations
Session Affinity
If running multiple replicas, all requests for a session must go to the same instance (or use shared storage).
# Kubernetes: Enable session affinity
spec:
sessionAffinity: ClientIP
LLM Call Blocking
Each runner.run_async() call blocks until the LLM responds (2-5 seconds typical). With synchronous execution, you need one thread per concurrent user.
Consider: Async runner with connection pooling.
Gemini API Rate Limits
The Gemini API has rate limits. Add a circuit breaker to fail fast when limits are hit.
Sample Dockerfile
FROM python:3.13-slim
# Create non-root user
RUN useradd -m agent
WORKDIR /app
# Copy files
COPY --chown=agent:agent . .
# Install dependencies
RUN pip install uv && uv sync
# Switch to non-root user
USER agent
# Expose port
EXPOSE 10999
# Health check
HEALTHCHECK --interval=30s --timeout=5s --start-period=10s \
CMD curl -f http://localhost:10999/health || exit 1
# Run
CMD ["uv", "run", "business_agent"]
Sample Health Check Endpoint
Add to main.py:
@app.route("/health", methods=["GET"])
async def health(request: Request) -> JSONResponse:
return JSONResponse({"status": "healthy"})
@app.route("/ready", methods=["GET"])
async def ready(request: Request) -> JSONResponse:
# Check dependencies here
try:
# Verify store is accessible
_ = store.products
return JSONResponse({"status": "ready"})
except Exception as e:
return JSONResponse({"status": "not_ready", "error": str(e)}, status_code=503)
Environment Variables
Externalize these hardcoded values:
| Variable | Current | Purpose |
|---|---|---|
AGENT_HOST | localhost | Bind address |
AGENT_PORT | 10999 | Listen port |
LOG_LEVEL | INFO | Logging verbosity |
GOOGLE_API_KEY | .env file | Gemini API access |
SESSION_TTL | Unlimited | Session expiration (seconds) |
REDIS_URL | N/A | Distributed session storage |
Related Documentation
- Architecture - System design
- Testing Guide - Debugging and troubleshooting