Middleware System Architecture

February 14, 2026

Overview

NeuroLink's middleware system lets you intercept, modify, and enhance AI requests and responses. With it you can implement cross-cutting concerns such as authentication, logging, analytics, content filtering, and auto-evaluation without modifying your core application logic.

Why Middleware Matters:

  • Request Interception: Modify requests before they reach the AI provider
  • Response Processing: Transform, filter, or validate AI responses
  • Cross-Cutting Concerns: Implement authentication, logging, rate limiting, and caching in a centralized way
  • Composability: Chain multiple middleware components together
  • Separation of Concerns: Keep business logic separate from infrastructure concerns

Key Benefits:

  • Production-ready middleware for common use cases (analytics, guardrails, auto-evaluation)
  • Factory pattern for easy middleware management
  • Priority-based execution ordering
  • Provider-specific conditional execution
  • Built on top of Vercel AI SDK's middleware system

Architecture Diagram

┌─────────────────────────────────────────────────────────────────┐
│                        Request Flow                              │
└─────────────────────────────────────────────────────────────────┘

  Client Request

       ├─────────────────────────────────────────────┐
       │                                             │
       v                                             │
┌──────────────────────┐                            │
│  MiddlewareFactory   │                            │
│  - Registry          │                            │
│  - Configuration     │                            │
└──────────────────────┘                            │
       │                                             │
       v                                             │
┌─────────────────────────────────────────┐         │
│    Pre-Request Middleware Chain          │         │
│  (Ordered by Priority - High to Low)    │         │
├─────────────────────────────────────────┤         │
│  1. transformParams (Guardrails)        │         │
│     - Precall evaluation                │         │
│     - Input validation                  │         │
│     - Request transformation            │         │
└─────────────────────────────────────────┘         │
       │                                             │
       v                                             │
┌─────────────────────────────────────────┐         │
│         Provider Execution               │         │
│    (OpenAI, Anthropic, Vertex, etc.)    │         │
└─────────────────────────────────────────┘         │
       │                                             │
       v                                             │
┌─────────────────────────────────────────┐         │
│   Post-Response Middleware Chain         │         │
│  (Ordered by Priority - High to Low)    │         │
├─────────────────────────────────────────┤         │
│  2. wrapGenerate/wrapStream             │         │
│     - Analytics (Priority: 100)         │         │
│     - Guardrails (Priority: 90)         │         │
│     - Auto-Evaluation (Priority: 90)    │         │
└─────────────────────────────────────────┘         │
       │                                             │
       v                                             │
  Client Response                                    │

┌─────────────────────────────────────────┐         │
│          Error Handling Flow             │         │
│    (If error occurs at any stage)       │◄────────┘
├─────────────────────────────────────────┤
│  - Error Middleware Chain               │
│  - Error logging                        │
│  - Fallback handling                    │
│  - Retry logic (if configured)          │
└─────────────────────────────────────────┘

       v
  Error Response

Request Lifecycle

The middleware system processes requests through four distinct phases:

Phase 1: Pre-Request (transformParams)

Middleware in this phase runs before the AI provider call, allowing you to:

  • Validate input: Check request parameters for validity
  • Authenticate/Authorize: Verify user permissions
  • Transform requests: Modify or enrich request parameters
  • Apply guardrails: Block requests with unsafe content using precall evaluation
  • Rate limiting: Enforce request quotas

Example Use Cases:

  • Precall guardrails evaluation (blocking unsafe prompts)
  • Request parameter validation
  • Adding authentication context
  • Modifying prompts based on user preferences

transformParams: async ({ params }) => {
  // Pre-request logic here
  console.log("Request received:", params);

  // Can modify params before they reach the provider
  return {
    ...params,
    // Cap temperature at 1.0; ?? keeps an explicit 0 instead of replacing it with the default
    temperature: Math.min(params.temperature ?? 0.7, 1.0),
  };
};
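
As a rough illustration of the precall guardrail idea, the sketch below rejects a request whose prompt contains a configured word. The `CallParams` shape and the `containsBadWord` and `guardTransformParams` helpers are hypothetical names for this example, not part of NeuroLink's API; only the `badWords` option name comes from the configuration examples in this document.

```typescript
// Hypothetical, simplified request shape; real call options carry more fields.
type CallParams = { prompt: string; temperature?: number };

// Case-insensitive substring match against a configured word list.
function containsBadWord(text: string, badWords: string[]): boolean {
  const lowered = text.toLowerCase();
  return badWords.some((word) => lowered.includes(word.toLowerCase()));
}

// Builds a transformParams-style hook that rejects unsafe prompts before
// they reach the provider and passes safe ones through unchanged.
function guardTransformParams(badWords: string[]) {
  return async ({ params }: { params: CallParams }): Promise<CallParams> => {
    if (containsBadWord(params.prompt, badWords)) {
      throw new Error("Request blocked by precall guardrail");
    }
    return params;
  };
}
```

Throwing from the hook is one way to stop the request; a real guardrail might instead return a sanitized prompt or call an evaluation model.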

Phase 2: Provider Execution

The actual AI provider call happens between middleware phases:

  • Request sent to configured provider (OpenAI, Anthropic, Vertex, etc.)
  • Provider processes the request
  • Response received from provider

This phase is not middleware; it is the core AI operation that the middleware wraps.

Phase 3: Post-Response (wrapGenerate/wrapStream)

Middleware in this phase runs after the AI provider responds, allowing you to:

  • Collect analytics: Track token usage, response times, costs
  • Filter content: Apply guardrails to block/redact unsafe responses
  • Evaluate quality: Auto-evaluate response quality and trigger retries
  • Transform responses: Modify or enrich the response
  • Cache results: Store responses for future use

Example Use Cases:

  • Analytics and metrics collection
  • Content filtering and safety checks
  • Response quality evaluation
  • Response caching
  • Logging and auditing

wrapGenerate: async ({ doGenerate, params }) => {
  const startTime = Date.now();

  // Execute the provider call
  const result = await doGenerate();

  // Post-response logic here
  const responseTime = Date.now() - startTime;
  console.log(`Response in ${responseTime}ms`);

  return result;
};
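
Content filtering in this phase could look roughly like the following sketch, which masks configured words in the response text. The `GenerateResult` shape and the `redact` and `redactingWrapGenerate` helpers are assumptions made for illustration; the built-in guardrails middleware may work differently.

```typescript
// Hypothetical, simplified result shape; real provider results carry more fields.
type GenerateResult = { text: string };

// Replaces each configured word (case-insensitive) with asterisks of the same length.
function redact(text: string, badWords: string[]): string {
  return badWords.reduce((current, word) => {
    // Escape regex metacharacters so each word is matched literally.
    const escaped = word.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
    return current.replace(new RegExp(escaped, "gi"), "*".repeat(word.length));
  }, text);
}

// Builds a wrapGenerate-style hook that filters the provider's response text.
function redactingWrapGenerate(badWords: string[]) {
  return async ({
    doGenerate,
  }: {
    doGenerate: () => Promise<GenerateResult>;
  }): Promise<GenerateResult> => {
    const result = await doGenerate();
    return { ...result, text: redact(result.text, badWords) };
  };
}
```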

Phase 4: Error Handling

If an error occurs at any stage, error handling middleware can:

  • Log errors: Record error details for debugging
  • Transform errors: Convert provider errors to user-friendly messages
  • Implement fallbacks: Retry with different providers
  • Alert monitoring: Send alerts to monitoring systems

Example Use Cases:

  • Error logging and tracking
  • Provider fallback on failure
  • Retry logic with exponential backoff
  • User-friendly error messages
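
The retry use case above can be sketched as a `wrapGenerate`-style hook. This is a minimal illustration under stated assumptions: `retryingWrapGenerate`, `backoffDelay`, `maxRetries`, and `baseMs` are hypothetical names, and NeuroLink's own retry handling is not shown in this document.

```typescript
// Delay doubles with each attempt: base, 2 * base, 4 * base, ...
function backoffDelay(attempt: number, baseMs: number): number {
  return baseMs * 2 ** attempt;
}

const sleep = (ms: number) => new Promise<void>((resolve) => setTimeout(resolve, ms));

// Builds a wrapGenerate-style hook that retries a failing provider call.
// maxRetries and baseMs are hypothetical options for this sketch.
function retryingWrapGenerate<T>(maxRetries: number, baseMs: number) {
  return async ({ doGenerate }: { doGenerate: () => Promise<T> }): Promise<T> => {
    let lastError: unknown;
    for (let attempt = 0; attempt <= maxRetries; attempt++) {
      try {
        return await doGenerate();
      } catch (error) {
        lastError = error;
        console.error(`Attempt ${attempt + 1} failed:`, error);
        // Wait before the next attempt, but not after the final one.
        if (attempt < maxRetries) {
          await sleep(backoffDelay(attempt, baseMs));
        }
      }
    }
    // All attempts failed: re-throw so callers still see the error.
    throw lastError;
  };
}
```

Re-throwing the last error keeps the failure visible to any outer middleware or application-level handler.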

Middleware Chain

Execution Order

Middleware executes in priority order, where higher priority values run first:

Priority 100: Analytics (runs first)
Priority 90:  Guardrails
Priority 90:  Auto-Evaluation (runs last among same priority)

Important Notes:

  • transformParams runs before wrapGenerate/wrapStream
  • Within the same priority, registration order determines execution
  • Middleware can be conditionally enabled based on provider, model, or custom logic
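
The ordering rule can be illustrated with a short sketch. It is not NeuroLink's implementation: the `Entry` type, the `sortByPriority` helper, the default priority of 0, and the `autoEvaluation` ID are assumptions for this example.

```typescript
// Minimal stand-in for the metadata fields that affect ordering.
type Entry = { id: string; priority?: number };

// Sort by descending priority. Array.prototype.sort is stable (ES2019+),
// so entries with equal priority keep their registration order.
function sortByPriority(entries: Entry[], defaultPriority = 0): string[] {
  return [...entries]
    .sort((a, b) => (b.priority ?? defaultPriority) - (a.priority ?? defaultPriority))
    .map((entry) => entry.id);
}

// Registration order: guardrails, autoEvaluation, analytics.
const order = sortByPriority([
  { id: "guardrails", priority: 90 },
  { id: "autoEvaluation", priority: 90 },
  { id: "analytics", priority: 100 },
]);

console.log(order.join(", ")); // prints "analytics, guardrails, autoEvaluation"
```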

Chain Configuration

Configure which middleware to enable and their order:

import { MiddlewareFactory } from "@juspay/neurolink";

const factory = new MiddlewareFactory({
  // Use a preset for common configurations
  preset: "all", // Enables analytics + guardrails

  // Or explicitly enable specific middleware
  enabledMiddleware: ["analytics", "guardrails"],

  // Or configure each middleware individually
  middlewareConfig: {
    analytics: {
      enabled: true,
      config: { collectTokenUsage: true },
    },
    guardrails: {
      enabled: true,
      config: {
        badWords: ["prohibited", "blocked"],
        precallEvaluation: { enabled: true },
      },
    },
  },
});

Available Presets

Preset     Middleware Enabled      Use Case
default    Analytics only          Basic usage tracking
all        Analytics + Guardrails  Production with safety
security   Guardrails only         Security-focused
Custom     Your choice             Define your own

Factory Pattern

MiddlewareFactory Class

The MiddlewareFactory is the central component for managing middleware:

class MiddlewareFactory {
  // Public registry for middleware management
  public registry: MiddlewareRegistry;

  // Available presets
  public presets: Map<string, MiddlewarePreset>;

  // Constructor
  constructor(options?: MiddlewareFactoryOptions);

  // Register custom middleware
  register(
    middleware: NeuroLinkMiddleware,
    options?: RegistrationOptions,
  ): void;

  // Register a preset
  registerPreset(preset: MiddlewarePreset, replace?: boolean): void;

  // Apply middleware to a language model
  applyMiddleware(
    model: LanguageModelV1,
    context: MiddlewareContext,
    options?: MiddlewareFactoryOptions,
  ): LanguageModelV1;

  // Create middleware context
  createContext(
    provider: string,
    model: string,
    options?: Record<string, unknown>,
    session?: { sessionId?: string; userId?: string },
  ): MiddlewareContext;

  // Validate middleware configuration
  validateConfig(config: Record<string, MiddlewareConfig>): ValidationResult;

  // Get available presets
  getAvailablePresets(): PresetInfo[];

  // Get middleware chain statistics
  getChainStats(
    context: MiddlewareContext,
    config: Record<string, MiddlewareConfig>,
  ): MiddlewareChainStats;
}

Creating Middleware Instances

Basic Usage:

import { MiddlewareFactory } from "@juspay/neurolink";

// Create factory with default preset (analytics enabled)
const factory = new MiddlewareFactory();

// Create context
const context = factory.createContext("openai", "gpt-4", { temperature: 0.7 });

// Apply middleware to a model
const wrappedModel = factory.applyMiddleware(baseModel, context);

Advanced Configuration:

import { MiddlewareFactory } from "@juspay/neurolink";

// Create factory with custom configuration
const factory = new MiddlewareFactory({
  preset: "all",
  middlewareConfig: {
    analytics: {
      enabled: true,
      config: {
        collectTokenUsage: true,
        collectTiming: true,
      },
    },
    guardrails: {
      enabled: true,
      config: {
        badWords: ["unsafe", "prohibited"],
        precallEvaluation: {
          enabled: true,
          provider: "openai",
          model: "gpt-4",
        },
      },
      conditions: {
        providers: ["openai", "anthropic"], // Only apply to specific providers
      },
    },
  },
});

// Or register custom middleware after instantiation.
// createMyCustomMiddleware is a placeholder for your own factory function.
const customMiddleware = createMyCustomMiddleware();
factory.register(customMiddleware);
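
For orientation, here is one possible shape for a custom middleware factory such as `createMyCustomMiddleware`. The types are local, simplified stand-ins for the interfaces described in this document, and the `log` parameter is an assumption added so the sketch runs without installing anything.

```typescript
// Local, simplified stand-ins for the metadata and middleware types.
type Metadata = {
  id: string;
  name: string;
  description?: string;
  priority?: number;
  defaultEnabled?: boolean;
};
type SketchMiddleware<R> = {
  metadata: Metadata;
  wrapGenerate: (options: {
    doGenerate: () => Promise<R>;
    params: Record<string, unknown>;
  }) => Promise<R>;
};

// Hypothetical factory: returns a middleware that logs how long each call took.
function createMyCustomMiddleware<R>(
  log: (message: string) => void = console.log,
): SketchMiddleware<R> {
  return {
    metadata: {
      id: "my-custom-middleware",
      name: "My Custom Middleware",
      description: "Logs the duration of each generate call",
      priority: 50,
      defaultEnabled: false,
    },
    wrapGenerate: async ({ doGenerate }) => {
      const start = Date.now();
      try {
        return await doGenerate();
      } finally {
        // Runs whether the provider call succeeded or threw.
        log(`generate finished in ${Date.now() - start}ms`);
      }
    },
  };
}
```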

Registry System

Registering Middleware

The MiddlewareRegistry manages all registered middleware:

class MiddlewareRegistry {
  // Register a middleware
  register(
    middleware: NeuroLinkMiddleware,
    options?: MiddlewareRegistrationOptions,
  ): void;

  // Unregister a middleware
  unregister(middlewareId: string): boolean;

  // Get a registered middleware
  get(middlewareId: string): NeuroLinkMiddleware | undefined;

  // List all registered middleware
  list(): NeuroLinkMiddleware[];

  // Get middleware IDs sorted by priority
  getSortedIds(): string[];

  // Build middleware chain based on configuration
  buildChain(
    context: MiddlewareContext,
    config?: Record<string, MiddlewareConfig>,
  ): LanguageModelV1Middleware[];

  // Get execution statistics
  getExecutionStats(middlewareId: string): MiddlewareExecutionResult[];

  // Get aggregated statistics for all middleware
  getAggregatedStats(): Record<string, MiddlewareStats>;

  // Clear execution statistics
  clearStats(middlewareId?: string): void;

  // Check if middleware is registered
  has(middlewareId: string): boolean;

  // Get number of registered middleware
  size(): number;

  // Clear all registered middleware
  clear(): void;
}

Registration Example:

import { MiddlewareFactory } from "@juspay/neurolink";

const factory = new MiddlewareFactory();

// Register middleware with options
factory.register(myCustomMiddleware, {
  replace: false, // Error if already exists
  defaultEnabled: true, // Enable by default
  globalConfig: {
    // Global configuration
    logLevel: "debug",
  },
});

Discovering Middleware

List all registered middleware:

const allMiddleware = factory.registry.list();
console.log(
  "Registered middleware:",
  allMiddleware.map((m) => m.metadata.id),
);

Get specific middleware:

const analytics = factory.registry.get("analytics");
if (analytics) {
  console.log("Analytics middleware found:", analytics.metadata.name);
}

Check if middleware is registered:

if (factory.registry.has("guardrails")) {
  console.log("Guardrails middleware is available");
}

Middleware Metadata

Every middleware must provide metadata:

type NeuroLinkMiddlewareMetadata = {
  // Unique identifier
  id: string;

  // Human-readable name
  name: string;

  // Description of what this middleware does
  description?: string;

  // Execution priority (higher runs first)
  priority?: number;

  // Whether this middleware is enabled by default
  defaultEnabled?: boolean;
};

Example:

const metadata: NeuroLinkMiddlewareMetadata = {
  id: "my-custom-middleware",
  name: "My Custom Middleware",
  description: "Logs all requests and responses",
  priority: 50, // Runs after analytics (100), guardrails (90), and auto-evaluation (90)
  defaultEnabled: false, // Require explicit enabling
};

TypeScript Interfaces

NeuroLinkMiddleware

The core middleware interface that combines AI SDK middleware with metadata:

import type { LanguageModelV1Middleware } from "ai";

type NeuroLinkMiddleware = LanguageModelV1Middleware & {
  // Metadata about this middleware
  metadata: NeuroLinkMiddlewareMetadata;
};

LanguageModelV1Middleware (from AI SDK)

The underlying middleware interface from Vercel AI SDK, shown here in simplified form:

type LanguageModelV1Middleware = {
  // Transform request parameters before provider call
  transformParams?: (options: {
    params: LanguageModelV1CallOptions;
  }) => PromiseLike<LanguageModelV1CallOptions>;

  // Wrap generate() calls
  wrapGenerate?: (options: {
    doGenerate: () => PromiseLike<LanguageModelV1CallResult>;
    params: LanguageModelV1CallOptions;
  }) => PromiseLike<LanguageModelV1CallResult>;

  // Wrap stream() calls
  wrapStream?: (options: {
    doStream: () => PromiseLike<LanguageModelV1StreamResult>;
    params: LanguageModelV1CallOptions;
  }) => PromiseLike<LanguageModelV1StreamResult>;
};
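
To see how several `wrapGenerate` hooks nest around one provider call, consider the sketch below. It assumes the first middleware in the chain becomes the outermost wrapper, which is a common convention but should be checked against the actual library behavior; the `Wrap` type and both helpers are hypothetical.

```typescript
// Simplified hook type: the result is a plain string for brevity.
type Wrap = (options: { doGenerate: () => Promise<string> }) => Promise<string>;

// Nest the hooks so the first entry in the chain is the outermost wrapper:
// it sees the request first and the response last.
function composeWrapGenerate(
  chain: Wrap[],
  callProvider: () => Promise<string>,
): () => Promise<string> {
  return chain.reduceRight<() => Promise<string>>(
    (next, wrap) => () => wrap({ doGenerate: next }),
    callProvider,
  );
}

const trace: string[] = [];

// Creates a hook that records when it runs relative to the provider call.
const tracing = (name: string): Wrap => async ({ doGenerate }) => {
  trace.push(`${name}:before`);
  const result = await doGenerate();
  trace.push(`${name}:after`);
  return result;
};

const run = composeWrapGenerate([tracing("analytics"), tracing("guardrails")], async () => {
  trace.push("provider");
  return "response";
});

const finished = run();
finished.then(() => console.log(trace.join(" -> ")));
// prints "analytics:before -> guardrails:before -> provider -> guardrails:after -> analytics:after"
```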

MiddlewareContext

Context information passed to middleware:

type MiddlewareContext = {
  // Provider name (e.g., "openai", "anthropic")
  provider: string;

  // Model name (e.g., "gpt-4", "claude-3-5-sonnet")
  model: string;

  // Additional options
  options: Record<string, unknown>;

  // Session information
  session?: {
    sessionId?: string;
    userId?: string;
  };

  // Request metadata
  metadata: {
    timestamp: number;
    requestId: string;
  };
};
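
A context object of this shape could be assembled as in the following sketch. `createContextSketch` is a hypothetical stand-in for `factory.createContext`, and using a UUID for `requestId` is an assumption; the document does not specify how the real ID is generated.

```typescript
import { randomUUID } from "node:crypto";

// Local copy of the context shape described above.
type MiddlewareContext = {
  provider: string;
  model: string;
  options: Record<string, unknown>;
  session?: { sessionId?: string; userId?: string };
  metadata: { timestamp: number; requestId: string };
};

// Hypothetical stand-in for factory.createContext: fills in request metadata
// and only attaches session information when it is provided.
function createContextSketch(
  provider: string,
  model: string,
  options: Record<string, unknown> = {},
  session?: { sessionId?: string; userId?: string },
): MiddlewareContext {
  return {
    provider,
    model,
    options,
    ...(session ? { session } : {}),
    metadata: { timestamp: Date.now(), requestId: randomUUID() },
  };
}

const context = createContextSketch("openai", "gpt-4", { temperature: 0.7 });
console.log(context.provider, context.model, context.metadata.requestId);
```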

MiddlewareConfig

Configuration for individual middleware:

type MiddlewareConfig = {
  // Whether this middleware is enabled
  enabled: boolean;

  // Middleware-specific configuration
  config?: Record<string, unknown>;

  // Conditions for when this middleware should run
  conditions?: {
    // Only run for specific providers
    providers?: string[];

    // Only run for specific models
    models?: string[];

    // Only run when options match
    options?: Record<string, unknown>;

    // Custom condition function
    custom?: (context: MiddlewareContext) => boolean;
  };
};

MiddlewareFactoryOptions

Options for creating and configuring the factory:

type MiddlewareFactoryOptions = {
  // Preset to use (e.g., "default", "all", "security")
  preset?: string;

  // Custom middleware to register
  middleware?: NeuroLinkMiddleware[];

  // Configuration for each middleware
  middlewareConfig?: Record<string, MiddlewareConfig>;

  // List of middleware IDs to enable
  enabledMiddleware?: string[];

  // List of middleware IDs to disable
  disabledMiddleware?: string[];
};
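
How these options might combine into a final set of enabled middleware is sketched below. The preset contents come from the Available Presets table; the precedence (preset first, then `enabledMiddleware`, with `disabledMiddleware` applied last) and the `resolveEnabled` helper are assumptions for illustration only.

```typescript
type FactoryOptionsSketch = {
  preset?: string;
  enabledMiddleware?: string[];
  disabledMiddleware?: string[];
};

// Preset contents as listed in the Available Presets table.
const PRESETS: Record<string, string[]> = {
  default: ["analytics"],
  all: ["analytics", "guardrails"],
  security: ["guardrails"],
};

// Assumed precedence: start from the preset, add explicit enables,
// then remove explicit disables so that a disable always wins.
function resolveEnabled(options: FactoryOptionsSketch = {}): string[] {
  const enabled = new Set<string>(PRESETS[options.preset ?? "default"] ?? []);
  for (const id of options.enabledMiddleware ?? []) enabled.add(id);
  for (const id of options.disabledMiddleware ?? []) enabled.delete(id);
  return Array.from(enabled);
}

console.log(resolveEnabled({ preset: "all", disabledMiddleware: ["analytics"] }).join(", "));
// prints "guardrails"
```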

MiddlewareChainStats

Statistics about middleware execution:

type MiddlewareChainStats = {
  // Total middleware in chain
  totalMiddleware: number;

  // Number of middleware actually applied
  appliedMiddleware: number;

  // Total execution time across all middleware
  totalExecutionTime: number;

  // Per-middleware execution results
  results: Record<string, MiddlewareExecutionResult>;
};

type MiddlewareExecutionResult = {
  // Whether middleware was applied
  applied: boolean;

  // Execution time in milliseconds
  executionTime: number;

  // Error if execution failed
  error?: Error;
};

Conditional Execution

Middleware can be configured to run only under specific conditions:

Provider-Specific Middleware

factory.applyMiddleware(model, context, {
  middlewareConfig: {
    guardrails: {
      enabled: true,
      conditions: {
        providers: ["openai", "anthropic"], // Only for these providers
      },
    },
  },
});

Model-Specific Middleware

factory.applyMiddleware(model, context, {
  middlewareConfig: {
    analytics: {
      enabled: true,
      conditions: {
        models: ["gpt-4", "claude-3-5-sonnet"], // Only for these models
      },
    },
  },
});

Custom Conditions

factory.applyMiddleware(model, context, {
  middlewareConfig: {
    myMiddleware: {
      enabled: true,
      conditions: {
        custom: (context) => {
          // Only run during business hours (09:00 to 16:59 local time)
          const hour = new Date().getHours();
          return hour >= 9 && hour < 17;
        },
      },
    },
  },
});
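
The condition types can be combined into a single check, sketched below. It assumes every condition that is present must pass (logical AND) and that absent conditions are ignored; the `shouldApply` helper and the local types are hypothetical and may not match how the registry evaluates conditions.

```typescript
type Ctx = { provider: string; model: string; options: Record<string, unknown> };
type Conditions = {
  providers?: string[];
  models?: string[];
  options?: Record<string, unknown>;
  custom?: (context: Ctx) => boolean;
};

// Assumption for this sketch: all present conditions must pass.
function shouldApply(context: Ctx, conditions?: Conditions): boolean {
  if (!conditions) return true;
  if (conditions.providers && !conditions.providers.includes(context.provider)) return false;
  if (conditions.models && !conditions.models.includes(context.model)) return false;
  if (conditions.options) {
    // Every listed option must match the context value exactly.
    for (const [key, expected] of Object.entries(conditions.options)) {
      if (context.options[key] !== expected) return false;
    }
  }
  if (conditions.custom && !conditions.custom(context)) return false;
  return true;
}

const ctx: Ctx = { provider: "openai", model: "gpt-4", options: { stream: true } };
console.log(shouldApply(ctx, { providers: ["openai", "anthropic"] })); // prints true
console.log(shouldApply(ctx, { providers: ["vertex"] })); // prints false
```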

Performance Monitoring

Execution Statistics

Track middleware performance:

// Get stats for specific middleware
const analyticsStats = factory.registry.getExecutionStats("analytics");
console.log("Analytics executions:", analyticsStats);

// Get aggregated stats for all middleware
const allStats = factory.registry.getAggregatedStats();
console.log("All middleware stats:", allStats);

Output Example:

{
  analytics: {
    totalExecutions: 1000,
    successfulExecutions: 998,
    failedExecutions: 2,
    averageExecutionTime: 2.5, // milliseconds
    lastExecutionTime: 2.3
  },
  guardrails: {
    totalExecutions: 1000,
    successfulExecutions: 950,
    failedExecutions: 50,
    averageExecutionTime: 15.2,
    lastExecutionTime: 14.8
  }
}
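
Aggregated figures like these can be derived from per-call results. The sketch below shows one way to do it, assuming a call counts as failed when its result carries an `error`; the `aggregate` helper and the `Stats` type name are hypothetical.

```typescript
type ExecutionResult = { applied: boolean; executionTime: number; error?: Error };
type Stats = {
  totalExecutions: number;
  successfulExecutions: number;
  failedExecutions: number;
  averageExecutionTime: number;
  lastExecutionTime: number;
};

// Folds a list of per-call results into summary statistics.
function aggregate(results: ExecutionResult[]): Stats {
  const failed = results.filter((r) => r.error !== undefined).length;
  const totalTime = results.reduce((sum, r) => sum + r.executionTime, 0);
  const last = results[results.length - 1];
  return {
    totalExecutions: results.length,
    successfulExecutions: results.length - failed,
    failedExecutions: failed,
    // Avoid dividing by zero when nothing has run yet.
    averageExecutionTime: results.length > 0 ? totalTime / results.length : 0,
    lastExecutionTime: last ? last.executionTime : 0,
  };
}

const stats = aggregate([
  { applied: true, executionTime: 2 },
  { applied: true, executionTime: 4, error: new Error("timeout") },
]);
console.log(stats.averageExecutionTime); // prints 3
```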

Clear Statistics

// Clear stats for specific middleware
factory.registry.clearStats("analytics");

// Clear all stats
factory.registry.clearStats();

Best Practices

1. Order Middleware by Priority

// Security first (highest priority)
// Analytics for all requests
// Evaluation last (lowest priority)

const securityMiddleware = {
  metadata: { id: "security", name: "Security", priority: 100 },
};

const analyticsMiddleware = {
  metadata: { id: "analytics", name: "Analytics", priority: 90 },
};

const evaluationMiddleware = {
  metadata: { id: "evaluation", name: "Evaluation", priority: 80 },
};

2. Handle Errors Gracefully

wrapGenerate: async ({ doGenerate }) => {
  try {
    const result = await doGenerate();
    return result;
  } catch (error) {
    // Log the error for debugging without swallowing it
    console.error("Middleware error:", error);
    throw error; // Re-throw so downstream error handling still sees it
  }
};

3. Use Conditional Execution

// Only apply expensive middleware for production
middlewareConfig: {
  expensiveMiddleware: {
    enabled: true,
    conditions: {
      custom: (context) => process.env.NODE_ENV === "production"
    }
  }
}

4. Keep Middleware Focused

Each middleware should have a single responsibility:

  • ✅ Good: Analytics middleware only collects metrics
  • ❌ Bad: Analytics middleware that also filters content and logs errors

5. Test Middleware Independently

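import { createAnalyticsMiddleware } from "@juspay/neurolink";

// Test middleware in isolation
const middleware = createAnalyticsMiddleware();

// Minimal mock of the provider call; a real result carries more fields
const mockDoGenerate = async () => ({ text: "test" });

// wrapGenerate is optional on the middleware type, so call it with optional chaining
const result = await middleware.wrapGenerate?.({
  doGenerate: mockDoGenerate,
  params: { prompt: "test" },
});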

See Also