Graph Router (GNN-Based Router)

December 29, 2025

Overview

The Graph Router uses Graph Neural Networks (GNNs) to make routing decisions by modeling queries and LLMs as nodes in a heterogeneous graph. It learns routing patterns by propagating information through the graph structure, capturing complex relationships between queries, LLMs, and their performance characteristics.

Paper Reference

This router implements the GraphRouter approach:

  • GraphRouter: A Graph-based Router for LLM Selections

    • (2024). arXiv:2410.03834.
    • Constructs a heterogeneous graph with task, query, and LLM nodes for routing.
  • GNN Foundations: Kipf, T. N., & Welling, M. (2017). "Semi-supervised classification with graph convolutional networks." ICLR.

  • Application: Treats LLM routing as link prediction in a bipartite query-model graph.

How It Works

Graph Structure

Query Nodes ─── edges (performance) ──→ LLM Nodes
                       │
                       ↓
              GNN Message Passing
                       │
                       ↓
                  Predictions

Nodes and Edges:

  • Query Nodes: Each query is a node whose features are its Longformer embedding
  • LLM Nodes: Each LLM is a node with a learned or provided embedding
  • Edges: Connect each query to every LLM, weighted by performance score

Routing Mechanism

  1. Graph Construction:

    • Create bipartite graph: queries on one side, LLMs on the other
    • Add edges from each query to all LLMs
    • Edge features: observed performance scores (0 for new queries, whose performance is not yet known)
  2. GNN Forward Pass:

    • Aggregate information from neighboring nodes
    • Update node representations using message passing
    • Apply graph attention or convolution layers
  3. Prediction:

    • For each query-LLM edge, predict suitability score
    • Select LLM with highest predicted score
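
The three steps above can be sketched with NumPy. This is a minimal illustration under stated assumptions, not the GraphRouter implementation: the weights are random rather than trained, edge-weighted mean aggregation stands in for the attention or convolution layers, a dot product stands in for the edge predictor, and the names `message_passing_round` and `score_edges` are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(42)
n_queries, n_llms, feat_dim, hidden_dim = 3, 4, 8, 16

# Node features: random stand-ins for query and LLM embeddings.
query_x = rng.normal(size=(n_queries, feat_dim))
llm_x = rng.normal(size=(n_llms, feat_dim))

# Step 1: full bipartite graph. edge_perf[i, j] is the observed performance
# of LLM j on query i. The last query is "new", so its edge features are 0.
edge_perf = rng.uniform(size=(n_queries, n_llms))
edge_perf[-1, :] = 0.0

# Untrained projection weights (a real router would learn these).
w_query = rng.normal(size=(feat_dim, hidden_dim)) * 0.1
w_llm = rng.normal(size=(feat_dim, hidden_dim)) * 0.1


def message_passing_round(h_query, h_llm, edge_perf):
    """Step 2: one synchronous round of aggregation in both directions."""
    # LLM nodes take a performance-weighted mean of their query neighbors.
    weight_sum = edge_perf.sum(axis=0, keepdims=True).T + 1e-8  # (n_llms, 1)
    llm_msg = edge_perf.T @ h_query / weight_sum                # (n_llms, hidden)
    # Query nodes take a plain mean over all LLM nodes, so new queries with
    # all-zero edge features still receive information.
    query_msg = h_llm.mean(axis=0)                              # (hidden,)
    return np.tanh(h_query + query_msg), np.tanh(h_llm + llm_msg)


def score_edges(h_query, h_llm):
    """Step 3: suitability score for every query-LLM pair (dot product)."""
    return h_query @ h_llm.T


h_query, h_llm = query_x @ w_query, llm_x @ w_llm
h_query, h_llm = message_passing_round(h_query, h_llm, edge_perf)

scores = score_edges(h_query, h_llm)   # shape (n_queries, n_llms)
selected = scores.argmax(axis=1)       # index of the best-scoring LLM per query
print(selected)
```

Because the weights are untrained, the selected indices carry no meaning here; the point is the data flow from edge features through aggregation to a per-query argmax.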

Training Strategy

Training uses edge masking:

  • Mask a portion of the edges (e.g., 30%)
  • Train the GNN to predict performance on the masked edges
  • Evaluate on the validation set using a different set of masked edges
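
The masking step can be illustrated as follows. This sketch assumes edges are masked uniformly at random across the whole query-LLM matrix; the router's actual sampling scheme may differ, and `split_masked_edges` is a hypothetical helper name.

```python
import numpy as np


def split_masked_edges(n_queries, n_llms, mask_rate=0.3, seed=42):
    """Return (visible, masked) boolean matrices over all query-LLM edges."""
    rng = np.random.default_rng(seed)
    n_edges = n_queries * n_llms
    n_masked = int(round(mask_rate * n_edges))
    masked_flat = np.zeros(n_edges, dtype=bool)
    # Sample distinct edge indices to hide from the model.
    masked_flat[rng.choice(n_edges, size=n_masked, replace=False)] = True
    masked = masked_flat.reshape(n_queries, n_llms)
    return ~masked, masked


# Toy performance scores for 10 queries and 4 LLMs.
perf = np.random.default_rng(0).uniform(size=(10, 4))
visible, masked = split_masked_edges(*perf.shape, mask_rate=0.3)

# The GNN sees performance only on visible edges...
edge_features = np.where(visible, perf, 0.0)
# ...and the training loss is computed only on the masked ones.
targets = perf[masked]
print(int(masked.sum()), "of", perf.size, "edges masked")
```

Calling the helper again with a different `seed` gives a different mask, which is one way to obtain the separate masked set used for validation.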

Configuration Parameters

Training Hyperparameters (hparam in config)

| Parameter | Type | Default | Description |
|---|---|---|---|
| hidden_dim | int | 64 | Hidden layer dimension for GNN. Controls model capacity. Range: 32-256. |
| learning_rate | float | 0.001 | Learning rate for AdamW optimizer. Range: 0.0001-0.01. |
| weight_decay | float | 0.0001 | L2 regularization weight decay. Prevents overfitting. |
| train_epoch | int | 100 | Number of training epochs. Increase for larger graphs. |
| batch_size | int | 4 | Number of masked samples per gradient step. |
| train_mask_rate | float | 0.3 | Fraction of edges to mask during training (0.0-1.0). |
| val_split_ratio | float | 0.2 | Ratio of training data used for validation. |
| random_state | int | 42 | Random seed for reproducibility. |
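
As a rough sketch of how these values might be checked before training, the hypothetical dataclass below mirrors the defaults listed above and rejects clearly invalid settings. `GraphRouterHParams` is not part of the library; the library's own config loader may validate differently or not at all.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GraphRouterHParams:
    """Hypothetical container mirroring the documented hparam defaults."""

    hidden_dim: int = 64
    learning_rate: float = 0.001
    weight_decay: float = 0.0001
    train_epoch: int = 100
    batch_size: int = 4
    train_mask_rate: float = 0.3
    val_split_ratio: float = 0.2
    random_state: int = 42

    def __post_init__(self):
        # Positive-valued settings.
        for name in ("hidden_dim", "train_epoch", "batch_size"):
            if getattr(self, name) <= 0:
                raise ValueError(f"{name} must be positive")
        if self.learning_rate <= 0:
            raise ValueError("learning_rate must be positive")
        if self.weight_decay < 0:
            raise ValueError("weight_decay must be non-negative")
        # Fractions documented as lying in [0, 1].
        if not 0.0 <= self.train_mask_rate <= 1.0:
            raise ValueError("train_mask_rate must be in [0.0, 1.0]")
        if not 0.0 < self.val_split_ratio < 1.0:
            raise ValueError("val_split_ratio must be in (0.0, 1.0)")


# Override a subset of values, e.g. from the `hparam` section of a config.
hparams = GraphRouterHParams(**{"hidden_dim": 128, "train_mask_rate": 0.5})
print(hparams)
```

The documented ranges for `hidden_dim` and `learning_rate` read as recommendations, so the sketch only enforces hard constraints such as positivity.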

Data Paths

| Parameter | Description |
|---|---|
| routing_data_train | Training query-LLM performance data (JSONL) |
| query_embedding_data | Pre-computed Longformer query embeddings (PyTorch tensor) |
| llm_data | LLM information with optional embeddings (JSON) |

Model Paths

| Parameter | Purpose |
|---|---|
| save_model_path | Where to save trained GNN model |
| load_model_path | Model to load for inference |

CLI Usage

The Graph Router can be used via the llmrouter command-line interface:

Training

# Train the Graph router (GPU recommended)
llmrouter train --router graphrouter --config configs/model_config_train/graphrouter.yaml --device cuda

# Train with quiet mode
llmrouter train --router graphrouter --config configs/model_config_train/graphrouter.yaml --device cuda --quiet

Inference

# Route a single query
llmrouter infer --router graphrouter --config configs/model_config_test/graphrouter.yaml \
    --query "Explain quantum mechanics"

# Route queries from a file
llmrouter infer --router graphrouter --config configs/model_config_test/graphrouter.yaml \
    --input queries.jsonl --output results.json

# Route only (without calling LLM API)
llmrouter infer --router graphrouter --config configs/model_config_test/graphrouter.yaml \
    --query "What is machine learning?" --route-only

Interactive Chat

# Launch chat interface
llmrouter chat --router graphrouter --config configs/model_config_test/graphrouter.yaml

# Launch with custom port
llmrouter chat --router graphrouter --config configs/model_config_test/graphrouter.yaml --port 8080

# Create a public shareable link
llmrouter chat --router graphrouter --config configs/model_config_test/graphrouter.yaml --share

Usage Examples

Training

from llmrouter.models import GraphRouter, GraphRouterTrainer

router = GraphRouter(yaml_path="configs/model_config_train/graphrouter.yaml")
trainer = GraphRouterTrainer(router=router, device="cuda")
trainer.train()

Inference

from llmrouter.models import GraphRouter

router = GraphRouter(yaml_path="configs/model_config_test/graphrouter.yaml")
query = {"query": "Explain quantum mechanics"}
result = router.route_single(query)
print(f"Selected: {result['model_name']}")
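
To route several queries from Python, one option is to loop over `route_single`. The helper below is a sketch: `route_many` and `StubRouter` are hypothetical names, and the code relies only on `route_single` accepting a `{"query": ...}` dict and returning a dict with a `model_name` key, as in the inference example. Whether the library provides its own batch method is not covered here.

```python
from collections import Counter


def route_many(router, texts):
    """Route each text with router.route_single and tally the selections."""
    selections = []
    for text in texts:
        result = router.route_single({"query": text})
        selections.append((text, result["model_name"]))
    counts = Counter(name for _, name in selections)
    return selections, counts


class StubRouter:
    """Stand-in for a loaded GraphRouter so the sketch runs without a model."""

    def route_single(self, query):
        # Arbitrary length-based rule, for illustration only.
        name = "model-a" if len(query["query"]) < 30 else "model-b"
        return {"model_name": name}


selections, counts = route_many(
    StubRouter(),
    ["What is ML?", "Explain quantum mechanics in detail please"],
)
for text, name in selections:
    print(f"{name}: {text}")
```

Replacing `StubRouter()` with a `GraphRouter` loaded from the test config should work as long as `route_single` behaves as shown above.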

YAML Configuration Example

data_path:
  routing_data_train: 'data/example_data/routing_data/default_routing_train_data.jsonl'
  query_embedding_data: 'data/example_data/routing_data/query_embeddings_longformer.pt'
  llm_data: 'data/example_data/llm_candidates/default_llm.json'

model_path:
  save_model_path: 'saved_models/graphrouter/graphrouter.pt'

hparam:
  hidden_dim: 64
  learning_rate: 0.001
  weight_decay: 0.0001
  train_epoch: 100
  batch_size: 4
  train_mask_rate: 0.3
  val_split_ratio: 0.2

metric:
  weights:
    performance: 1

Advantages

  • Relational Learning: Captures complex query-model relationships
  • Graph Structure: Leverages network effects and transitivity
  • Flexible: Can incorporate additional node/edge features
  • Semi-Supervised: Can predict on partially observed data

Limitations

  • Computational Cost: GNN training is slower than training simpler methods
  • Graph Construction: Requires building the full bipartite graph
  • Cold Start: New queries or models require the graph to be reconstructed
  • Hyperparameter Sensitivity: Many architectural choices to tune (e.g., hidden dimension, mask rate)

When to Use Graph Router

Good Use Cases:

  • Large datasets with rich relational structure
  • Query-model relationships that exhibit network effects
  • LLM embeddings or features are available beyond performance scores
  • Higher-order interactions need to be modeled

Alternatives:

  • Simple relationships → Use MLP/SVM Router
  • Small datasets → Use KNN Router
  • Need fast training → Use ELO Router

Related Routers:

  • RouterDC: Also uses structured learning but with contrastive loss
  • MF Router: Learns latent spaces but without graph structure
  • MLP Router: Standard neural network, no graph

For questions or issues, please refer to the main LLMRouter documentation or open an issue on GitHub.