NetArena: Dynamic Benchmarks for AI Agents in Network Automation
February 28, 2026
Overview
NetArena is a dynamic benchmark generation framework for evaluating LLM agents in real-world network applications. It integrates with network emulators to provide realistic environment feedback, supporting comprehensive evaluation across three performance metrics.
Paper
The research behind NetArena is detailed in our paper:
NetArena: Dynamic Benchmarks for AI Agents in Network Automation. ICLR 2026. [paper]
@inproceedings{
zhou2026netarena,
title={NetArena: Dynamic Benchmarks for AI Agents in Network Automation},
author={Yajie Zhou and Jiajun Ruan and Eric S. Wang and Sadjad Fouladi and Francis Y. Yan and Kevin Hsieh and Zaoxing Liu},
booktitle={The Fourteenth International Conference on Learning Representations},
year={2026},
url={https://openreview.net/forum?id=BPVPOtzoOz}
}
🐳 Assessing with Docker
Each benchmark app ships as an A2A agent following the Agentbeats green agent format.
How It Works
- Benchmark apps receive evaluation requests via the A2A protocol
- Requests describe the A2A agents involved (endpoints, API keys, etc.) and benchmark configuration (query difficulty, number of queries, etc.)
- This initiates a round of evaluation against your agent
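To make the list above concrete, here is a minimal sketch of how such a request might be assembled. Every field name in it (`participants`, `purple_agent`, `config`, `query_difficulty`, `num_queries`) is a hypothetical placeholder chosen for illustration; the actual request schema is defined by the benchmark apps and is not documented here.

```python
import json


def build_eval_request(purple_endpoint, api_key, difficulty, num_queries):
    """Assemble an illustrative evaluation request as a plain dict.

    All field names are hypothetical; consult the benchmark app for the
    schema it actually expects.
    """
    if num_queries < 1:
        raise ValueError("num_queries must be at least 1")
    return {
        # The agents taking part in the evaluation round.
        "participants": {
            "purple_agent": {"endpoint": purple_endpoint, "api_key": api_key},
        },
        # Benchmark configuration for this round.
        "config": {"query_difficulty": difficulty, "num_queries": num_queries},
    }


request = build_eval_request("http://127.0.0.1:8000", "<YOUR_API_KEY>", "easy", 10)
print(json.dumps(request, indent=2))
```

The sketch only shows the two kinds of information a request carries: who is being evaluated and how the benchmark is configured.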
🟣 Build Your Own Purple Agent
Want to build your own agent? Check out our example prompt-based agents for reference:
| App | Prompt Agent |
|---|---|
| MALT | app-malt/old_code/prompt_agent.py |
| Route | app-route/old_code/prompt_agent.py |
| K8s | app-k8s/old_code/prompt_agent.py |
Submit Your Results
Ready to see how your agent stacks up? Visit the NetArena Leaderboard for submission instructions and to compare against other agents.
The sections below explain how to build and start the container for each application.
Prerequisites
- Linux-based system (tested on Ubuntu 20.04)
- At least 4 CPU cores
- Docker installed and running
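The prerequisites above can be checked before building anything. The following is a rough sketch, not part of NetArena; the function name `check_prerequisites` is an assumption, and the script relies on `docker info` exiting with a non-zero status when the daemon is unreachable.

```python
import os
import platform
import shutil
import subprocess

MIN_CPU_CORES = 4  # taken from the prerequisites list


def check_prerequisites(min_cores=MIN_CPU_CORES):
    """Return a list of human-readable problems; an empty list means all checks passed."""
    problems = []
    if platform.system() != "Linux":
        problems.append(f"expected a Linux system, found {platform.system()}")
    cores = os.cpu_count() or 0
    if cores < min_cores:
        problems.append(f"found {cores} CPU cores, need at least {min_cores}")
    docker = shutil.which("docker")
    if docker is None:
        problems.append("docker is not on PATH")
    else:
        try:
            # `docker info` fails when the daemon is not running or not reachable.
            result = subprocess.run([docker, "info"], capture_output=True, timeout=30)
            if result.returncode != 0:
                problems.append("docker is installed but the daemon is not responding")
        except (OSError, subprocess.TimeoutExpired):
            problems.append("could not run `docker info`")
    return problems


if __name__ == "__main__":
    for problem in check_prerequisites():
        print("WARNING:", problem)
```

The script prints nothing when every check passes.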
Quick Start
1. Build the Container
From the NetArena root directory, build the container for your target app:
cd ~/NetArena
# For MALT app
docker build -t malt_agent:latest -f ./app-malt/green_agent/Dockerfile .
# For K8s app
docker build -t k8s_agent:latest -f ./app-k8s/green_agent/Dockerfile .
# For Route app
docker build -t route_agent:latest -f ./app-route/green_agent/Dockerfile .
(Optional) For testing purposes, you may also build the baseline purple agent we use in this demo as a Docker container:
cd ~/NetArena
# Test purple agent.
docker build -t litellm_agent:latest -f ./a2a_llm/Dockerfile .
2. Start the Container
See How to Start the Container for Each App below for app-specific commands.
If you also want to run the baseline purple agent, start it the same way as the green agents, and remember to pass in the relevant environment variables through an env file:
docker run -itd --network=host --env-file "./env.list" --name purple_agent litellm_agent:latest --host "0.0.0.0" --port 8000
Example ./env.list file:
AZURE_API_KEY="<YOUR_API_KEY>"
AZURE_API_BASE="<YOUR_API_ENDPOINT>"
AZURE_API_VERSION="<YOUR_API_VERSION>"
MODEL_NAME="azure/XXX"
For details on how to configure environment variables for a given model provider, see the LiteLLM documentation.
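Before starting the purple agent, it can help to confirm that the env file has no leftover placeholders. The sketch below is an illustration only: the helper names `parse_env_file` and `find_unset` are assumptions, and the required keys are simply the ones from the Azure example above, so other providers will need a different list.

```python
REQUIRED_KEYS = ("AZURE_API_KEY", "AZURE_API_BASE", "AZURE_API_VERSION", "MODEL_NAME")


def parse_env_file(text):
    """Parse KEY=VALUE lines, skipping blank lines and # comments."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Surrounding double quotes are dropped for this check only.
        values[key.strip()] = value.strip().strip('"')
    return values


def find_unset(values, required=REQUIRED_KEYS):
    """Return the required keys that are missing, empty, or still a <PLACEHOLDER>."""
    unset = []
    for key in required:
        value = values.get(key, "")
        if not value or (value.startswith("<") and value.endswith(">")):
            unset.append(key)
    return unset


sample = '''
AZURE_API_KEY="<YOUR_API_KEY>"
AZURE_API_BASE="<YOUR_API_ENDPOINT>"
AZURE_API_VERSION="<YOUR_API_VERSION>"
MODEL_NAME="azure/XXX"
'''
print(find_unset(parse_env_file(sample)))
# prints ['AZURE_API_KEY', 'AZURE_API_BASE', 'AZURE_API_VERSION']
```

To check a real file, read it with `open("./env.list").read()` and pass the text to `parse_env_file`.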
How to Start the Container for Each App
⚠️ Important: When switching between apps, always remove the existing green_agent container first with docker rm -f green_agent. All apps share the same container name.
MALT App
Standard setup - no additional flags required:
# Remove any existing green_agent container first
docker rm -f green_agent
docker run -itd --network=host --name green_agent malt_agent:latest --host "0.0.0.0" --port 9999
Route App
Requires the --privileged flag for Mininet to function:
# Remove any existing green_agent container first
docker rm -f green_agent
docker run -itd --network=host --privileged --name green_agent route_agent:latest --host "0.0.0.0" --port 9999
Sometimes, you may also have to include the following flag to mount kernel modules for Open vSwitch:
-v /lib/modules:/lib/modules
K8s App
Requires access to a Kubernetes cluster.
docker rm -f green_agent
docker run -itd --network=host \
-v <KUBECONFIG_PATH>:/root/.kube/config \
-e KUBECONFIG=/root/.kube/config \
-v <NETARENA_ROOT>/app-k8s/microservices-demo:/data/microservices-demo \
--name green_agent k8s_agent:latest --host "0.0.0.0" --port 9999
Replace:
- <KUBECONFIG_PATH>: Your kubeconfig file (default: ~/.kube/config, or app-k8s/config if included)
- <NETARENA_ROOT>: Absolute path to NetArena repo (e.g., /home/user/NetArena)
No remote cluster? Follow the Kubernetes in Docker (KinD) installation instructions to create a local cluster.
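The per-app differences above come down to a few flags, which the sketch below collects in one place. The helper `green_agent_command` is hypothetical and not part of the repository; the image names, flags, and mount paths are copied from the commands in this section, and the optional /lib/modules mount for Route is left out.

```python
import shlex

IMAGES = {
    "malt": "malt_agent:latest",
    "route": "route_agent:latest",
    "k8s": "k8s_agent:latest",
}


def green_agent_command(app, kubeconfig=None, netarena_root=None, port=9999):
    """Build the `docker run` argument list for one app (illustrative helper)."""
    if app not in IMAGES:
        raise ValueError(f"unknown app {app!r}; expected one of {sorted(IMAGES)}")
    cmd = ["docker", "run", "-itd", "--network=host"]
    if app == "route":
        # Mininet needs a privileged container.
        cmd.append("--privileged")
    if app == "k8s":
        if not kubeconfig or not netarena_root:
            raise ValueError("the k8s app needs kubeconfig and netarena_root")
        cmd += [
            "-v", f"{kubeconfig}:/root/.kube/config",
            "-e", "KUBECONFIG=/root/.kube/config",
            "-v", f"{netarena_root}/app-k8s/microservices-demo:/data/microservices-demo",
        ]
    cmd += ["--name", "green_agent", IMAGES[app], "--host", "0.0.0.0", "--port", str(port)]
    return cmd


print(shlex.join(green_agent_command("route")))
```

Returning a list rather than a string means the result can be passed straight to `subprocess.run` without shell quoting concerns. Remember to run docker rm -f green_agent first, since all apps share the container name.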
Troubleshooting
K8s connection refused
If kubectl commands fail with "connection refused":
- Verify your kubeconfig is mounted correctly
- Check the remote cluster is reachable: nc -zv <CLUSTER-IP> 6443
- Ensure the KUBECONFIG env var is set in the container
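If nc is not available, the same reachability check can be approximated with a plain TCP connection attempt. This is a minimal sketch; the function name `port_is_open` is an assumption, and a successful connection shows only that the port accepts TCP connections, not that the API server is healthy.

```python
import socket


def port_is_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False


if __name__ == "__main__":
    # Replace 127.0.0.1 with your cluster IP; 6443 is the port used in the check above.
    print(port_is_open("127.0.0.1", 6443))
```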
Agent not responding
Check if the container is healthy:
docker logs --tail 50 green_agent
curl http://127.0.0.1:9999/.well-known/agent-card.json
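The curl check above can also be scripted, for example to wait for the agent in a setup script. The sketch below uses only the standard library; `fetch_agent_card` is a hypothetical helper name, and the only assumption about the response is that it is a JSON document served at the path shown in the curl command.

```python
import json
import urllib.request

AGENT_CARD_PATH = "/.well-known/agent-card.json"


def fetch_agent_card(base_url, timeout=5.0):
    """Return the parsed agent card, or None if the agent is unreachable or replies with non-JSON."""
    url = base_url.rstrip("/") + AGENT_CARD_PATH
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return json.load(response)
    except (OSError, ValueError):
        # OSError covers URLError, HTTPError, and timeouts; ValueError covers invalid JSON.
        return None


if __name__ == "__main__":
    card = fetch_agent_card("http://127.0.0.1:9999")
    print("agent is up" if card is not None else "agent did not respond")
```

If the card cannot be fetched, the container logs from the docker logs command are the next place to look.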