NGINX Gateway Fabric Testing

April 30, 2026

Overview

This directory contains the tests for NGINX Gateway Fabric. The tests are divided into two categories:

  1. Conformance Testing. This ensures that NGINX Gateway Fabric conforms to the Gateway API specification.
  2. System Testing. This ensures that NGINX Gateway Fabric works as expected in a real system.

Prerequisites

  • Kubernetes cluster.
  • kind.
  • Docker.
  • Golang.
  • yq.
  • Make.

If running NFR tests:

  • The gcloud CLI
  • A GKE cluster (if master-authorized-networks is enabled, please set ADD_VM_IP_AUTH_NETWORKS=true in your vars.env file)
  • Access to GCP Service Account with Kubernetes admin permissions

All the commands below are executed from the tests directory. You can see all the available commands by running make help.

Common steps for all tests

Step 1 - Create a Kubernetes cluster

Important: Functional tests can only be run on a kind cluster. Conformance tests can be run on kind or OpenShift clusters (see OPENSHIFT_CONFORMANCE.md for OpenShift instructions). NFR and longevity tests can only be run on a GKE cluster.

To create a local kind cluster:

make create-kind-cluster

Note: The default kind cluster deployed is the latest available version. You can specify a different version by defining the kind image to use through the KIND_IMAGE variable, e.g.

make create-kind-cluster KIND_IMAGE=kindest/node:v1.27.3

To create a GKE cluster:

Before running the below make command, copy the scripts/vars.env-example file to scripts/vars.env and populate the required env vars. GKE_SVC_ACCOUNT needs to be the name of a service account that has Kubernetes admin permissions, and GKE_NODES_SERVICE_ACCOUNT needs to be the name of a service account that has Artifact Registry Reader, Kubernetes Engine Node Service Account and Monitoring Viewer permissions.

make create-gke-cluster
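
As a rough sketch of the preparation step above, the following helper reports which variables still have no value in a vars.env file before you create the cluster. The missing_vars function is hypothetical and not part of the repository. It assumes the file uses plain NAME=value lines (optionally prefixed with export), and the usage example checks only the two service account variables named above.

```shell
# Hypothetical helper, not part of the NGF repository.
# missing_vars FILE NAME...
# Prints each NAME that has no "NAME=<something>" line in FILE.
# Assumption: FILE contains NAME=value lines, optionally prefixed with "export".
missing_vars() {
    mv_file=$1
    shift
    for mv_name in "$@"; do
        # A line such as "NAME=" with nothing after the "=" counts as missing.
        if ! grep -Eq "^(export[[:space:]]+)?${mv_name}=.+" "$mv_file"; then
            echo "$mv_name"
        fi
    done
}

# Example, run from the tests directory:
#   cp scripts/vars.env-example scripts/vars.env
#   missing_vars scripts/vars.env GKE_SVC_ACCOUNT GKE_NODES_SERVICE_ACCOUNT
```

If the function prints nothing, every listed variable has a value. It does not validate that the values are correct.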

Note: The GKE cluster is created with master-authorized-networks, meaning only IPs from explicitly allowed CIDR ranges will be able to access the cluster. The script will automatically add your current IP to the authorized list, but if your IP changes, you can add your new local IP to the master-authorized-networks of the cluster by running the following:

make add-local-ip-to-cluster

Note: If you already have a GKE cluster and your public IP has changed, update the firewall rule to include your new client IP. This restores connectivity when you’re unable to reach the VM.

make update-firewall-with-local-ip

Step 2 - Build and Load Images

Loading the images only applies to a kind cluster. If using GKE, see the GKE section below.

make build-images load-images TAG=$(whoami)

Or, to build NGF with NGINX Plus enabled (NGINX Plus cert and key must exist in the root of the repo):

make build-images-with-plus load-images-with-plus TAG=$(whoami)

For the telemetry test, which requires an OTel collector, build an image with the following variables set:

TELEMETRY_ENDPOINT=otel-collector-opentelemetry-collector.collector.svc.cluster.local:4317 TELEMETRY_ENDPOINT_INSECURE=true

GKE

If running tests on GKE, you will need to tag and push your images to a registry that is accessible from GKE. When building images to run on GKE, you'll need to specify GOARCH=amd64 in the build command if your local system doesn't default to that architecture.
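
One way to decide whether the GOARCH override is needed is to inspect the local machine architecture. The sketch below is illustrative only: goarch_arg is a hypothetical helper, and it assumes that uname -m reports x86_64 (or amd64) on systems that already build for amd64.

```shell
# Hypothetical helper, not part of the NGF repository.
# goarch_arg MACHINE
# Prints "GOARCH=amd64" when MACHINE (as reported by `uname -m`) is not amd64,
# and prints nothing when it already is.
goarch_arg() {
    case "$1" in
        x86_64|amd64) ;;               # already amd64: no override needed
        *) echo "GOARCH=amd64" ;;      # e.g. arm64 or aarch64
    esac
}

# Example: add the override to the build command only when required.
#   make build-images TAG=$(whoami) $(goarch_arg "$(uname -m)")
```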

If running the longevity tests manually for a release (not using the pipeline), you should use the -rc images that were pushed as part of the release branch pipeline instead. For example, in your vars.env file:

TAG=release-2.3-rc
PREFIX=ghcr.io/nginx/nginx-gateway-fabric
NGINX_PREFIX=ghcr.io/nginx/nginx-gateway-fabric/nginx
NGINX_PLUS_PREFIX=us-docker.pkg.dev/<GCP-PROJECT-ID>/nginx-gateway-fabric/nginx-plus

Conformance Testing

Step 1 - Install NGINX Gateway Fabric to configured kind cluster

Note: If you want to run the latest conformance tests from the Gateway API main branch, set the following environment variable before deploying NGF:

export GW_API_VERSION=main

Otherwise, the latest stable version will be used by default. Additionally, if you want to run conformance tests with experimental features enabled, set the following environment variable before deploying NGF:

export ENABLE_EXPERIMENTAL=true

If you want to run the Inference conformance tests, set the following environment variable before deploying NGF:

export ENABLE_INFERENCE_EXTENSION=true

Option 1 - Build and install NGINX Gateway Fabric from local to configured kind cluster

make install-ngf-local-build

Or, to install NGF with NGINX Plus enabled (NGINX Plus cert and key must exist in the root of the repo):

make install-ngf-local-build-with-plus

Option 2 - Install NGINX Gateway Fabric from local already built image to configured kind cluster

You can optionally skip the actual build step.

make install-ngf-local-no-build

Or, to install NGF with NGINX Plus enabled:

make install-ngf-local-no-build-with-plus

Step 2 - Build conformance test runner image

Note: If you want to run the latest conformance tests from the Gateway API main branch, run the following make command to update the Go modules to main in both the root and tests modules:

make update-go-modules

You can also point to a specific fork/branch by running:

go mod edit -replace=sigs.k8s.io/gateway-api=<your-fork>@<your-branch>
go mod download
go mod verify
go mod tidy

Otherwise, the latest stable version will be used by default.

make build-test-runner-image

Step 3 - Run Conformance tests

To run Gateway conformance tests:

make run-conformance-tests

To run Inference conformance tests:

make run-inference-conformance-tests

Step 4 - Cleanup the conformance test fixtures and uninstall NGINX Gateway Fabric

make cleanup-conformance-tests
make uninstall-ngf

Step 5 - Revert changes to Go modules

Optional: not required if you aren't running the main Gateway API tests.

make reset-go-modules

Step 6 - Delete kind cluster

make delete-kind-cluster

System Testing

The system tests are meant to be run on a live Kubernetes environment to verify a real system. These are similar to the existing conformance tests, but will verify things such as:

  • NGF-specific functionality
  • Non-Functional requirements (NFR) testing (such as performance, scale, longevity, etc.)

When running locally (functional tests), the tests create a port-forward from your NGF Pod to localhost using a port chosen by the test framework. Traffic is sent over this port. If running on a GCP VM targeting a GKE cluster (NFR/longevity), the tests will create an internal LoadBalancer service which will receive the test traffic.

Important: Functional tests can only be run on a kind cluster. NFR/longevity tests can only be run on a GKE cluster.

Directory structure is as follows:

  • framework: contains utility functions for running the tests
  • results: contains the results files for the NFR tests
  • scripts: contains scripts used to set up the environment and run the tests
  • suite: contains the test files

Logging in tests

To log in the tests, use the GinkgoWriter interface described here: https://onsi.github.io/ginkgo/#logging-output.

Step 1 - Run the tests

Run the functional tests locally

Run the full functional suite. By default this uses 4 parallel processes:

make test TAG=$(whoami)

Or, to run the tests with NGINX Plus enabled:

make test TAG=$(whoami) PLUS_ENABLED=true

GINKGO_PROCS controls how many parallel processes Ginkgo uses. Each process is an independent OS process with its own NGF deployment, namespace, and port range. Specs are distributed across them and run concurrently, reducing overall time. Cluster-wide CRDs are installed once before the parallel processes begin, while each process still performs its own per-process NGF install/setup. Set GINKGO_PROCS to roughly match the number of specs you intend to run to avoid unnecessary per-process installs.
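
The sizing advice above can be sketched as a small calculation: use one process per spec, but never more than the default of 4. The pick_procs helper is hypothetical, and the spec count is something you supply yourself; the test framework does not report it to this function.

```shell
# Hypothetical helper, not part of the NGF repository.
# pick_procs SPEC_COUNT [MAX_PROCS]
# Prints a GINKGO_PROCS value: SPEC_COUNT capped at MAX_PROCS, and at least 1.
pick_procs() {
    pp_specs=$1
    pp_max=${2:-4}   # 4 is the default process count mentioned above
    if [ "$pp_specs" -lt 1 ]; then
        echo 1
    elif [ "$pp_specs" -lt "$pp_max" ]; then
        echo "$pp_specs"
    else
        echo "$pp_max"
    fi
}

# Example: a label that selects two specs.
#   make test TAG=$(whoami) GINKGO_LABEL=my-label GINKGO_PROCS=$(pick_procs 2)
```

Capping at the default keeps large selections from starting more NGF installs than the full suite would.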

For the graceful recovery tests, use GINKGO_PROCS=2. The nginx container and NGF pod restart scenarios run in parallel across two processes, while the node restart scenarios are marked Serial and run exclusively one at a time:

make test TAG=$(whoami) GINKGO_LABEL=graceful-recovery GINKGO_PROCS=2

When running a single test with GINKGO_LABEL, use GINKGO_PROCS=1 to avoid installing NGF on processes that receive no specs:

make test TAG=$(whoami) GINKGO_LABEL=telemetry GINKGO_PROCS=1

Note that the full functional suite does not run the telemetry test by default. The telemetry test requires a dedicated invocation, such as the one above, because it uses a specially built image (see Step 2 - Build and Load Images) and deploys NGF differently from the rest of the functional tests.

Run the NFR tests on a GKE cluster from a GCP VM

Note: if you want to run the longevity tests from the pipeline, skip to the Longevity section.

Before running the below make commands, copy the scripts/vars.env-example file to scripts/vars.env and populate the required env vars. GKE_SVC_ACCOUNT needs to be the name of a service account that has Kubernetes admin permissions.

In order to run the tests in GCP, you need a few things:

  • GKE router to allow egress traffic (used by upgrade tests for pulling images from GitHub, and by scale/reconfig tests for installing Prometheus)
    • This assumes that your GKE cluster is using private nodes. If using public nodes, you don't need this.
  • GCP VM and firewall rule to send ingress traffic to GKE

To just set up the VM with no router (this will not run the tests):

make create-and-setup-vm

To set up just the router:

make create-gke-router

Otherwise, you can set up the VM, router, and run the tests with a single command. See the options below.

By default, the tests run using the version of NGF that was git cloned during the setup. If you want to make incremental changes and copy your local changes to the VM to test, you can run:

make sync-files-to-vm

Note: if just running longevity tests, skip to the Longevity section.

To set up the GCP environment with the router and VM and then run the tests, run the following command:

make setup-gcp-and-run-nfr-tests

To use an existing VM to run the tests, run the following:

make nfr-test

Longevity testing

This test is run on its own due to its long-running nature. It will run for 3 days (as defined in suite/scripts/longevity-wrk.sh) before the tester must collect the results and complete the test.

To run in the pipeline, run the workflow to start the tests. Once the workflow completes, the job ID will be included in the summary. This must be used as input when stopping the longevity tests.

After 3 days (72h) from the time that the startup workflow finished, visit the GCP Monitoring Dashboards page and select the NGF Longevity Test dashboard. Update the cluster_name filter to the names of the longevity clusters. Take PNG screenshots of each chart for the time period in which your test ran, and save those to be added to the results file. Then you can stop the longevity tests. If done too early, the traffic will still be flowing and results may not be collected properly, so be sure to wait the full time period.
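
To avoid stopping the test too early, you can compute how much of the 72-hour window remains. This is only a sketch: seconds_until_stop is a hypothetical helper, and it assumes you recorded the time at which the startup workflow finished as a Unix epoch timestamp.

```shell
# Hypothetical helper, not part of the NGF repository.
LONGEVITY_SECONDS=$((72 * 60 * 60))   # 3 days, as described above

# seconds_until_stop START_EPOCH NOW_EPOCH
# Prints the number of seconds left before the 72h window has elapsed
# (0 once the window is over).
seconds_until_stop() {
    sus_remaining=$(( $1 + LONGEVITY_SECONDS - $2 ))
    if [ "$sus_remaining" -lt 0 ]; then
        sus_remaining=0
    fi
    echo "$sus_remaining"
}

# Example, where START is the epoch time you noted when the startup finished:
#   seconds_until_stop "$START" "$(date +%s)"
```

A result of 0 means the full period has passed and the charts can be captured.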

The final workflow will tear down the test and open a PR with the results. The PNGs you took should be added, and any summaries as well. Combine any results files if necessary. If you don't want to open a PR, you can toggle it off in the input when running the workflow.

For running manually instead of using the pipeline, you can start the tests with

make start-longevity-test

and stop them with

make stop-longevity-test

Note: When running from your machine instead of the pipeline, you can change how long the test runs by updating the wrk commands in suite/scripts/longevity-wrk.sh to the time period you want and then running make sync-files-to-vm.

Note: When running from your machine instead of the pipeline, you must clear the cafe.example.com entry from the /etc/hosts file on your VM before re-running the longevity test.
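
The /etc/hosts cleanup mentioned above could be done by hand in an editor, or with something like the following sketch. The remove_host_entry helper is hypothetical; it removes every line that mentions the given host name, and on the VM it would need to run as root to modify /etc/hosts.

```shell
# Hypothetical helper, not part of the NGF repository.
# remove_host_entry HOSTNAME FILE
# Rewrites FILE without any line that contains HOSTNAME.
remove_host_entry() {
    rhe_host=$1
    rhe_file=$2
    rhe_tmp=$(mktemp)
    # -v keeps non-matching lines, -F treats HOSTNAME as a fixed string,
    # -w avoids matching inside a longer name such as mycafe.example.com.
    # grep exits non-zero when no lines remain, which is not an error here.
    grep -vwF -- "$rhe_host" "$rhe_file" > "$rhe_tmp" || true
    # Write back in place so the file keeps its owner and permissions.
    cat "$rhe_tmp" > "$rhe_file"
    rm -f "$rhe_tmp"
}

# Example, as root on the VM:
#   remove_host_entry cafe.example.com /etc/hosts
```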

Run the WAF tests on a GKE cluster

WAF tests require NGINX Plus with NAP WAF images and run on GKE (amd64 only). Before running:

  1. Ensure you have access to private-registry.nginx.com and a dockerconfig.jwt file at the repo root (used to create the image pull secret).

  2. Run the WAF tests, passing the required variables directly:

    make test-waf-gke TAG=$(whoami) PLUS_USAGE_ENDPOINT=<endpoint> GKE_PROJECT=<project>
    

    This will compile the WAF policy bundles from the JSON sources in suite/manifests/waf-policy/, create the image pull secret in the nginx-gateway namespace so the WAF sidecar images (waf-enforcer, waf-config-mgr) can be pulled from private-registry.nginx.com, and run the tests labelled waf against the GKE cluster.
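
Before starting a WAF run, it can help to confirm that the pull-secret source file from step 1 is in place. The check below is a minimal sketch: waf_preflight is a hypothetical helper, and it verifies only that dockerconfig.jwt exists at the path you give it, not that the credentials are valid.

```shell
# Hypothetical helper, not part of the NGF repository.
# waf_preflight REPO_ROOT
# Returns 0 if REPO_ROOT/dockerconfig.jwt exists, otherwise prints a message
# to stderr and returns 1.
waf_preflight() {
    wp_root=$1
    if [ ! -f "$wp_root/dockerconfig.jwt" ]; then
        echo "missing $wp_root/dockerconfig.jwt" >&2
        return 1
    fi
    return 0
}

# Example, run from the tests directory (the repo root is one level up):
#   waf_preflight .. && echo "dockerconfig.jwt found"
```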

Common test amendments

To run all tests with the label "my-label", use the GINKGO_LABEL variable:

make test TAG=$(whoami) GINKGO_LABEL=my-label

Or, to pass a specific flag (for example, to run a specific test), use the GINKGO_FLAGS variable:

make test TAG=$(whoami) GINKGO_FLAGS='-ginkgo.focus "writes the system info to a results file"'

Note: if filtering on NFR tests, set the filter in the appropriate field in your vars.env file.

If you are running the tests in GCP, add your required label or flags to scripts/vars.env.

You can also modify the tests code for a similar outcome. To run a specific test, you can "focus" it by adding the F prefix to the name. For example:

It("runs some test", func(){
    ...
})

becomes:

FIt("runs some test", func(){
    ...
})

This can also be done at higher levels like Context.

To disable a specific test, add the X prefix to it, similar to the previous example:

It("runs some test", func(){
    ...
})

becomes:

XIt("runs some test", func(){
    ...
})

For more information on filtering specs, see the Ginkgo documentation.

Step 2 - Cleanup

  1. Delete kind cluster, if required

    make delete-kind-cluster
    
  2. Delete the GCP components (GKE cluster, GKE router, VM, and firewall rule), if required

    make cleanup-gcp
    

    or

    make cleanup-router
    
    make cleanup-vm
    
    make delete-gke-cluster