E2E Tests
February 16, 2026 · View on GitHub
End-to-end tests for the Kepler Operator, validating the full operator lifecycle against a real Kubernetes or OpenShift cluster.
Running Tests
Prerequisites
- A running Kubernetes (Kind) or OpenShift cluster
- The operator deployed via OLM (
tests/run-e2e.shhandles this) kubectlconfigured to access the cluster
Full E2E Suite
The run-e2e.sh script builds the operator, deploys it via OLM, and runs all E2E tests:
# Build, deploy, and run all E2E tests
./tests/run-e2e.sh e2e
# Run on a VM environment (enables fake CPU meter)
./tests/run-e2e.sh e2e --ci -- -running-on-vm
# Run upgrade tests
./tests/run-e2e.sh upgrade
Skip Build/Deploy
When the operator is already deployed and you want to iterate on test code:
./tests/run-e2e.sh e2e --no-deploy
Run Specific Tests
Pass Ginkgo flags after -- to filter specs by name:
# Run only specs matching "Reconciliation"
./tests/run-e2e.sh e2e --no-deploy -- --ginkgo.focus "Reconciliation"
# Run only PowerMonitor specs
./tests/run-e2e.sh e2e --no-deploy -- --ginkgo.focus "PowerMonitor"
# Skip slow RBAC tests
./tests/run-e2e.sh e2e --no-deploy -- --ginkgo.skip "RBAC"
Run Against OpenShift
./tests/run-e2e.sh e2e --ns openshift-operators -- -openshift
Script Options
| Flag | Description |
|---|---|
--ci | CI mode (skips interactive checks) |
--no-deploy | Skip build and deploy; reuse existing operator deployment |
--no-builds | Skip image builds; useful when images are pre-built |
--ns NAMESPACE | Operator namespace (default: operators) |
--image-base URL | Image registry base (default: localhost:5001) |
Test Flags
These flags are passed after --:
| Flag | Default | Description |
|---|---|---|
-openshift | false | Run against an OpenShift cluster |
-running-on-vm | false | Enable VM test mode (fake CPU meter) |
-kepler-image | quay.io/sustainable_computing_io/kepler:v0.11.3 | Kepler image to use |
-deployment-namespace | power-monitor | Namespace for Kepler components |
Test Output
Logs are written to tmp/e2e/:
e2e.log-- full test output*-events.log-- Kubernetes events during the test runoperator-errors.log-- operator error logs (on failure)
Test Architecture
The E2E tests use Ginkgo v2 with Gomega matchers, following Operator SDK testing recommendations. Tests run against a real cluster (Kind in CI, any cluster locally) with the operator deployed via OLM.
Directory Structure
tests/
├── e2e/ # Test specs
│ ├── suite_test.go # Ginkgo suite bootstrap and flags
│ ├── power_monitor_test.go # PowerMonitor CR tests
│ ├── power_monitor_internal_test.go # PowerMonitorInternal tests
│ └── power_monitor_internal_secrets_test.go # Secret mount and lifecycle tests
├── utils/ # Test framework
│ ├── assertions.go # Polling helpers and assertion functions
│ ├── framework.go # Framework: resource CRUD, cleanup, cert management
│ └── power_monitor_internal_builder.go # Builder for PowerMonitorInternal specs
├── testdata/ # Test fixtures
├── run-e2e.sh # Main entry point for running E2E tests
└── helm.sh # Helm-specific E2E tests
Framework
| File | Purpose |
|---|---|
tests/e2e/suite_test.go | Ginkgo suite bootstrap, flag registration, default timeouts |
tests/utils/framework.go | Framework struct: Kubernetes client, resource CRUD, cleanup, cert-manager helpers |
tests/utils/assertions.go | Polling helpers (WaitUntil, ExpectResourceExists, etc.) using Eventually |
tests/utils/power_monitor_internal_builder.go | Builder pattern for constructing PowerMonitorInternal specs |
The Framework wraps a controller-runtime client.Client and provides helpers for:
- Resource lifecycle:
CreatePowerMonitor,DeletePowerMonitor,CreateTestSecret, etc. Each create method registers aDeferCleanupcallback so resources are automatically cleaned up when the spec completes. - Polling assertions:
ExpectResourceExists,ExpectPowerMonitorCondition, etc. use Gomega'sEventuallyto poll the cluster until a condition is met or the timeout expires. - Node operations:
AddResourceLabels,TaintNode-- modify node state with automatic cleanup viaDeferCleanup. - Cert-manager integration:
DeployOpenshiftCerts,CreateSelfSignedClusterIssuer, etc. for TLS certificate tests.
Test Structure
Tests are organized using Ginkgo's Describe/Context/It hierarchy:
Describe("PowerMonitor") # Top-level grouping by CRD
Describe("Reconciliation") # Feature area
It("should reconcile ...") # Specific behavior
Describe("NodeSelector")
It("should deploy only ...")
It("should remain ...")
Conventions
- All top-level
Describeblocks use theSerialdecorator to prevent parallel execution. These tests create cluster-scoped resources (PowerMonitor CRs, namespaces, node labels/taints) that would conflict if run concurrently. - Each
BeforeEachcreates a freshFrameworkinstance. TheFrameworkcreates a new controller-runtime client each time, but this is lightweight since it only establishes a REST config connection. By()annotations mark major logical steps in longer tests. This improves Ginkgo's output when a test fails, showing exactly which step was in progress.DeferCleanuphandles all resource teardown. You do not need to write explicit cleanup inAfterEach.
Test Files
| File | What it tests |
|---|---|
power_monitor_test.go | PowerMonitor CR: reconciliation, deletion, node selectors, taints/tolerations, config maps, RBAC mode, config fields, security context, secret lifecycle |
power_monitor_internal_test.go | PowerMonitorInternal CR: basic reconciliation, RBAC with TLS certificates |
power_monitor_internal_secrets_test.go | Secret mounting: single/multiple secrets, validation failures, lifecycle CRUD, duplicate mount paths, auto-redeploy on secret change |
Writing a New Test
1. Choose the right file
Add your test to the existing file that matches the CRD or feature you're testing. Create a new file only if testing a completely new CRD or cross-cutting concern.
2. Follow the pattern
Describe("YourFeature", func() {
It("should do something specific", func() {
// Pre-condition
f.ExpectNoResourceExists("power-monitor", "", &v1alpha1.PowerMonitor{})
// Action
f.CreateTestPowerMonitor("power-monitor", runningOnVM)
// Assertion (uses Eventually internally)
pm := f.ExpectPowerMonitorCondition("power-monitor",
v1alpha1.Available, v1alpha1.ConditionTrue)
// Verify specific fields
ds := &appsv1.DaemonSet{}
f.ExpectResourceExists(pm.Name, controller.PowerMonitorDeploymentNS, ds)
Expect(ds.Spec.Template.Spec.HostPID).To(BeTrue())
})
})
3. Key guidelines
- No manual cleanup: Use
f.CreateTestPowerMonitor/f.CreateTestSecret/ etc. which registerDeferCleanupautomatically. If you create resources with raw client calls, wrap cleanup inDeferCleanup. - Use Eventually for async state: Never assert on cluster state with a bare
Expect. Usef.ExpectResourceExists,f.ExpectPowerMonitorCondition, or wrap inEventually. - Use custom timeouts when needed: Default timeout is 2 minutes. For operations that may take longer (cert provisioning, RBAC setup), pass
utils.Timeout(5*time.Minute). - Guard node access: Always check
Expect(nodes).NotTo(BeEmpty())before accessingnodes[0]. - Support both Kubernetes and OpenShift: Use
if Cluster == k8s.Kubernetes { ... } else { ... }when behavior differs between platforms (e.g., cert-manager vs OpenShift service certificates).
CI Integration
E2E tests run on every PR via GitHub Actions (.github/workflows/pr-checks.yaml):
| CI Job | Command | What it does |
|---|---|---|
e2e | ./tests/run-e2e.sh e2e --ci -- -running-on-vm | Full E2E suite on Kind with 2 worker nodes |
operator-upgrade | ./tests/run-e2e.sh upgrade | Tests OLM bundle upgrade from previous release |
helm-e2e | .github/helm-e2e action | Helm chart installation and validation |
The run-e2e.sh script:
- Builds the operator and bundle images
- Pushes them to the Kind local registry
- Deploys the operator via
operator-sdk run bundle - Waits for the operator to be ready (including webhook validation)
- Runs
go test ./e2e/...with the configured flags - Collects event logs and operator error logs on failure
Test Artifacts
On failure, CI archives the tmp/e2e/ directory containing:
e2e.log-- full Ginkgo test output*-events.log-- Kubernetes events from operator and PowerMonitor namespacesoperator-errors.log-- filtered operator error logs