CI/CD
June 25, 2026
Overview
MockServer uses two CI/CD systems:
graph LR
subgraph "Buildkite"
BK["Primary CI
Build & Test"]
BK_MAVEN["Docker Push
Maven CI Image"]
BK_CLEANUP["PR Cleanup
Cancel & Delete"]
end
subgraph "GitHub Actions"
GA_CODEQL["CodeQL Analysis
Security scanning"]
end
subgraph "GitHub Built-in"
GH_DEP["Dependency Graph
Auto-indexed"]
end
BK -->|runs on| EC2[AWS EC2 Agents]
BK_MAVEN -->|pushes to| DH[Docker Hub]
BK_CLEANUP -->|triggered by| GH_WH[GitHub Webhook]
GA_CODEQL -->|reports to| GH_SEC[GitHub Security]
CI Security Model
All custom CI pipelines run on Buildkite with self-managed EC2 agents. This keeps secrets (API tokens, Docker Hub credentials, AWS credentials) within the Buildkite/AWS boundary and avoids exposing them as GitHub Actions secrets.
Principle: Use Buildkite for any pipeline that needs secrets or performs actions. Use GitHub Actions only for read-only analysis that requires no secrets (e.g., CodeQL). Use Buildkite Pipeline Triggers to react to GitHub events without giving GitHub access to CI credentials.
| Concern | Approach |
|---|---|
| Build & test | Buildkite (EC2 agents, secrets in AWS Secrets Manager) |
| Docker push | Buildkite (Docker Hub credentials in AWS Secrets Manager) |
| GitHub event reactions | Buildkite Pipeline Triggers (GitHub webhook → Buildkite, no secrets in GitHub) |
| Security scanning | GitHub Actions CodeQL (read-only, no secrets needed) |
| Dependency graph | GitHub built-in (auto-indexed from manifests, no workflow needed) |
Buildkite Pipelines
The monorepo uses a path-based pipeline orchestrator that dynamically triggers separate child pipelines based on changed files. Each child pipeline appears individually in the Buildkite dashboard, giving per-project visibility. Four agent queues are used:
| Queue | Instance Types | Purpose |
|---|---|---|
| default | c5.2xlarge, c5a.2xlarge, m5.2xlarge | Build and test workloads (Maven, Docker, k3d) |
| trigger | t3.small, t3a.small, t3.micro | Trigger polling jobs (sleep + curl loops) |
| release | Same as default | Release pipeline steps that access release secrets |
| perf | c5.4xlarge | Daily performance-regression benchmarks (k6 + JMH); scale-to-zero, max 1, 100% on-demand |
Trigger jobs (which poll child builds via the Buildkite API) run on cheap trigger queue instances to avoid starving build agents. See Agent Starvation for background.
Pipeline Orchestrator
File: .buildkite/scripts/generate-pipeline.sh
The orchestrator runs as the first step of every build (via the main "MockServer" pipeline). It determines which files changed since the last successful build and emits command steps that call trigger-pipeline.sh to create child builds via the Buildkite API. For PRs, it diffs against the merge-base. For pushes to master, it queries the Buildkite API for the last successful build's commit SHA and diffs against it — this ensures that batch pushes with multiple commits correctly trigger all affected pipelines. If the base commit cannot be determined (API failure, first build, shallow clone), the orchestrator conservatively triggers all pipelines.
flowchart TD
PUSH[Push / PR] --> ORCHESTRATOR["MockServer pipeline
generate-pipeline.sh"]
ORCHESTRATOR --> DIFF["Compute changed files
git diff against base"]
DIFF --> MATCH{"Match changed paths
against rules"}
MATCH -->|mockserver/ excluding maven-plugin, or mockserver-ui/| JAVA["trigger: mockserver-java"]
MATCH -->|mockserver-ui/| UI["trigger: mockserver-ui"]
MATCH -->|mockserver-node/ or mockserver-client-node/| NODE["trigger: mockserver-node"]
MATCH -->|mockserver-client-python/| PYTHON["trigger: mockserver-python"]
MATCH -->|mockserver-client-ruby/| RUBY["trigger: mockserver-ruby"]
MATCH -->|mockserver/mockserver-maven-plugin/| MAVEN_PLUGIN["trigger: mockserver-maven-plugin"]
MATCH -->|mockserver-performance-test/| PERF["trigger: mockserver-performance-test"]
MATCH -->|container_integration_tests/| CONTAINER["trigger: mockserver-container-tests"]
MATCH -->|jekyll-www.mock-server.com/| WEBSITE["trigger: mockserver-website"]
MATCH -->|docker_build/maven/| BUILD_IMG["trigger: mockserver-build-image"]
MATCH -->|".buildkite/ .github/ terraform/ scripts/ examples/ OpenAPI spec etc."| INFRA["trigger: mockserver-infra"]
MATCH -->|no match| DEFAULT["inline: no-op step"]
The mockserver-infra pipeline (pipeline-infra.yml) runs lightweight validation
steps in Docker: opencode config lint, shell-script lint, Dockerfile sync, Helm
chart validation, and API-collection validation. The collection step
(collections-validate.sh) regenerates the Postman and Bruno collections from the
OpenAPI spec and fails if the committed examples/postman/** or examples/bruno/**
have drifted — so examples/ and jekyll-www.mock-server.com/mockserver-openapi.yaml
also route to this pipeline.
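The path rules in the flowchart above can be sketched as a small routing function. This is only an illustration under stated assumptions: the function name pipelines_for_changes is hypothetical, only a subset of the rules is shown, and the real logic in generate-pipeline.sh may be structured quite differently.

```bash
# Hypothetical sketch of path-based routing (not the real generate-pipeline.sh).
# Prints the child pipelines that a list of changed paths would trigger.
# Usage: pipelines_for_changes path [path ...]
pipelines_for_changes() {
  local path
  for path in "$@"; do
    case "$path" in
      # The maven-plugin rule must come before the broader mockserver/ rule.
      mockserver/mockserver-maven-plugin/*) echo "mockserver-maven-plugin" ;;
      mockserver/*)                         echo "mockserver-java" ;;
      # UI changes trigger both the Java and the UI pipelines.
      mockserver-ui/*)                      printf '%s\n' "mockserver-java" "mockserver-ui" ;;
      mockserver-node/*|mockserver-client-node/*) echo "mockserver-node" ;;
      mockserver-client-python/*)           echo "mockserver-python" ;;
      mockserver-client-ruby/*)             echo "mockserver-ruby" ;;
      docker_build/maven/*)                 echo "mockserver-build-image" ;;
      .buildkite/*|.github/*|terraform/*|scripts/*|examples/*) echo "mockserver-infra" ;;
      # No match: print nothing (the orchestrator emits a no-op step instead).
    esac
  done | LC_ALL=C sort -u   # de-duplicate so each pipeline is triggered once
}
```

For example, passing one path under mockserver/ and one under mockserver-ui/ prints mockserver-java and mockserver-ui once each, matching the "single commit can trigger multiple child pipelines" behaviour.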
Buildkite Pipelines
All pipelines are managed via Terraform in terraform/buildkite-pipelines/pipelines.tf. Only the main orchestrator pipeline triggers from GitHub webhooks; all child pipelines have trigger_mode = "none" and are triggered by the orchestrator.
| Pipeline (Buildkite slug) | Pipeline File | Trigger | What It Builds |
|---|---|---|---|
| mockserver | pipeline.yml | GitHub push/PR | Orchestrator: triggers child pipelines |
| mockserver-java | pipeline-java.yml | Orchestrator | Full Maven build and test |
| mockserver-ui | pipeline-ui.yml | Orchestrator | UI lint, typecheck, test, build |
| mockserver-node | pipeline-node.yml | Orchestrator | Node.js lint and typecheck |
| mockserver-python | pipeline-python.yml | Orchestrator | Python unit + integration tests (builds MockServer image from HEAD) |
| mockserver-ruby | pipeline-ruby.yml | Orchestrator | Ruby unit + integration tests (builds MockServer image from HEAD) |
| mockserver-maven-plugin | pipeline-maven-plugin.yml | Orchestrator | Maven plugin build and test |
| mockserver-performance-test | pipeline-perf-test.yml | Orchestrator | Perf test script validation |
| mockserver-container-tests | pipeline-container-tests.yml | Orchestrator | Shell script validation |
| mockserver-website | pipeline-website.yml | Orchestrator | Jekyll site build |
| mockserver-infra | pipeline-infra.yml | Orchestrator | Infrastructure validation |
| mockserver-build-image | docker-push-maven.yml | Orchestrator + Manual | Build/push maven CI image |
| mockserver-release | release-pipeline.yml | Manual | Automated release pipeline (TOTP, Maven Central, maven-plugin, Docker Hub + ECR Public, npm, Helm, Javadoc, SwaggerHub, website, JSON Schema, PyPI, RubyGems, GitHub Release, optional versioned site) |
| mockserver-cleanup | pipeline-cleanup.yml | GitHub webhook + scheduled | Clean up builds for closed PRs |
| mockserver-perf-regression | pipeline-perf-test.yml | Daily Buildkite schedule (04:00 UTC) | Daily performance-regression pipeline: guard + k6 run + JMH microbench + rolling-baseline compare |
A single commit can trigger multiple child pipelines if it changes files in multiple areas. For example, a commit touching both mockserver/ and mockserver-ui/ triggers both mockserver-java and mockserver-ui pipelines.
All pipelines have cancel_intermediate_builds and skip_intermediate_builds enabled, but cancellation of running builds is scoped to non-master branches via cancel_intermediate_builds_branch_filter = "!master" (set uniformly in terraform/buildkite-pipelines/pipelines.tf). When a new build arrives for the same feature/PR branch (e.g. Dependabot rebases a PR), Buildkite cancels the running build to save agent VMs, and native trigger steps cancel the child builds too. On master, running builds are never cancelled: they always run to completion and report true pass/fail. Cancelling a master build mid-run would (a) leave that commit untested and (b) surface as a misleading failure on the parent pipeline whose trigger step was waiting on the cancelled child. Queued (not-yet-started) builds are still skipped on all branches — those report as "skipped" (neutral), not red.
Closed PR Build Cleanup
File: .buildkite/pipeline-cleanup.yml
When a PR is closed or merged, its Buildkite builds are no longer needed. The cleanup pipeline cancels any running builds and deletes all builds for the closed PR's branch across all child pipelines. This keeps the Buildkite dashboard clean — only builds for open PRs and master are visible.
The cleanup pipeline operates in two modes:
- Webhook-triggered (primary): A Buildkite Pipeline Trigger receives GitHub pull_request:closed webhooks directly. The webhook payload is available to the build step via buildkite-agent meta-data get buildkite:webhook. This provides immediate cleanup when a PR is closed.
- Scheduled sweep (safety net): A daily cron schedule sweeps all pipelines for builds on branches whose PRs are no longer open on GitHub. This catches anything missed by the webhook.
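The scheduled sweep amounts to a set difference: branches that still have builds, minus branches with an open PR, minus master. A minimal sketch, assuming both lists have already been fetched as newline-separated text (the function name stale_branches is hypothetical and is not taken from pipeline-cleanup.yml):

```bash
# Hypothetical sketch of the sweep's selection logic.
# $1: newline-separated branches that currently have builds
# $2: newline-separated branches that have an open PR on GitHub
# Prints the branches whose builds are candidates for cleanup.
stale_branches() {
  local branch
  printf '%s\n' "$1" | LC_ALL=C sort -u | while IFS= read -r branch; do
    [ -n "$branch" ] || continue
    # Builds on master are always kept.
    [ "$branch" = "master" ] && continue
    # Keep branches with an open PR; report everything else.
    printf '%s\n' "$2" | grep -qxF -- "$branch" || printf '%s\n' "$branch"
  done
}
```

With builds on master, feature/a, and dependabot/x, and an open PR only for feature/a, the function prints dependabot/x.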
Why Buildkite Pipeline Triggers instead of GitHub Actions
Buildkite Pipeline Triggers can receive GitHub webhooks directly with HMAC-SHA256 signature verification. This avoids storing a Buildkite API token as a GitHub Actions secret, keeping all CI credentials within the Buildkite/AWS boundary:
| Approach | Secrets exposed to GitHub | Event-driven | Complexity |
|---|---|---|---|
| Buildkite Pipeline Trigger | None (webhook URL only) | Yes | Low |
| GitHub Actions workflow | Buildkite API token | Yes | Low |
| AWS Lambda webhook receiver | None | Yes | High |
| Buildkite scheduled sweep only | None | No (polling) | Low |
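To illustrate what the shared webhook secret is used for, the sketch below recomputes a GitHub-style HMAC-SHA256 signature and compares it with the header value. Buildkite performs this verification itself when "Validate webhook deliveries" is enabled; the function names here are hypothetical and the code assumes openssl is available.

```bash
# Compute the hex HMAC-SHA256 of a payload using a shared secret.
# Note: passing the secret as an argument exposes it in the process list,
# so this is for illustration only.
hmac_sha256_hex() {
  local secret="$1" payload="$2"
  # The digest is the last field regardless of the openssl output prefix.
  printf '%s' "$payload" | openssl dgst -sha256 -hmac "$secret" | awk '{print $NF}'
}

# Return 0 when the signature header matches the payload.
# GitHub sends the signature as "sha256=<hex digest>" in X-Hub-Signature-256.
# A production verifier should use a constant-time comparison instead of [ = ].
verify_signature() {
  local secret="$1" payload="$2" header="$3"
  [ "$header" = "sha256=$(hmac_sha256_hex "$secret" "$payload")" ]
}
```

Any change to the payload or the secret produces a different digest, so a forged delivery without the secret fails the comparison.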
Setup
Steps 1 and 4 are managed by Terraform (terraform/buildkite-pipelines/pipelines.tf). Steps 2 and 3 require manual setup because Buildkite Pipeline Triggers don't have a Terraform resource yet (the feature is in public preview).
1. Pipeline + schedule (Terraform): Run terraform apply in terraform/buildkite-pipelines/ to create the mockserver-cleanup pipeline and its daily schedule.
2. Pipeline Trigger (Buildkite UI): Go to the cleanup pipeline settings → Triggers → New Trigger → GitHub:
   - Description: GitHub PR closed/merged
   - Branch: master, Commit: HEAD
   - Security: check "Validate webhook deliveries", enter a secret (openssl rand -hex 32)
   - Copy the trigger URL (https://webhook.buildkite.com/deliver/bktr_...)
3. GitHub webhook (GitHub UI): Go to repo webhook settings → Add webhook:
   - Payload URL: paste the Buildkite trigger URL from step 2
   - Content type: application/json
   - Secret: same as step 2
   - Events: select "Let me select individual events" → check only "Pull requests"
4. Daily schedule (Terraform): Created automatically by step 1. It runs at 06:00 UTC daily as a safety net.
Performance Regression Pipeline
File: .buildkite/pipeline-perf-test.yml
Trigger: Daily Buildkite schedule at 04:00 UTC (build.source == 'schedule'), or via the Buildkite UI (build.source == 'ui'). Not triggered by the path-based orchestrator.
Purpose: Catch performance regressions automatically without requiring manual perf runs after every commit. The pipeline is notify-only — it never fails a build, only annotates.
Commit-guard dynamic-dispatch pattern
The pipeline's first step (perf-test-guard.sh, trigger queue) implements a "daily but only if there's something new" gate:
- Calls last_perf_run_commit (in lib/last-successful-commit.sh), which resolves the commit the heavy regression run last actually executed against by reading the most recent perf_regression_ran_commit Buildkite build meta-data (set by perf-test-run.sh) via the Buildkite API (token in AWS Secrets Manager mockserver-build/buildkite-api-token). This is deliberately distinct from the sibling last_successful_commit (last passed build, used by generate-pipeline.sh): the perf-test pipeline passes on its lint step on every push, so "last passed build" would almost always be HEAD and the guard would skip forever.
- If HEAD equals the last run commit, annotates "skipped" and exits 0, so no compute is consumed.
- Otherwise (new commit, or no prior run recorded) uses buildkite-agent pipeline upload to dynamically inject the run, microbench, and compare steps into the running build. These three steps target the perf agent queue (c5.4xlarge, on-demand).
This pattern avoids a fixed multi-step pipeline definition (which would always run all steps) while keeping the guard cheap on the trigger queue.
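The guard's decision reduces to one comparison. A minimal sketch, assuming the two commit SHAs have already been resolved (perf_guard_decision is a hypothetical name; the real perf-test-guard.sh also handles the API lookup, annotation, and pipeline upload):

```bash
# Hypothetical sketch of the commit-guard decision.
# $1: current HEAD commit
# $2: commit of the last recorded perf run (empty when none is recorded)
# Prints "skip" when nothing new has landed, otherwise "run".
perf_guard_decision() {
  local head="$1" last_run="$2"
  if [ -n "$last_run" ] && [ "$head" = "$last_run" ]; then
    echo "skip"   # same commit as the last heavy run: annotate and exit 0
  else
    echo "run"    # new commit, or no prior run: inject the perf steps
  fi
}
```

In the real guard, the "run" branch corresponds to calling buildkite-agent pipeline upload with the run, microbench, and compare steps.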
Steps
| Step script | Queue | What it does |
|---|---|---|
| perf-test-guard.sh | trigger | Commit guard + dynamic step upload |
| perf-test-run.sh | perf | k6 regression.js (HTTP + HTTPS/H2) + growth.js + background sampler; uploads perf-result.json |
| perf-test-microbench.sh | perf | JMH MatchingBenchmark with -prof gc; uploads perf-microbench.json |
| perf-test-compare.sh | perf | Merge artifacts + S3 persist + rolling median+MAD compare + Buildkite annotation |
See Performance Tuning for the full description of behaviours, thresholds, result schema, and how to re-baseline.
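As a rough illustration of a rolling median+MAD comparison, the sketch below flags a candidate value that exceeds the baseline median by more than k times the median absolute deviation. The function names, the multiplier k, the "higher is worse" direction, and the absence of a scaling constant are all assumptions; the actual thresholds and logic in perf-test-compare.sh are described in Performance Tuning.

```bash
# Print the median of the numeric arguments.
median() {
  printf '%s\n' "$@" | sort -n | awk '{ v[NR] = $1 } END {
    if (NR % 2) print v[(NR + 1) / 2]
    else print (v[NR / 2] + v[NR / 2 + 1]) / 2 }'
}

# Print the median absolute deviation (MAD) of the numeric arguments.
mad() {
  local med x
  med=$(median "$@")
  # Word splitting of the substitution is intentional: one deviation per argument.
  median $(for x in "$@"; do
    awk -v x="$x" -v m="$med" 'BEGIN { d = x - m; print (d < 0 ? -d : d) }'
  done)
}

# Return 0 (regression) when candidate > median + k * MAD of the baseline.
# Usage: is_regression candidate k baseline_value [baseline_value ...]
is_regression() {
  local candidate="$1" k="$2" med dev
  shift 2
  med=$(median "$@")
  dev=$(mad "$@")
  awk -v c="$candidate" -v m="$med" -v d="$dev" -v k="$k" \
    'BEGIN { exit !(c > m + k * d) }'
}
```

For a baseline of 100, 102, 98, 101, 99 the median is 100 and the MAD is 1, so with k = 3 a candidate of 110 is flagged while 102 is not.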
CI Build Pipeline
File: .buildkite/pipeline-java.yml
Triggered by the orchestrator when files change in mockserver/ or mockserver-ui/. The pipeline has multiple sequential phases separated by - wait directives:
flowchart TD
CONFIG["1. opencode config validation
java-validate-config.sh"]
PULL["2. docker pull maven image"]
CONFIG --> PULL
PULL --> BUILD["3. Maven build
java-build.sh (in Docker)
./mvnw clean install"]
BUILD --> SUMMARY["4. Coverage summary
jacoco coverage annotation"]
SUMMARY --> DEPLOY["5. Deploy snapshot to Sonatype
master only"]
DEPLOY --> CTESTS["6. Container integration tests
master only"]
CTESTS --> PUSH["7. Build and push :snapshot
master only"]
PUSH --> BUNDLES["8. Publish snapshot binary bundles
master only, soft_fail"]
Step 1: Validate Config
Runs .buildkite/scripts/steps/java-validate-config.sh to lint opencode configuration files.
Step 2: Update Docker Image
Pulls the latest mockserver/mockserver:maven build image to ensure the CI environment is current.
Step 3: Build
Runs .buildkite/scripts/steps/java-build.sh, which executes the full Maven build inside the mockserver/mockserver:maven Docker image via run-in-docker.sh:
- Volume-mounts the repository into the container
- Passes the BUILDKITE_BRANCH environment variable
- Executes scripts/buildkite_quick_build.sh which runs ./mvnw clean install
- Memory limit: 7 GB
- Collects build artifacts: .log files, the failing tests' reports plus their console output (mockserver/target/failed-tests/**, curated by java-collect-failures.sh), the jacoco coverage XML and HTML tarball, and the shaded JAR. Per-class TEST-*.xml files for passing classes are not uploaded: only failing-test artefacts appear in the build's artefact list, keeping it small (uploading one TEST-*.xml per class previously produced ~650 artefacts that cluttered the list). A pass/fail summary is still printed at the end of the build log.
Step 4: Coverage Summary
Runs .buildkite/scripts/steps/java-summarize.sh to add a jacoco line-coverage annotation (and a link to download the jacoco-html-reports.tar.gz artefact) to the Buildkite build page. Runs with continue_on_failure: true / soft_fail: true so it never reddens the build. A separate diff-coverage gate (diff-coverage.sh, also soft_fail) annotates new-code coverage. The previous junit-annotate per-test-annotation step was removed along with the per-class TEST-*.xml upload it depended on; the end-of-log pass/fail summary and the failing-test artefacts cover failure triage instead.
Steps 5–8: Master-Only Steps
On master only, four additional steps run sequentially:
- Deploy snapshot: .buildkite/scripts/steps/java-deploy-snapshot.sh publishes SNAPSHOT artifacts to Sonatype.
- Container integration tests: .buildkite/scripts/steps/container-tests-run.sh runs Docker Compose and Helm integration tests.
- Build and push :snapshot: .buildkite/scripts/steps/java-docker-push-snapshot.sh builds and pushes the :snapshot and :mockserver-snapshot Docker images (:latest is only pushed during releases).
- Publish snapshot binary bundles (soft_fail): .buildkite/scripts/steps/java-publish-snapshot-bundles.sh builds JVM-less binary bundles for all platforms (linux/x86_64, linux/aarch64, darwin/x86_64, darwin/aarch64, windows/x86_64) using scripts/build-all-bundles.sh and uploads them to s3://aws-binaries-mockserver/mockserver-<POM_VERSION>/, served at https://downloads.mock-server.com/mockserver-<POM_VERSION>/.... Each master build overwrites the previous snapshot bundles. This provides working download URLs for the Go/.NET/Rust/Ruby/Python binary client launchers between releases. No GitHub token is required: the upload uses the agent's IAM instance role, and the default-queue role needs s3:PutObject on arn:aws:s3:::aws-binaries-mockserver/* (provisioned via Terraform in terraform/buildkite-agents/). Releases still use GitHub Releases (via the release pipeline). The jlink cross-build needs JDK 21, which this step bootstraps on demand (downloads Temurin 21 for the host and passes it via JAVA_HOME to build-all-bundles.sh for that one invocation). The Maven build keeps running on JDK 17, so the Java-17 floor is still enforced and this step never changes the Maven JDK. The step is master-only (if: build.branch == 'master') and soft_fail: true, so PR builds never publish and bundle-build failures never redden master.
Spot Resilience (agent-lost auto-retry)
The default agent queue is a mix of on-demand and Spot instances (see aws-infrastructure.md). When AWS reclaims a Spot instance mid-build, the Buildkite agent is lost and the running job ends with exit status -1 (or 255) — an infrastructure kill, not a test failure. The Maven build runs 15–25 minutes, so a reclaim part-way through used to fail the whole build and require a manual re-run (~2 Spot evictions/day were observed).
Two complementary mitigations:
- automatic_retry on agent-lost: the long, non-soft_fail command steps (:maven: build, deploy snapshot, container integration tests, build-and-push :snapshot) declare retry.automatic for exit_status: -1 and 255 (limit: 2). A Spot reclaim silently re-queues the job onto a fresh agent instead of reddening the build. Real test failures exit 1 and are NOT retried, so this never masks genuine breakage.
- Higher on-demand ratio: the default queue's on_demand_percentage was raised from 20% to 60% so a long build is much less likely to land on a Spot instance in the first place (the on-demand base capacity of 1 is unchanged).
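The retry rule described above can be modelled as a small predicate. This is only a sketch of the decision Buildkite makes from the retry.automatic settings, not pipeline configuration; should_auto_retry is a hypothetical name.

```bash
# Hypothetical model of the agent-lost retry rule.
# $1: job exit status; $2: automatic retries already used for this job.
# Returns 0 when the job would be re-queued, 1 otherwise.
should_auto_retry() {
  local exit_status="$1" retries_used="$2" limit=2
  case "$exit_status" in
    # -1 and 255 indicate a lost agent (for example, a Spot reclaim).
    -1|255) [ "$retries_used" -lt "$limit" ] ;;
    # Anything else, including a real test failure (exit 1), is not retried.
    *)      return 1 ;;
  esac
}
```

A job that exits 255 is retried on its first and second loss but not a third time, and a job that exits 1 is never retried.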
Python and Ruby Client Integration Tests
Files: .buildkite/scripts/steps/python-integration-test.sh, .buildkite/scripts/steps/ruby-integration-test.sh
These pipelines run independently from the Java pipeline and do not have access to Java build artifacts. To test against the HEAD-built MockServer (not a stale :snapshot from Docker Hub), both scripts source a shared helper:
- Helper: .buildkite/scripts/build-local-mockserver-image.sh builds the mockserver-netty-no-dependencies shaded JAR from the Maven reactor (skipped if the JAR already exists), copies it into docker/local/, and runs docker build to produce a local image tagged mockserver-under-test:local (configurable via MOCKSERVER_IMAGE env var).
The test fixtures (conftest.py for Python, integration_spec.rb for Ruby) also respect the MOCKSERVER_IMAGE env var when launching a container in standalone/local mode.
Maven CI Image Push Pipeline
File: .buildkite/docker-push-maven.yml
Trigger: Orchestrator (when files under docker_build/maven/ change) or manual (via Buildkite UI or API)
Builds and pushes mockserver/mockserver:maven — the Docker image used by the CI build pipeline. Run this when:
- docker_build/maven/Dockerfile or docker_build/maven/settings.xml change
- Monthly, to pick up base OS security updates
- After upgrading Maven or JDK versions
flowchart LR
TRIGGER[Manual trigger] --> BUILD["Step 1: Build
maven-image-build.sh"]
BUILD --> PUSH["Step 2: Push
maven-image-push.sh
master only"]
The pipeline has two steps separated by a - wait directive:
- Build: .buildkite/scripts/steps/maven-image-build.sh builds the mockserver/mockserver:maven image
- Push (master only): .buildkite/scripts/steps/maven-image-push.sh authenticates to Docker Hub via AWS Secrets Manager (mockserver-build/dockerhub) and pushes the image
Release Image Push (Docker step of the release pipeline)
Script: scripts/release/components/docker.sh, invoked as the :docker: Docker Image step of release-pipeline.yml (release-runner.sh docker).
Trigger: Runs automatically as part of the mockserver-release pipeline — there is no separate manual image pipeline.
Queue: release — needs mockserver-release/cosign-key (image signing) and the release-scoped Docker Hub / ECR push credentials.
Builds and pushes the production MockServer Docker images as multi-arch images (linux/amd64 + linux/arm64 via QEMU). Four image variants are published: main, GraalJS, clustered, and webhook. After push, each image is cosign-signed by digest, and the same digests are mirrored to GHCR.
The RELEASE_VERSION / tag is derived from the release pipeline context.
Tags pushed per image:
- mockserver/mockserver:mockserver-X.Y.Z + :X.Y.Z + :latest (main, GraalJS, clustered variants)
- mockserver/mockserver-webhook:mockserver-X.Y.Z + :X.Y.Z (admission webhook)
- Same tags to ECR Public (URI resolved dynamically via aws ecr-public describe-repositories) and mirrored to GHCR
flowchart LR
TRIGGER["release-runner.sh docker
(release pipeline step)"] --> LOGIN["Docker Hub + ECR login
via Secrets Manager"]
LOGIN --> ECR_RESOLVE["Resolve ECR URI dynamically
ecr-public describe-repositories"]
ECR_RESOLVE --> BUILD["docker buildx build
linux/amd64 + linux/arm64"]
BUILD --> PUSH["Push main + GraalJS + clustered + webhook
:mockserver-X.Y.Z + :X.Y.Z + :latest"]
PUSH --> SIGN["cosign sign by digest
+ mirror digests to GHCR"]
The ECR repository URI is resolved at runtime via aws ecr-public describe-repositories rather than hardcoded — the registry alias is AWS-assigned and must not be hardcoded (scripts/release/components/docker.sh).
Release Pipeline Security
File-based secrets (no -e in docker run)
All release scripts that run toolchains inside Docker containers (scripts/release/components/maven-central.sh, maven-plugin.sh, helm.sh, docker.sh) write secrets to 0600 files under .tmp/ and read them from inside the container via mounted volume, rather than passing them as docker run -e VAR=value. Environment variables are readable from /proc/1/environ and via docker inspect; file-based secrets under .tmp/ are not.
| Secret | File pattern | Removed from container via |
|---|---|---|
| GPG key (base64) | .tmp/gpg-key.$PID | trap cleanup function on EXIT |
| GPG passphrase | .tmp/gpg-passphrase.$PID | same trap |
| Sonatype credentials | .tmp/sonatype-creds.$PID (username\npassword) | same trap |
| GHCR token | .tmp/ghcr-creds.$PID (username\ntoken) | trap ... EXIT in helm.sh |
| cosign key | .tmp/cosign-key.$PID | removed after signing |
| cosign password | .tmp/cosign-pw.$PID | removed after signing |
| Sonatype netrc | .tmp/sonatype-netrc.$PID | trap ... EXIT in polling loop |
Curl calls to the Sonatype Central Portal API use --netrc-file rather than Authorization: Basic <base64> in a shell variable, so credentials are not held in the shell environment across the 30-minute polling loop.
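The file-based pattern can be sketched as a helper that writes a secret to an owner-only file, hands the path to a command, and removes the file on exit. This is a simplified illustration: with_secret_file, the demo-secret file name, and the SECRET_DIR override are hypothetical, and the real release scripts mount the directory into a container rather than calling a local command.

```bash
# Hypothetical sketch: expose a secret as a 0600 file instead of an env var.
# $1: secret value; remaining args: command, which receives the file path
# as its final argument.
with_secret_file() {
  local secret="$1"
  shift
  (
    umask 077                                  # new files are owner-only
    dir="${SECRET_DIR:-.tmp}"                  # SECRET_DIR is an assumed override
    mkdir -p "$dir"
    secret_file="$dir/demo-secret.$$"
    trap 'rm -f "$secret_file"' EXIT           # remove the file however we exit
    printf '%s\n' "$secret" > "$secret_file"
    chmod 600 "$secret_file"
    "$@" "$secret_file"
    status=$?
    exit "$status"
  )
}
```

For example, with_secret_file "$passphrase" cat prints the secret from the file and leaves nothing behind in the secrets directory afterwards.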
TOTP tolerance window (by design)
The TOTP verification step (release-verify-totp.sh) accepts ±5 minutes of clock skew (TOTP_TOLERANCE_WINDOWS=10). This is intentional — release-queue agents scale to zero, so the agent that runs the verifier cold-starts after the operator enters the code in the Buildkite block step. The Lambda autoscaler poll, EC2 spot acquisition, and agent bootstrap together take up to ~2.5 minutes. A standard ±1-window tolerance would produce false rejections on every cold-start without adding security, because the allowed_teams: ["release-managers"] gate on the block step is the primary access control.
To change this behaviour: either pre-warm the release queue or move TOTP validation into the block step itself (which runs in the Buildkite control plane, not on an agent).
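The tolerance arithmetic works out as follows: 10 windows of 30 seconds each gives ±300 seconds, which comfortably covers a cold start of about 150 seconds, whereas a ±1-window tolerance would not. The sketch below models only this window arithmetic, assuming the standard 30-second TOTP time step; it does not compute or compare actual codes, and totp_within_tolerance is a hypothetical name.

```bash
# Hypothetical model of the TOTP tolerance window.
# $1: epoch seconds when the code was generated
# $2: epoch seconds when the verifier runs
# $3: tolerance in windows on either side (TOTP_TOLERANCE_WINDOWS)
# Returns 0 when the code's time step is still within tolerance.
totp_within_tolerance() {
  local generated_at="$1" verified_at="$2" windows="$3" step=30
  # Number of whole time steps between generation and verification.
  local drift=$(( verified_at / step - generated_at / step ))
  # Compare the absolute drift (strip a leading minus sign) with the tolerance.
  [ "${drift#-}" -le "$windows" ]
}
```

A code verified 150 seconds after it was generated is five steps old: accepted with a tolerance of 10 windows, rejected with a tolerance of 1.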
Docker image cosign signing
After pushing release images to Docker Hub and ECR, the release pipeline cosign-signs each image digest using the same key infrastructure as Helm chart signing (mockserver-release/cosign-key in Secrets Manager). Signing is by digest so the signature binds to the exact manifest content, not a mutable tag.
Signing is strictly non-fatal: if the cosign key is absent or the binary is not installed, the images remain published and the release continues. The guard is:
if aws secretsmanager describe-secret --secret-id mockserver-release/cosign-key; then
# sign
fi
See Docker image verification for how to verify a signed image.
Build Docker Image
The mockserver/mockserver:maven image is defined in docker_build/maven/Dockerfile:
- Base: Ubuntu 24.04 (Noble)
- JDK: OpenJDK 17
- Maven: 3.9.16 (manually installed from Apache)
- Dependencies: Pre-fetched by running a throwaway build during image creation
- Corporate CA: Optional certificate injection for TLS proxy environments (see Docker)
Docker Registry Authentication
Docker push pipelines authenticate to two registries:
Docker Hub — credentials stored in AWS Secrets Manager (mockserver-build/dockerhub):
{"username": "...", "token": "..."}
The shared script .buildkite/scripts/docker-login.sh fetches the secret and runs docker login.
AWS ECR Public — authenticated via IAM instance role (no stored credentials needed):
The shared script .buildkite/scripts/ecr-login.sh runs aws ecr-public get-login-password --region us-east-1 | docker login --username AWS --password-stdin public.ecr.aws.
Buildkite agent EC2 instances have IAM permissions for both Docker Hub secret access and ECR Public push (via managed_policy_arns in terraform/buildkite-agents/main.tf).
All Docker push scripts call both login scripts and push tags to both registries in a single docker buildx build command.
Managing Buildkite Pipelines
Pipelines are managed via Terraform in terraform/buildkite-pipelines/. The Terraform stack includes all 15 pipelines (orchestrator, 11 child pipelines, 2 Docker image push pipelines, and 1 release pipeline), each pointing to mock-server/mockserver-monorepo.git. To add a new pipeline:
- Create the pipeline YAML in .buildkite/
- Add an entry to local.pipelines in terraform/buildkite-pipelines/pipelines.tf
- Add a trigger_if_changed call in .buildkite/scripts/generate-pipeline.sh
- Run terraform apply in terraform/buildkite-pipelines/
The Buildkite API token is stored in AWS Secrets Manager (mockserver-build/buildkite-api-token) and is used by the Terraform Buildkite provider for pipeline management.
Checking build status from the command line
Use scripts/ci/bk-pipeline-status.sh to query (or watch) a pipeline build instead of hand-writing bk/curl calls each time. It wraps the reliable bk build list / bk job log commands — preferred over bk auth token + curl to the REST API and over the Secrets Manager tokens, because only the locally-authenticated bk CLI dependably has both build-state and read_build_logs scope, and it needs no AWS SSO session.
scripts/ci/bk-pipeline-status.sh -p mockserver-java -c <commitSha> # one-shot status
scripts/ci/bk-pipeline-status.sh -p mockserver-java -c <commitSha> --watch # poll until terminal
scripts/ci/bk-pipeline-status.sh -p mockserver-java -b <number> --logs # tail the failing job log
scripts/ci/bk-pipeline-status.sh -p mockserver-java -b <number> \
--grep 'BUILD FAILURE|<<< (FAILURE|ERROR)|npm error' # find the failure in the whole log
It prints build#<n> <commit> build=<state> <job>=<state> exit=<code> and exits 0 (passed) / 2 (failed) / 3 (timeout). For continuous watching, drive it with the agent Monitor tool in --watch mode (see the build-monitor skill).
GitHub Actions
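One workflow runs on GitHub Actions (CodeQL), triggered automatically on pushes and pull requests. Dependency indexing is handled by GitHub's built-in dependency graph rather than by a workflow.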
CodeQL Security Analysis
File: .github/workflows/codeql-analysis.yml
Triggers:
- Push to master
- Pull requests targeting master
- Weekly schedule: Tuesdays at 22:00 UTC
Languages scanned: Java, JavaScript, Python, Ruby
Process:
flowchart LR
TRIGGER[Push/PR/Schedule] --> CHECKOUT[Checkout code]
CHECKOUT --> SETUP_JDK[Set up JDK 17]
SETUP_JDK --> INIT[Initialize CodeQL]
INIT --> BUILD[Maven compile
skip tests]
BUILD --> ANALYZE[CodeQL Analysis]
ANALYZE --> REPORT[Report to GitHub Security]
The workflow:
- Checks out the repository
- Sets up JDK 17 (Temurin distribution)
- Initializes CodeQL for Java, JavaScript, Python, and Ruby
- For Java: Runs ./mvnw clean compile -DskipTests -Dmaven.javadoc.skip=true (CodeQL autobuild)
- For JavaScript, Python, and Ruby: Analyzes source files directly (no build required)
- Performs CodeQL static analysis to detect security vulnerabilities
- Uploads results to GitHub Security tab
Results: Vulnerabilities appear in the repository's Security tab under "Code scanning alerts".
Maven Dependency Submission
GitHub's built-in dependency graph automatically indexes all manifest files (pom.xml, package.json, Gemfile, requirements.txt) and their transitive dependencies. No custom workflow is needed.
Powers:
- Dependency insights in the repository (Insights → Dependency graph)
- Dependabot vulnerability alerts for transitive dependencies
- Dependency review in pull requests (shows dependency changes and known vulnerabilities)
Note: A custom dependency-submission.yml workflow was previously used but was removed because it never worked (the workflow failed on every run due to a GitHub-level configuration issue). The built-in dependency graph provides equivalent coverage.
Build Agent Infrastructure
See AWS Infrastructure for details on the Buildkite agent EC2 instances, AutoScaling Group, and Lambda-based autoscaler.
Buildkite CLI Access
The Buildkite CLI (bk) provides authenticated access to builds, pipelines, and agents from the terminal. It uses browser-based OAuth login (similar to aws sso login) — no long-lived API tokens to manage.
Install
brew tap buildkite/buildkite
brew install buildkite/buildkite/bk
Or download a binary from the GitHub releases page.
Authenticate
bk auth login
This opens a browser window for OAuth login to Buildkite (similar to aws sso login). Once authenticated, the CLI stores credentials in the macOS keychain. No API token creation or manual secret management required.
After login, select the organization:
bk auth switch mockserver
Verify
bk auth status
Common Operations
The bk CLI uses -p {pipeline} for pipeline selection. The organization is set globally via bk auth switch.
# List recent builds
bk build list -p mockserver
# View a specific build
bk build view 3292 -p mockserver
# View a build as JSON
bk build view 3292 -p mockserver --json
# Cancel a build
bk build cancel 3292 -p mockserver -y
# Rebuild (retrigger) a build
bk build rebuild 3292 -p mockserver -y
# List agents (across all pipelines in the org)
bk agent list
# List agents as JSON
bk agent list --json
REST API Token (via CLI)
The bk CLI can extract its OAuth token for use with the REST API:
TOKEN=$(bk auth token)
curl -sH "Authorization: Bearer $TOKEN" \
"https://api.buildkite.com/v2/organizations/mockserver/pipelines/mockserver/builds/3292"
This avoids creating and managing separate API tokens. The token is the same OAuth token created by bk auth login.
Reading build logs requires the bk CLI token, not the Secrets Manager API tokens. The Buildkite API tokens in Secrets Manager (mockserver-build/buildkite-api-token and -readonly) are scoped for build state, triggering, and retrying jobs, but lack the read_build_logs scope, so /jobs/<id>/log returns "doesn't have the read_build_logs scope". Use bk auth token (above) or bk api with the locally-authenticated CLI:
bk api "pipelines/mockserver-release/builds/<N>/jobs/<JOB_ID>/log" \
  | python3 -c "import sys,json; print(json.load(sys.stdin).get('content',''))"
The chrome-devtools MCP browser cannot read the UI either, because it is a separate browser profile that is not logged into Buildkite.
Opencode Integration
Once bk is installed and authenticated, opencode agents can use it directly for build operations (cancel, rebuild, inspect) without needing a separate API token. The bk CLI is the recommended approach.
Note: bk auth login requires an interactive TTY (browser OAuth flow), so it must be run by the user in a separate terminal before opencode can use bk commands. If the agent detects bk is not authenticated, it will prompt the user to run bk auth login manually.
Agent Starvation from Script-Based Triggers (Resolved)
Problem
The orchestrator emits command steps that run trigger-pipeline.sh, which creates a child build via the Buildkite API and then polls until completion (up to 2 hours). Each polling trigger job occupies an agent slot while doing essentially nothing — just sleep 30 + curl in a loop.
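The polling pattern described above can be sketched roughly as follows. This is an illustration, not the contents of trigger-pipeline.sh: `get_build_state`, `POLL_INTERVAL`, and `MAX_POLLS` are hypothetical names, the stub stands in for the real curl call to the Buildkite API, and the state strings are assumed to follow Buildkite's build states.

```shell
#!/usr/bin/env bash
# Illustrative sketch only; not the real trigger-pipeline.sh.
# Hypothetical stub: a real script would query the Buildkite REST API with curl
# and print the child build's current state.
get_build_state() {
  echo "passed"
}

# Poll until the child build reaches a terminal state or the budget runs out.
# Defaults: 240 polls x 30 s = 2 hours, matching the limit described above.
# Returns 0 on pass, 1 on failure or cancellation, 2 on timeout.
poll_until_done() {
  local interval="${POLL_INTERVAL:-30}"
  local max_polls="${MAX_POLLS:-240}"
  local i state
  for ((i = 0; i < max_polls; i++)); do
    state="$(get_build_state)"
    case "$state" in
      passed) return 0 ;;
      failed|canceled) return 1 ;;
    esac
    sleep "$interval"   # the agent slot stays occupied while this sleeps
  done
  return 2
}
```

The point of the sketch is that the job spends almost all of its time inside `sleep`, which is why each trigger job ties up an agent slot without doing real work.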
When multiple commits land on master in quick succession (e.g. from concurrent opencode sessions), each parent build triggers ~6 child pipelines, and each trigger job holds an agent:
| Concurrent parent builds | Trigger jobs (agents blocked polling) | Agents remaining for actual work |
|---|---|---|
| 1 | ~6 | 4 of 10 |
| 2 | ~12 | 0 of 10 (starvation) |
| 3 | ~18 | 0 of 10 (starvation, queued jobs can't start) |
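The arithmetic behind the table can be reproduced with a small helper. This is only a sketch: the function name `agents_remaining` is hypothetical, and the defaults (a pool of 10 agents, roughly 6 trigger jobs per parent build) are taken from the table above.

```shell
#!/usr/bin/env bash
# agents_remaining POOL_SIZE PARENT_BUILDS [TRIGGERS_PER_PARENT]
# Prints how many agents are left for real work once every trigger job
# holds one agent. The result is clamped at zero (starvation).
agents_remaining() {
  local pool="$1" parents="$2" per_parent="${3:-6}"
  local left=$(( pool - parents * per_parent ))
  if (( left < 0 )); then
    left=0
  fi
  echo "$left"
}
```

For example, `agents_remaining 10 1` gives 4 and `agents_remaining 10 2` gives 0, matching the first two rows.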
Cancel intermediate (running) builds is set to !master (disabled on master) on every pipeline because cancelling on master drops legitimate builds and shows misleading failures. This filter is now applied uniformly in Terraform (terraform/buildkite-pipelines/pipelines.tf). Previously several child pipelines (mockserver-container-tests, mockserver-performance-test, mockserver-infra, the per-client and release pipelines) had empty filters — so a fresh master commit cancelled the previous still-running build before it could report, which was especially disruptive for the long-running container-tests (~20m) and performance-test pipelines. The parent pipeline's trigger jobs still hold a (cheap, trigger-queue) agent while waiting on children.
Why Not Native Trigger Steps
Buildkite's native trigger step type would solve this — it doesn't consume an agent. However, native triggers cannot be used because PR build authorisation requires the script-based approach (the trigger script passes PR metadata and handles auth that native triggers don't support).
Options Investigated
Option A: Separate Agent Pool for Triggers (Recommended)
Add a second, cheap agent stack on small instances (e.g. t3.small or t3.micro) dedicated to the trigger queue. Trigger jobs run on tiny instances while real work runs on the existing default queue.
| Property | default queue (current) | trigger queue (new) |
|---|---|---|
| Instance types | c5.2xlarge, c5a.2xlarge, m5.2xlarge | t3.small, t3.micro |
| Cost per instance (spot) | ~$0.06–0.12/hr | ~$0.004–0.008/hr |
| Max instances | 10 | 10–15 |
| Agents per instance | 1 | 3–5 (trigger jobs are idle polling) |
| Workload | Maven builds, Docker, k3d tests | sleep + curl polling loops |
| Memory needs | 7–16 GB | <256 MB |
Pros:
- Completely eliminates agent starvation — trigger jobs never compete with real work
- Very low cost (~$0.04/hr for 10 trigger agents vs ~$1/hr for 10 build agents)
- Simple Terraform change (add a third module "buildkite_trigger_stack" block)
- No pipeline YAML or script changes needed; only update generate-pipeline.sh to emit agents: { queue: trigger } for trigger steps
Cons:
- Adds a third ASG/Lambda scaler to manage
- Small increase in baseline infrastructure complexity
Implementation:
- Add Terraform module in terraform/buildkite-agents/main.tf:
module "buildkite_trigger_stack" {
  source = "buildkite/elastic-ci-stack-for-aws/buildkite"
  version = "~> 0.7.0"
  stack_name = "buildkite-mockserver-trigger"
  buildkite_agent_token = var.buildkite_agent_token
  buildkite_queue = "trigger"
  instance_types = "t3.small,t3a.small,t3.micro"
  min_size = 0
  max_size = 4
  on_demand_percentage = 0
  on_demand_base_capacity = 0
  agents_per_instance = 4
  associate_public_ip_address = true
  managed_policy_arns = [aws_iam_policy.read_buildkite_api_token.arn]
}
- Update generate-pipeline.sh to target the trigger queue:
STEPS="${STEPS}
  - label: \":pipeline: ${label}\"
    command: \".buildkite/scripts/trigger-pipeline.sh ${pipeline_slug} '${label}'\"
    timeout_in_minutes: 120
    agents:
      queue: trigger
"
Option B: Concurrency Groups on Trigger Steps
Add concurrency: 1 and concurrency_group: "trigger/<pipeline-slug>" to each trigger step. This ensures only one trigger job per child pipeline runs at a time — when build #4051 is already polling mockserver-java, build #4052's mockserver-java trigger queues instead of grabbing another agent.
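As a sketch of what this option would look like in the generator, the helper below prints one trigger step with a per-pipeline concurrency group. The function name `emit_trigger_step` is hypothetical and the YAML layout is an assumption modelled on the step format used elsewhere in this document; `concurrency` and `concurrency_group` are standard Buildkite step attributes.

```shell
#!/usr/bin/env bash
# emit_trigger_step PIPELINE_SLUG LABEL
# Hypothetical helper: prints a command step limited to one running job
# per child pipeline via a concurrency group named after the slug.
emit_trigger_step() {
  local slug="$1" label="$2"
  cat <<EOF
  - label: ":pipeline: ${label}"
    command: ".buildkite/scripts/trigger-pipeline.sh ${slug} '${label}'"
    timeout_in_minutes: 120
    concurrency: 1
    concurrency_group: "trigger/${slug}"
EOF
}
```

Calling `emit_trigger_step mockserver-java "Java"` yields a step whose group is `trigger/mockserver-java`, so a second parent build's trigger for the same child waits instead of taking another agent.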
Pros:
- No infrastructure changes — purely a pipeline YAML change
- Reduces worst-case agent consumption from N×6 to 6 (one per child pipeline)
Cons:
- Builds become serialised: build #4052 can't start mockserver-java until #4051 finishes
- Still wastes 6 expensive agents on polling (just caps it at 6 instead of unlimited)
- Increases total build wall-clock time for master
Option C: Increase Max Agents
Raise max_size from 10 to 20+ to accommodate concurrent builds.
Pros:
- Simple: change one number in terraform.tfvars
- No pipeline changes needed
Cons:
- Doubles cost during burst periods (~$1.20/hr → ~$2.40/hr with c5.2xlarge)
- Doesn't fix the root cause — trigger jobs still waste expensive instances
- Cost scales linearly with concurrency
Option D: Cancel Intermediate Builds on Master
Enable cancel_running_branch_builds for master (remove !master filter).
Pros:
- Frees agents immediately when a newer commit arrives
- No infrastructure cost increase
Cons:
- Drops legitimate builds — if commit A contains a real bug and commit B arrives, commit A's build is cancelled before it finishes, so the bug is never tested against commit A's code
- Current trigger-pipeline.sh has cancel_child_build trap logic that would also cancel child builds mid-run
- Not suitable for master where every commit should be validated
Option E: Hybrid — Cheap Trigger Pool + Concurrency Groups
Combine Options A and B: run triggers on cheap instances AND limit concurrency per child pipeline. This provides both cost efficiency and prevents runaway concurrent builds.
Pros:
- Best of both approaches
- Trigger agents are cheap, AND concurrency is bounded
Cons:
- Most complex to implement
- Serialisation delays from concurrency groups may not be worth it if the cheap pool has enough capacity
Resolution
Option A (Separate Agent Pool) has been implemented. Trigger steps now target queue: trigger in generate-pipeline.sh, and a dedicated buildkite-mockserver-trigger stack runs on cheap t3.small/t3a.small/t3.micro instances with 4 agents per instance. This cleanly separates polling from building — trigger jobs never compete with real work for default queue agents.
Terraform: terraform/buildkite-agents/main.tf — module "buildkite_trigger_stack"
Pipeline: .buildkite/scripts/generate-pipeline.sh — agents: { queue: trigger }
If concurrent master builds remain a problem, Option E (adding concurrency groups) can be layered on top.
Dependency Caching
Each pipeline caches its dependency manager's artifacts in S3, keyed on lockfile hashes, to avoid re-downloading dependencies on every ephemeral agent. The cache is fail-safe by design -- every failure mode (missing bucket, missing credentials, non-root agent, corrupt tarball, empty cache) results in a clean no-op (exit 0) and a cold build proceeds normally.
Architecture
flowchart LR
RESTORE["cache-restore.sh<br/>runs on host agent"] -->|download + extract| LOCAL[".buildkite-cache/TYPE/<br/>workspace-local dir"]
LOCAL -->|volume mount via<br/>run-in-docker.sh --cache| CONTAINER["Docker container<br/>/root/.m2/repository<br/>/root/.npm<br/>etc."]
CONTAINER -->|build populates cache| LOCAL
LOCAL -->|tar + upload| SAVE["cache-save.sh<br/>runs on host agent"]
RESTORE -->|"s3://mockserver-ci-<br/>dependency-cache/TYPE/KEY.tar.gz"| S3[(S3 Bucket)]
SAVE -->|same key| S3
How It Works
- Cache restore (pipeline step, soft_fail: true): cache-restore.sh <type> computes a SHA-256 key from the relevant lockfiles, downloads s3://mockserver-ci-dependency-cache/<type>/<key>.tar.gz, and extracts it into $BUILDKITE_BUILD_CHECKOUT_PATH/.buildkite-cache/<type>/. If anything fails, it exits 0.
- Build (existing step): run-in-docker.sh --cache <type> volume-mounts the workspace-local cache directory into the Docker container at the tool's default cache path (e.g., /root/.m2/repository for Maven, /root/.npm for npm). If the directory is empty (cache miss), the build starts with a cold cache -- no different from before caching was enabled.
- Cache save (pipeline step, soft_fail: true): cache-save.sh <type> tars the populated cache directory and uploads it to S3 with the same key. If the key already exists in S3 (cache hit on a previous build), the upload is skipped.
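One way a lockfile-derived key like the one in the restore step could be computed is sketched below. This is not the actual cache-restore.sh logic: the function name `cache_key` is hypothetical, and the sketch assumes GNU coreutils (`sha256sum`, `sort -z`, `xargs -r`) are available on the agent.

```shell
#!/usr/bin/env bash
# cache_key DIR PATTERN
# Hypothetical helper: prints a SHA-256 hex digest over every file under DIR
# whose name matches PATTERN. Paths are relative to DIR and sorted, so the
# key depends only on file names and contents, not on the checkout location.
cache_key() {
  local dir="$1" pattern="$2"
  (
    cd "$dir" || exit 1
    find . -type f -name "$pattern" -print0 \
      | LC_ALL=C sort -z \
      | xargs -0 -r sha256sum \
      | sha256sum \
      | cut -d' ' -f1
  )
}
```

For the maven type this might be called as `cache_key mockserver pom.xml`. Because the key is a pure function of the hashed files, two builds with identical lockfiles resolve to the same S3 object.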
Cache Types and Keys
| Type | Lockfiles hashed | Container mount target |
|---|---|---|
| maven | All pom.xml files in mockserver/ | /root/.m2/repository |
| npm | package-lock.json + package.json from mockserver-ui/, mockserver-client-node/, mockserver-node/ | /root/.npm |
| pip | pyproject.toml, setup.cfg, requirements.txt from mockserver-client-python/ | /root/.cache/pip |
| bundler | Gemfile + Gemfile.lock from mockserver-client-ruby/, jekyll-www.mock-server.com/ | /usr/local/bundle/cache |
Fail-Safe Design
The previous caching attempt (reverted) broke builds by writing to /var/cache (requires root) and bridging state across ephemeral agents via host volumes. This redesign avoids both problems:
- No root-owned host paths: caches live under the workspace checkout directory, which the buildkite-agent user always owns
- No cross-agent state: each job downloads its own cache from S3; no host-volume bridge between jobs
- set -uo pipefail without set -e: errors are handled inline, never propagated
- soft_fail: true: pipeline-level safety net -- even if the script somehow exits non-zero, the build continues
- Credential check up-front: aws sts get-caller-identity is tested before any S3 operation; if it fails, the script bails immediately with exit 0
- Idempotent keys: cache key is a pure function of lockfile content; same deps = same key = upload skipped
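The fail-safe pattern can be illustrated with a short sketch. It is not the actual cache-restore.sh: the names `has_aws_credentials` and `safe_restore` are hypothetical, and the argument layout is an assumption. What it shows is the combination described above, namely `set -uo pipefail` without `set -e`, an up-front credential check, and an exit status of 0 on every path.

```shell
#!/usr/bin/env bash
# Hypothetical wrapper around the credential check so it can be stubbed.
has_aws_credentials() {
  aws sts get-caller-identity >/dev/null 2>&1
}

# safe_restore BUCKET OBJECT_PATH DEST_DIR
# The body runs in a subshell so the shell options stay local to it.
# `set -e` is deliberately absent: each failure is handled inline and the
# function always finishes with status 0, so a cold build can proceed.
safe_restore() (
  set -uo pipefail
  bucket="$1"; object="$2"; dest="$3"
  if ! has_aws_credentials; then
    echo "cache: no AWS credentials, skipping restore"
    exit 0
  fi
  if ! mkdir -p "$dest"; then
    echo "cache: cannot create ${dest}, skipping restore"
    exit 0
  fi
  # pipefail makes a failed download or a corrupt archive fail the pipeline.
  if ! aws s3 cp "s3://${bucket}/${object}.tar.gz" - 2>/dev/null \
      | tar -xzf - -C "$dest" 2>/dev/null; then
    echo "cache: miss or corrupt archive, continuing with a cold cache"
  fi
  exit 0
)
```

A restore step might call `safe_restore mockserver-ci-dependency-cache "maven/<key>" "$BUILDKITE_BUILD_CHECKOUT_PATH/.buildkite-cache/maven"`, with `<key>` replaced by the computed lockfile hash.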
Activation
The S3 bucket and IAM policy are defined in terraform/buildkite-agents/dependency-cache.tf. The IAM policy is currently detached from all agent roles — the runtime pipeline wiring (cache-restore/cache-save steps) was reverted. The bucket and policy remain in place so the infrastructure is ready to re-enable once cache-integrity verification (signed or content-addressed entries) is implemented.
To re-activate: attach aws_iam_policy.dependency_cache to the relevant queues in main.tf and re-add the cache restore/save steps to the affected pipelines.
Until the IAM policy is re-attached, the cache scripts will detect missing credentials and no-op gracefully. No pipeline will break.
Local CI Simulation
To run the Buildkite build locally:
# Using the same Docker image as CI
scripts/local_buildkite_build.sh
# Or directly
docker run -v "$(pwd)":/build/mockserver \
-w /build/mockserver \
-a stdout -a stderr \
mockserver/mockserver:maven \
/build/mockserver/scripts/buildkite_quick_build.sh