Kubernetes Deployment

May 22, 2026 · View on GitHub

Deploy Nemesis to a lightweight Kubernetes cluster using either k3d (k3s-in-Docker) or native k3s, with Dapr operator-managed sidecars and KEDA event-driven autoscaling.

!!! tip For cloud deployment on AWS, see the EKS Deployment Guide.

!!! note Docker Compose remains the primary development environment. Kubernetes deployment is additive and intended for production-like environments and autoscaling testing. See the quickstart guide for Docker Compose setup.

System requirements are the same as Docker Compose (4 cores, 12+ GB RAM, 100 GB disk).

Quick Start (k3d)

k3d runs k3s inside Docker containers — ideal for local development and testing.

Prerequisites

ToolInstall
k3dcurl -s https://raw.githubusercontent.com/k3d-io/k3d/main/install.sh | bash
kubectlSee docs
Helmcurl https://raw.githubusercontent.com/helm/helm/main/scripts/get-helm-3 | bash
DockerMust be running

Dapr and KEDA are installed automatically via Helm by the setup script.

Setup

# 1. Create cluster with Traefik, Dapr, and KEDA
./k8s/scripts/setup-cluster-k3d.sh

# 2. Deploy using pre-built images from ghcr.io
./k8s/scripts/deploy.sh install

# 3. Verify everything is running
./k8s/scripts/verify.sh

# Access Nemesis at https://localhost:7443 (default user: n / password: n)

Teardown

# Delete cluster (preserves registry for faster rebuilds)
./k8s/scripts/teardown-cluster-k3d.sh

# Delete cluster AND registry
./k8s/scripts/teardown-cluster-k3d.sh --registry

Quick Start (k3s)

k3s runs natively on the host — suited for VMs, bare-metal servers, and production-like environments where Docker is not available or desired.

Prerequisites

ToolInstall
kubectlSee docs
Helmcurl https://raw.githubusercontent.com/helm/helm/main/scripts/get-helm-3 | bash
curlSystem package manager

Docker is not required. k3s uses containerd directly.

Dapr and KEDA are installed automatically via Helm by the setup script.

Setup

# 1. Install k3s and configure Traefik, Dapr, KEDA
./k8s/scripts/setup-cluster-k3s.sh

# 2. Deploy using pre-built images from ghcr.io
./k8s/scripts/deploy.sh install

# 3. Verify everything is running
./k8s/scripts/verify.sh

# Access Nemesis at https://localhost:7443 (default user: n / password: n)

Teardown

# Remove everything including k3s
./k8s/scripts/teardown-cluster-k3s.sh

# Remove only Nemesis and Helm releases, keep k3s running
./k8s/scripts/teardown-cluster-k3s.sh --keep-k3s

Building Locally

The --build flag auto-detects the cluster type and uses the appropriate method:

  • k3d: Builds images and pushes to the k3d local registry
  • k3s: Builds images and loads them into k3s containerd via k3s ctr images import

Both require Docker to build images.

# Deploy using locally built images (auto-detects k3d or k3s)
./k8s/scripts/deploy.sh install --build

# Or build images separately:
./k8s/scripts/build-and-push-k3d.sh           # k3d: push to local registry
./k8s/scripts/build-and-load-k3s.sh           # k3s: load into containerd

# Build specific services only
./k8s/scripts/build-and-push-k3d.sh web-api frontend
./k8s/scripts/build-and-load-k3s.sh web-api frontend

!!! note k3s --build requires Docker on the same host to build images. The images are exported via docker save and imported into k3s containerd. The pods use imagePullPolicy: Never to ensure they use the locally-loaded images.

Deploy Script

./k8s/scripts/deploy.sh <action> [options]

Actions:
  install       Install or upgrade Nemesis
  uninstall     Remove Nemesis
  status        Show deployment status

Options:
  --build        Build images locally before deploying (k3d or k3s)
  --monitoring   Enable monitoring (deferred)
  --dry-run      Render templates without deploying
  --values FILE  Additional Helm values file
  --set KEY=VAL  Override a specific value

Examples

# Deploy with ghcr.io images
./k8s/scripts/deploy.sh install

# Deploy with locally built images (k3d or k3s)
./k8s/scripts/deploy.sh install --build

# Override credentials
./k8s/scripts/deploy.sh install \
  --set credentials.postgres.password=StrongPass \
  --set credentials.rabbitmq.password=StrongPass \
  --set credentials.s3.secretKey=StrongPass

# Disable autoscaling
./k8s/scripts/deploy.sh install --set autoscaling.enabled=false

# Preview rendered templates
./k8s/scripts/deploy.sh install --dry-run

# Custom values file
./k8s/scripts/deploy.sh install --values my-values.yaml

Architecture

Key Differences from Docker Compose

AspectDocker ComposeKubernetes
Dapr sidecarsManual -dapr containersOperator-injected via pod annotations
Dapr control planeStandalone placement + schedulerHelm-installed Dapr operator
Secrets.env file + secretstores.local.envK8s Secrets + secretstores.kubernetes
Reverse proxyTraefik container with Docker providerTraefik with IngressRoute CRDs
AutoscalingManual docker compose up --scaleKEDA ScaledObjects on RabbitMQ queue depth
Connection poolingPer-service pools onlyPgBouncer (transaction mode) between services and PostgreSQL
Service discoveryDocker DNSK8s Service DNS

What the Setup Scripts Install

Both setup-cluster-k3d.sh and setup-cluster-k3s.sh install the same Helm components:

  • Traefik (Helm chart v34.3.0) — reverse proxy with TLS termination
  • Dapr (Helm chart v1.17.7) — sidecar injection, pub/sub, workflows, secrets
  • KEDA (Helm chart v2.16.1) — event-driven autoscaling from RabbitMQ queue depth

All versions are pinned for reproducibility.

k3d vs k3s

Aspectk3dk3s
Runtimek3s inside Docker containersNative on host
Docker requiredYesNo (uses containerd)
Local image buildsVia k3d local registryVia k3s ctr images import
Traefik service typeNodePort (mapped via k3d port)LoadBalancer (built-in ServiceLB/Klipper)
Default HTTPS port74437443
Best forLocal dev, CIVMs, bare-metal, production-like

KEDA Autoscaling

KEDA monitors RabbitMQ queue depth and CPU utilization to scale services automatically:

ServiceTriggerThresholdMinMaxCooldown
file-enrichmentQueue: files-new_file10 messages1560s
document-conversionQueue: files-document_conversion_input5 messages1560s
titus-scannerQueue: titus-titus_input10 messages1560s
dotnet-serviceQueue: dotnet-dotnet_input5 messages1360s
gotenbergCPU utilization70%13120s

All thresholds are configurable in values.yaml under autoscaling.

!!! tip Queue names are created by Dapr as {consumerID}-{topic}. Verify actual queue names in the RabbitMQ management UI after first deployment and update values.yaml if they differ. Gotenberg uses CPU-based scaling (not queue-based) since it receives synchronous HTTP requests rather than consuming from a queue.

PgBouncer Connection Pooling

PgBouncer sits between all services (including Dapr sidecars) and PostgreSQL, multiplexing hundreds of client connections onto a small pool of real database connections. This prevents connection exhaustion during KEDA autoscaling when many pod replicas open connections simultaneously.

  • Pool mode: transaction — connections are returned to the pool after each transaction
  • Max client connections: 500 (configurable in values.yaml)
  • Default pool size: 20 real PostgreSQL connections

All services connect to pgbouncer:5432 instead of postgres:5432. The postgres service remains unchanged and is only accessed by PgBouncer directly.

Helm Chart Structure

k8s/helm/nemesis/
├── Chart.yaml
├── values.yaml              # All configuration
├── values-dev.yaml          # Local registry overrides (k3d)
├── values-dev-k3s.yaml      # Local image overrides (k3s)
├── files/                   # Static files (SQL, configs)
└── templates/
    ├── _helpers.tpl
    ├── namespace.yaml
    ├── secrets.yaml
    ├── configmap-*.yaml
    ├── dapr/                # Dapr CRDs (secretstore, statestore, pubsub, configs)
    ├── infra/               # PostgreSQL, RabbitMQ, SeaweedFS, Hasura
    ├── apps/                # Application deployments + services
    ├── ingress/             # Traefik IngressRoute + middleware
    ├── keda/                # KEDA ScaledObjects + TriggerAuthentication
    └── tests/               # Helm test pod

Operations

Check Status

kubectl get pods -n nemesis
kubectl get svc -n nemesis
kubectl get components.dapr.io -n nemesis
kubectl get scaledobject -n nemesis

# Or use the deploy script
./k8s/scripts/deploy.sh status

View Logs

kubectl logs -f deployment/web-api -n nemesis
kubectl logs -f deployment/file-enrichment -n nemesis
kubectl logs -f deployment/file-enrichment -c daprd -n nemesis  # Dapr sidecar

Run Helm Tests

helm test nemesis -n nemesis

Configuration

All configuration is in k8s/helm/nemesis/values.yaml. Key sections:

SectionDescription
credentials.*PostgreSQL, RabbitMQ, S3 (SeaweedFS), Hasura passwords
nemesis.*URL, log level, expiration defaults
autoscaling.*KEDA scaling thresholds and limits
fileEnrichment.*File enrichment replicas, resources, env vars
postgres.*Database image, storage, max connections
pgbouncer.*Connection pooling image, pool size, max client connections
rabbitmq.*Message queue image, storage
seaweedfs.*Object storage (SeaweedFS) image, buckets, storage

PostgreSQL Connection Tuning

All services connect through PgBouncer in transaction pooling mode, which multiplexes many client connections onto a small pool of real PostgreSQL connections. This prevents connection exhaustion during autoscaling.

Key settings (configurable in values.yaml under pgbouncer):

SettingDefaultDescription
maxClientConn2000Max simultaneous client connections PgBouncer accepts
defaultPoolSize60Real PostgreSQL connections per database
minPoolSize20Minimum idle PostgreSQL connections maintained
reservePoolSize15Extra connections when pool is exhausted
poolModetransactionReturn connections to pool after each transaction

Per-service pool settings are still configurable via environment variables:

  • DB_POOL_MAX_SIZE (default: 20) — maximum connections per pod (to PgBouncer)
  • DB_POOL_MIN_SIZE (default: 2) — minimum idle connections per pod (to PgBouncer)

With KEDA scaling to 5+ replicas across multiple services, PgBouncer keeps real PostgreSQL connections at ~60-75, well within the 300 max_connections limit.

Deferred Features

The following will be added in future updates:

  • Monitoring stack (Grafana, Prometheus, Jaeger, Loki) — toggle: monitoring.enabled
  • LLM stack (LiteLLM, Phoenix, Agents) — toggle: llm.enabled
  • Jupyter — toggle: jupyter.enabled

The values.yaml toggles exist but no templates are generated yet.

Troubleshooting

Pods stuck in CrashLoopBackOff

Check infrastructure first — app pods depend on PostgreSQL, RabbitMQ, and SeaweedFS:

kubectl logs deployment/postgres -n nemesis
kubectl logs statefulset/rabbitmq -n nemesis
kubectl logs statefulset/seaweedfs -n nemesis

Dapr sidecar not injecting

Verify the namespace label:

kubectl get ns nemesis --show-labels

Should include dapr.io/inject=true.

KEDA not scaling

Verify queue names match what Dapr created:

kubectl port-forward svc/rabbitmq 15672:15672 -n nemesis
# Open http://localhost:15672 and check queue names

Update queue names in the autoscaling.* section of values.yaml if names differ.

Connection pool exhaustion

If you see FATAL: sorry, too many clients already in PostgreSQL logs:

  1. Check PgBouncer stats: kubectl exec deployment/pgbouncer -n nemesis -- env PGPASSWORD=Qwerty12345 psql -U nemesis -h 127.0.0.1 -p 5432 pgbouncer -c "SHOW POOLS;"
  2. Check PostgreSQL connections: kubectl exec deployment/postgres -n nemesis -- psql -U nemesis -d enrichment -c "SELECT count(*) FROM pg_stat_activity;"
  3. Check per-service pool stats: port-forward to file-enrichment and hit /system/pool-stats
  4. Tune PgBouncer: increase pgbouncer.defaultPoolSize in values.yaml (adds more real PostgreSQL connections)
  5. Tune per-pod pools: reduce DB_POOL_MAX_SIZE env var to lower client connections to PgBouncer