Renderer Validation Matrix

May 15, 2026

This matrix is the validation source of truth for the staged GL renderer work. It separates safe automated startup/self-test coverage from gameplay smoke coverage that must be run manually with the mode-specific SP/MP launch tasks.

Build And Stage

Use the project wrapper:

tools\build\meson_setup.ps1 setup --wipe builddir . --backend ninja --buildtype=debug --wrap-mode=forcefallback
tools\build\meson_setup.ps1 compile -C builddir
tools\build\meson_setup.ps1 install -C builddir --no-rebuild --skip-subprojects

For incremental validation after an existing setup:

tools\build\meson_setup.ps1 compile -C builddir -- -j1
tools\build\meson_setup.ps1 install -C builddir --no-rebuild --skip-subprojects

Automated Safe Matrix

The safe matrix starts the staged client, runs renderer self-tests or startup probes, prints gfxInfo, then quits. It does not launch maps.

python tools\tests\renderer_validation_matrix.py

For focused validation without relaunching the full matrix, use --cases with one or more case ids:

python tools\tests\renderer_validation_matrix.py --cases renderer-default-promotion-selftest
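When the focused invocation above is scripted, for example from a local pre-commit check, the command line can be assembled programmatically. The following is a minimal sketch, not part of the project tooling: `build_matrix_command` and `run_matrix` are hypothetical helper names, and it assumes that multiple case ids are passed to --cases as separate space-delimited arguments and that the runner's exit code reflects pass/fail.

```python
import subprocess
import sys
from pathlib import PureWindowsPath

# Runner path as used in this document (Windows-style separators).
RUNNER = PureWindowsPath("tools/tests/renderer_validation_matrix.py")


def build_matrix_command(case_ids=()):
    """Return the argv list for the safe matrix, optionally limited to case ids."""
    cmd = [sys.executable, str(RUNNER)]
    if case_ids:
        # Assumption: --cases accepts one or more ids as separate arguments.
        cmd += ["--cases", *case_ids]
    return cmd


def run_matrix(case_ids=()):
    """Launch the runner and return its exit code (assumed: 0 means all cases passed)."""
    return subprocess.run(build_matrix_command(case_ids), check=False).returncode
```

Calling `run_matrix(["renderer-default-promotion-selftest"])` would then be equivalent to the focused command above.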

The runner writes a timestamped report under .tmp/renderer-validation/ with per-case logs and a JSON copy for CI or release triage.

Automated coverage:

| Case | Coverage |
| --- | --- |
| renderer-foundation-selftests | context ladder, tier selector, tier workload contract, upload manager, GPU timer, scene packet, render graph, render graph resource owner, material resource table, geometry/instance resource records, GL state cache, Shader Library V2 pass-family/permutation/reflection coverage, draw plan, submit plan, modern executor, and shadow planner self-tests |
| renderer-visible-depth-selftest | opt-in r_rendererModernVisibleDepth coverage for graph-backed scene depth, compatible shadow-depth resources, fallback accounting, depth-overlay readiness, and gfxInfo reporting |
| renderer-gbuffer-selftest | opt-in r_rendererModernOpaque coverage for graph-backed G-buffer resources, MRT setup, opaque/alpha-test draw classification, diffuse texture binding, packing assumptions, fallback accounting, bandwidth metrics, attachment debug-overlay readiness, and gfxInfo reporting |
| renderer-cluster-grid-selftest | opt-in modern clustered-light preparation coverage for point/projected/fog/ambient/special light classification, budgeted dynamic grid slicing, cluster reference packing, spill/overflow accounting, GL 3.3 UBO fallback readiness, GL 4.3+ SSBO upload readiness, cluster debug-overlay texture generation, and gfxInfo reporting |
| renderer-shadow-planner-selftest | modern shadow planner coverage for projected/point/CSM policy, mapped/stencil-fallback/skipped accounting, benchmark-budgeted shadow resolution/light/pixel caps, render-graph shadow resource reporting, clustered shadow descriptor integration, and gfxInfo reporting |
| renderer-deferred-resolve-selftest | opt-in r_rendererModernDeferred coverage for graph-backed deferred resolve output, G-buffer/depth/cluster buffer inputs, point/projected light accumulation, light-grid contribution, fallback accounting, deferred debug-overlay readiness, GPU timer coverage, and gfxInfo reporting |
| renderer-forward-plus-selftest | opt-in r_rendererForwardPlus coverage for graph-backed scene-color/depth resources, clustered opaque/alpha-test/transparent programs, clustered-light UBO/SSBO reads, transparent sort preservation, fallback accounting, overdraw estimates, GPU timer coverage, and gfxInfo reporting |
| renderer-modern-visible-selftest | opt-in r_rendererModernVisible coverage for the guarded hybrid visible-frame bridge: graph-backed depth, G-buffer, deferred resolve, forward+ source output, graph-owned hybridSceneColor composition, HDR/post-process handoff before SSAO/bloom/authored post, depth-copy handoff accounting, shadow-ready handoff/fallback accounting, final GUI/present overlay, GPU timer coverage, and gfxInfo reporting |
| renderer-modern-compatibility-selftest | Phase 14 modern-visible compatibility coverage for command-category ownership inventory, modern fullscreen GUI readiness, light-grid ownership, explicit post/copy/subview/render-demo/BSE fallback buckets, deterministic render-demo accounting, and gfxInfo reporting |
| renderer-compatibility-gates-selftest | Phase 15 fallback-gate coverage for missing UBO, broken MRT, missing timer query, missing buffer storage, rejected debug-context fallback, and synthetic driver-quirk downgrades |
| renderer-default-promotion-selftest | Phase 8 evidence-gated default-promotion coverage for r_glTier auto, explicit r_renderer arb2 escape behavior, compatibility gates, modern-executor readiness, ARB2 rollback availability, missing/incomplete/complete r_rendererPromotionEvidence, and r_rendererModernAutoPromote sign-off control |
| renderer-default-safety-selftest | Phase 13 conservative-default coverage for ARB2 default visibility, r_renderer best or explicit r_renderer arb2, r_glTier auto, rollback availability, and default-off modern executor, visible, diagnostic, GPU-validation, bindless, shader-reload, and auto-promotion cvars |
| renderer-benchmark-selftest | Phase 16 benchmark coverage for rolling P50/P95/P99 frame-time capture, CPU front-end/visibility/packet/graph/submit/present timings, GPU pass timing fields, upload/draw/light/cluster/fallback counters, benchmark presets, and performance-threshold reporting |
| renderer-gpu-driven-selftest | forced r_glTier gl43 coverage for GL 4.3 SSBO submit records, compute scissor culling, clustered-bin validation, compacted indirect command generation, CPU/GPU readback comparison, masked multi-draw indirect execution, GPU timer coverage, and gfxInfo reporting |
| renderer-low-overhead-selftest | forced r_glTier gl45 coverage for GL 4.5 DSA graph texture/FBO allocation, DSA sampler creation, named buffer/FBO updates, UBO/SSBO/texture/sampler multi-bind batches, submit-batch compaction, bindless experiment reporting, persistent upload defaults, fence diagnostics, and gfxInfo reporting |
| tier-auto | default compatibility-preserving startup and gfxInfo |
| tier-legacy | forced legacy compatibility startup and gfxInfo |
| tier-gl33 | forced GL 3.3 startup and gfxInfo |
| tier-gl41 | forced GL 4.1 startup and gfxInfo |
| tier-gl43 | forced GL 4.3 GPU-driven tier startup and gfxInfo |
| tier-gl45 | forced GL 4.5 low-overhead tier startup and gfxInfo |
| tier-gl46 | forced GL 4.6 top tier startup and gfxInfo |
| tier-gl33-debug-context | debug-context request with non-debug fallback available |
| present-vsync0-fps0 | unlocked presentation startup probe |
| present-vsync1-fps240 | high-refresh capped presentation startup probe |
| present-vsync1-fps120 | 120 FPS capped presentation startup probe |

The forced tier cases pass when startup succeeds and the selected tier is reported. If a machine cannot support the forced tier, the log must show the selected fallback tier, and the Renderer tier contract: line must report degraded=1, failClosed=1, and a concise missing= reason.
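The fallback check described above can be sketched as a small log-line parser. This is an illustration only: the exact layout of the Renderer tier contract: line is not specified here, so the sketch assumes whitespace-separated key=value fields and a missing= reason without embedded spaces. The function names are hypothetical.

```python
import re

CONTRACT_PREFIX = "Renderer tier contract:"


def parse_tier_contract(line):
    """Return the key=value fields of a tier contract line, or None if absent."""
    if CONTRACT_PREFIX not in line:
        return None
    payload = line.split(CONTRACT_PREFIX, 1)[1]
    # Assumption: fields are whitespace-separated key=value tokens.
    return dict(re.findall(r"(\w+)=(\S*)", payload))


def forced_tier_fallback_ok(line):
    """True when a downgraded forced tier reports the required contract fields."""
    fields = parse_tier_contract(line)
    if fields is None:
        return False
    return (
        fields.get("degraded") == "1"
        and fields.get("failClosed") == "1"
        and bool(fields.get("missing"))  # the reason must be non-empty
    )
```

A line that omits the missing= reason, or leaves it empty, is treated as a failure because the contract requires a concise reason.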

Automated safe cases also fail if their logs contain renderer warning signatures such as idStr::snPrintf overflow, WARNING: idStr, shader compile/program link failures, or OpenGL error markers. The generated Markdown/JSON report records per-case warning-signature counts so the Phase 8 warnings=0 promotion token cannot be inferred from expected-line checks alone.
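A per-case warning-signature count of the kind recorded in the report could look like the sketch below. Only the two idStr strings come from this document; the shader, program-link, and OpenGL patterns are hypothetical stand-ins for whatever markers the real runner matches, and the function names are illustrative.

```python
import re
from collections import Counter

SIGNATURES = {
    # Named in this document (with or without the colon after snPrintf).
    "idstr_overflow": re.compile(r"idStr::snPrintf:? overflow"),
    "idstr_warning": re.compile(r"WARNING: idStr"),
    # Hypothetical patterns: the real log markers are not given here.
    "shader_compile": re.compile(r"shader compile (failed|failure)", re.IGNORECASE),
    "program_link": re.compile(r"program link (failed|failure)", re.IGNORECASE),
    "gl_error": re.compile(r"GL_(INVALID_\w+|OUT_OF_MEMORY)"),
}


def count_warning_signatures(log_text):
    """Count log lines matching each signature; a line may match several."""
    counts = Counter({name: 0 for name in SIGNATURES})
    for line in log_text.splitlines():
        for name, pattern in SIGNATURES.items():
            if pattern.search(line):
                counts[name] += 1
    return dict(counts)


def case_is_clean(log_text):
    """A case is clean only when every signature count is zero."""
    return sum(count_warning_signatures(log_text).values()) == 0
```

Keeping the counts per signature, rather than a single boolean, is what lets a report show why warnings=0 does or does not hold for a given case.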

The visible-depth, G-buffer, clustered-light, deferred-resolve, forward+, modern-visible, modern-compatibility, compatibility-gates, default-promotion, default-safety, benchmark, GPU-driven, and low-overhead self-tests intentionally run as their own safe cases instead of being appended to the foundation self-test startup command, because the engine command parser has a fixed startup command list budget.

Gameplay benchmark acceptance should use wall-clock sampling for FPS claims. The --sample-msec option emits waitMsec into the generated cfg so the measurement window is a real duration rather than a frame count:

python tools\tests\renderer_gameplay_benchmark.py --profile smoke --maxfps 0 --swap-intervals 0 --display-modes fullscreen --autoexec-delay-ms 2000 --settle-frames 1 --sample-msec 3000 --pacing-only --min-pacing-hz 120 --max-p95-ms 12 --max-p99-ms 20
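To make the thresholds in this command concrete, the sketch below evaluates a list of frame times against --min-pacing-hz, --max-p95-ms, and --max-p99-ms style limits. It is not the harness implementation: it assumes nearest-rank percentiles and derives the pacing rate from the mean frame time, and the real tool may compute both differently.

```python
import math


def percentile(samples, pct):
    """Nearest-rank percentile of a non-empty list (pct is an integer 1..100)."""
    ordered = sorted(samples)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


def pacing_gate(frame_ms, min_pacing_hz=120.0, max_p95_ms=12.0, max_p99_ms=20.0):
    """Summarise frame times (milliseconds) and apply the three thresholds."""
    mean_ms = sum(frame_ms) / len(frame_ms)
    result = {
        # Assumption: pacing rate is the reciprocal of the mean frame time.
        "pacing_hz": 1000.0 / mean_ms,
        "p95_ms": percentile(frame_ms, 95),
        "p99_ms": percentile(frame_ms, 99),
    }
    result["passed"] = (
        result["pacing_hz"] >= min_pacing_hz
        and result["p95_ms"] <= max_p95_ms
        and result["p99_ms"] <= max_p99_ms
    )
    return result
```

Because the sample window is a wall-clock duration, the number of frame times collected varies with frame rate, which is why the sketch works on whatever list length it is given.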

Compatibility Gates

rendererCompatibilityGatesSelfTest is the Phase 15 fallback-gate test. It does not need a map load; it simulates the driver/capability cases that must never promote the wrong visible path:

| Gate | Expected behavior |
| --- | --- |
| missing UBO | GL 3.3+ modern baseline is rejected and startup falls back to the legacy compatibility tier when fixed-function compatibility exists |
| broken MRT | G-buffer/deferred ownership is blocked and the tier selector falls back below modern visible ownership |
| missing timer query | renderer GPU timers report unavailable without downgrading an otherwise valid modern tier |
| missing buffer storage | GL 4.5/4.6 low-overhead tier downgrades to the GL 4.3 GPU-driven tier while retaining SSBO/compute coverage |
| rejected debug context | the shared context ladder proves a non-debug fallback candidate exists after debug candidates |
| driver quirk table | known-bad or synthetic driver matches can mask unsafe features before tier selection so gfxInfo and renderer bootstrap agree |

gfxInfo prints both Renderer driver quirks: and Renderer compatibility gates:. The quirk line records matched rules and cap changes; the gate line records selected tier, UBO/MRT/timer/buffer-storage readiness, low-overhead readiness, debug fallback, and forced-tier support.
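The capability-driven gates can be modelled as a small decision function. The sketch below is an illustrative model of four of the gates as described in this section, not the engine's tier selector: the `Caps` fields and result keys are hypothetical names, the debug-context and driver-quirk gates are omitted, and the broken-MRT case is represented only as a flag because the exact tier it falls back to is not stated here.

```python
from dataclasses import dataclass


@dataclass
class Caps:
    """Hypothetical capability snapshot; field names are illustrative."""
    requested_tier: str = "gl46"
    has_ubo: bool = True
    mrt_ok: bool = True
    has_timer_query: bool = True
    has_buffer_storage: bool = True


def apply_gates(caps: Caps) -> dict:
    tier = caps.requested_tier
    # Missing buffer storage: GL 4.5/4.6 low-overhead tiers drop to GL 4.3.
    if tier in ("gl45", "gl46") and not caps.has_buffer_storage:
        tier = "gl43"
    # Missing UBO: the modern baseline is rejected in favour of the legacy tier.
    if tier != "legacy" and not caps.has_ubo:
        tier = "legacy"
    return {
        "tier": tier,
        # Broken MRT blocks modern visible ownership (modelled as a flag only).
        "modern_visible_allowed": tier != "legacy" and caps.mrt_ok,
        # Missing timer query disables GPU timers without changing the tier.
        "gpu_timers": caps.has_timer_query,
    }
```

The point of the model is that only some missing features change the tier; a missing timer query, for instance, leaves an otherwise valid modern tier untouched.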

Default Promotion Criteria

r_rendererModernAutoPromote is the sign-off switch for making the guarded modern visible path the automatic choice under r_glTier auto. Its default is 0, and the engine also requires r_rendererPromotionEvidence to contain the complete Phase 8 evidence token, so ARB2 remains the default visible renderer until the evidence below is complete. gfxInfo prints Renderer default promotion: with the current reason, evidence coverage, and missing evidence fields, and Renderer default safety: with the current conservative-default audit. rendererDefaultPromotionSelfTest verifies the promotion gate logic without loading a map, while rendererDefaultSafetySelfTest verifies the clean-startup default contract.

| Criterion | Required evidence |
| --- | --- |
| tier | r_glTier auto selects a modern GL 3.3+ tier after driver quirks and compatibility gates are applied |
| renderer escape | r_renderer best leaves promotion available; explicit r_renderer arb2 keeps the ARB2 bridge |
| compatibility gates | modern baseline features, UBOs, MRT, scene packets, render graph, and Shader Library V2 readiness are available |
| fallback escape | the ARB2 compatibility bridge remains selectable through r_renderer arb2 and r_glTier legacy |
| conservative defaults | r_renderer best or explicit r_renderer arb2 keeps ARB2 visible; r_rendererModernAutoPromote, modern executor/submit/visible/pass/debug paths, GPU validation, bindless, and shader reload all remain off in a clean startup |
| validation evidence | r_rendererPromotionEvidence carries the complete Phase 8 token after zero-warning visual, gameplay, RenderDoc, performance, presentation, rollback, and debug-off checks pass |
| manual sign-off | r_rendererModernAutoPromote 1 is used only together with a complete r_rendererPromotionEvidence token |

Required promotion token:

r_rendererPromotionEvidence "phase8=complete;warnings=0;visual=pass;gameplay=pass;renderdoc=pass;perf=arb2-or-better;presentation=pass;rollback=pass;debug=off"
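The token is a semicolon-separated list of key=value fields, so its completeness can be checked mechanically. The sketch below shows one way to do that outside the engine. How the engine itself parses the cvar is not described here, and the function names are hypothetical. It covers only the evidence and sign-off criteria; the tier, renderer-escape, and compatibility-gate criteria are separate.

```python
# Required fields, taken from the promotion token in this document.
REQUIRED_EVIDENCE = {
    "phase8": "complete",
    "warnings": "0",
    "visual": "pass",
    "gameplay": "pass",
    "renderdoc": "pass",
    "perf": "arb2-or-better",
    "presentation": "pass",
    "rollback": "pass",
    "debug": "off",
}


def parse_evidence(token):
    """Split a promotion token into a dict of key -> value."""
    fields = {}
    for part in token.split(";"):
        if "=" in part:
            key, value = part.split("=", 1)
            fields[key.strip()] = value.strip()
    return fields


def missing_evidence(token):
    """Return the required keys that are absent or carry the wrong value."""
    fields = parse_evidence(token)
    return [key for key, want in REQUIRED_EVIDENCE.items() if fields.get(key) != want]


def may_auto_promote(auto_promote, token):
    """Sign-off needs both the switch set to 1 and a complete token."""
    return auto_promote == 1 and not missing_evidence(token)
```

Reporting the missing keys, rather than a bare pass/fail, mirrors the missing evidence fields that gfxInfo prints on the Renderer default promotion: line.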

Deterministic Capture Matrix

These image captures are the comparison set for scenes where deterministic output is practical. Capture paths should live under .tmp/renderer-captures/<date>/, and any checked-in references must be approved separately so the repo does not accumulate accidental binary churn.

| Case | Mode | Scene | Purpose |
| --- | --- | --- | --- |
| capture-startup-mainmenu | SP | main menu after logo skip | deterministic GUI composition, font/material atlas, and widescreen expansion |
| capture-renderer-visible-selftest | safe startup | rendererModernVisibleSelfTest | synthetic modern-visible depth/G-buffer/deferred/forward+/hybrid-scene/present composition with shadow-policy handoff |
| capture-renderer-compatibility-selftest | safe startup | rendererModernCompatibilitySelfTest | known fallback inventory for GUI/post/subview/render-demo/BSE categories |
| capture-sp-airdefense1-static | SP | game/airdefense1 fixed spawn, no input for 3 seconds | outdoor lighting, terrain decals, BSE smoke, and stock material parity |

RenderDoc Tier Checklist

Capture with r_rendererMetrics 2, r_rendererGpuTimers 1 when available, and the matching forced tier. Every capture should show named debug scopes and object labels for graph resources, modern executor buffers/programs, and pass-owned FBOs.

| Forced tier | Capture focus |
| --- | --- |
| r_glTier gl33 | VAO/VBO/UBO baseline, graph resources, visible-depth/G-buffer/forward+ passes |
| r_glTier gl41 | macOS-class GLSL path and GL 4.1 context fallback behavior |
| r_glTier gl43 | SSBO scene records, compute validation dispatch, indirect-command generation |
| r_glTier gl45 | DSA texture/FBO updates, persistent upload defaults, and multi-bind groups |
| r_glTier gl46 | top-tier selection plus GL SPIR-V/bindless availability reporting without default use |

Long-Run Matrix

These are manual long-run sign-off loops. They are intentionally outside the safe runner until map startup is reliable enough to automate.

| Case | Mode | Purpose |
| --- | --- | --- |
| longrun-vid-restart-10x | SP | repeat vid_restart ten times under r_glTier auto, gl33, and the highest supported forced tier; inspect logs after each cycle |
| longrun-map-transition-sp | SP | transition between game/airdefense1, game/storage2, and game/medlabs without restarting the process |
| longrun-mp-listen-reconnect | MP | mp/q4dm1 listen server with local client connect, disconnect, reconnect, then map restart |

Performance Regression Thresholds

rendererBenchmarkCapture prints a rolling benchmark line when r_rendererMetrics is enabled. The safe matrix records the threshold table in its Markdown and JSON reports so hardware-specific performance triage can compare the same budget shape across runs. Local threshold cvars override the preset defaults for target-machine experiments.

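| Preset | P95 target | P99 target | Screen | Cluster grid | Material batch | Light batch | Shadow budget | Post budget |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| low | 33 ms | 50 ms | 75% | 4x3x8 | 32 | 16 | 512 px / every 2 frames | 0 |
| baseline | 20 ms | 28 ms | 100% | 6x4x12 | 64 | 32 | 1024 px / every frame | 1 |
| modern | 16 ms | 24 ms | 100% | 8x6x16 | 96 | 64 | 1024 px / every frame | 2 |
| high-end | 12 ms | 18 ms | 100% | 8x6x16 | 128 | 96 | 2048 px / every frame | 3 |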
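The P95/P99 targets in the table above translate directly into a pass/fail check. The following is a minimal sketch for triage scripts, not the engine's threshold reporting: `check_preset` is a hypothetical helper, and it assumes a measured value equal to the target still passes.

```python
# (P95 target ms, P99 target ms) per preset, from the threshold table.
PRESET_TARGETS_MS = {
    "low": (33.0, 50.0),
    "baseline": (20.0, 28.0),
    "modern": (16.0, 24.0),
    "high-end": (12.0, 18.0),
}


def check_preset(preset, p95_ms, p99_ms):
    """Compare measured percentiles with the preset targets."""
    p95_target, p99_target = PRESET_TARGETS_MS[preset]
    p95_ok = p95_ms <= p95_target  # assumption: equality counts as a pass
    p99_ok = p99_ms <= p99_target
    return {"p95_ok": p95_ok, "p99_ok": p99_ok, "passed": p95_ok and p99_ok}
```

For example, a run measuring P95 at 15.2 ms and P99 at 25.0 ms meets the modern P95 target but misses its P99 target, so it fails the modern preset while passing baseline.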

Manual Gameplay Matrix

Gameplay validation remains mandatory before renderer release sign-off, but it is not run by the safe matrix by default because map loads need target-hardware supervision. Use the SP launch task for single-player maps, the MP launch task or tools\debug\start_listen_server_client.ps1 for multiplayer, or the opt-in gameplay benchmark harness below when you want a repeatable logged capture set.

| Case | Mode | Map | Purpose |
| --- | --- | --- | --- |
| sp-storage1 | SP | game/storage1 | primary high-FPS renderer acceptance scene, dense indoor lighting, and early-game storage visual parity |
| sp-airdefense1 | SP | game/airdefense1 | stock SP baseline, outdoor lighting, BSE smoke |
| sp-airdefense2 | SP | game/airdefense2 | flashlight, projected shadows, animated characters |
| sp-storage2 | SP | game/storage2 | indoor materials and post-process coverage |
| sp-bse-heavy | SP | game/medlabs | stress BSE effects without replacement content |
| sp-cinematic-subview | SP | game/mcc_landing | subviews, remote cameras, cinematic and GUI interaction |
| mp-q4dm1-listen | MP | mp/q4dm1 | listen-server and local-client MP parity |

For each gameplay case, validate the matrix variants that the hardware supports:

| Dimension | Values |
| --- | --- |
| r_glTier | auto, legacy, gl33, gl41, gl43, gl45, gl46 |
| renderer escape | r_renderer best, r_renderer arb2, r_glTier legacy |
| r_swapInterval | 0, 1 |
| com_maxfps | 120, 240, 0 |
| display mode | windowed, fullscreen |
| renderer diagnostics | r_rendererMetrics 1, r_rendererMetrics 2, r_rendererModernAutoPromote 0, and one signed r_rendererModernAutoPromote 1 candidate run with the complete r_rendererPromotionEvidence token after the other rows are clean |
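To plan a manual pass, the tier and presentation dimensions above can be expanded into explicit variants. The sketch below enumerates the full cross product of four dimensions, filtered to the tiers a machine supports. Treating the matrix as a full cross product is an assumption (the document only asks for the variants the hardware supports), and the renderer escape and diagnostics rows are left out because they are not simple value lists. `matrix_variants` is a hypothetical helper.

```python
from itertools import product

# Values copied from the dimension table.
R_GL_TIER = ["auto", "legacy", "gl33", "gl41", "gl43", "gl45", "gl46"]
R_SWAP_INTERVAL = [0, 1]
COM_MAXFPS = [120, 240, 0]
DISPLAY_MODE = ["windowed", "fullscreen"]


def matrix_variants(supported_tiers=None):
    """Return one dict per variant, limited to supported tiers when given."""
    tiers = [
        tier for tier in R_GL_TIER
        if supported_tiers is None or tier in supported_tiers
    ]
    return [
        {
            "r_glTier": tier,
            "r_swapInterval": swap,
            "com_maxfps": fps,
            "display_mode": mode,
        }
        for tier, swap, fps, mode in product(
            tiers, R_SWAP_INTERVAL, COM_MAXFPS, DISPLAY_MODE
        )
    ]
```

With all seven tiers this yields 84 variants per gameplay case, which is a useful upper bound when estimating how long a manual sign-off pass will take.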

After each gameplay smoke, inspect the configured log file under fs_savepath\<gameDir>\logs\openq4.log or the case-specific log emitted by the launch tool. Fix errors and warnings, then repeat the loop until the case is clean.

Gameplay Benchmark Harness

tools\tests\renderer_gameplay_benchmark.py is the Phase 12 map-loading runner. It launches the staged client from .install and uses isolated save paths under .tmp\renderer-gameplay\. It enters SP maps or an MP listen server plus loopback client, waits for streaming, and runs a fixed static spawn camera path unless a case is later extended with authored poses. It then captures screenshots, emits rendererBenchmarkCapture, framePacingSnapshot, and gfxInfo, and writes Markdown/JSON reports.

The runner uses the SP/MP g_autoExecAfterMapLoad hook to execute its generated cfg after the map is active, not during loading UI. Renderer metrics are enabled only inside the gameplay capture window, which keeps load-screen logs quiet while still producing benchmark samples, GPU timing where available, frame-pacing output, and a screenshot artifact.

Use --pacing-only for high-FPS acceptance after a diagnostic metrics pass is already clean. This keeps r_rendererMetrics, GL timer queries, and the FPS overlay out of the timed window, still emits framePacingSnapshot, and can fail the run with parsed thresholds such as --min-pacing-hz 120 --max-p95-ms 12. The game/storage1 acceptance run should start sampling two seconds after the active map draw with r_swapInterval 0 and com_maxfps 0 so the result measures renderer throughput rather than the old low-FPS plan cap.

Common runs:

python tools\tests\renderer_gameplay_benchmark.py --list
python tools\tests\renderer_gameplay_benchmark.py --profile smoke
python tools\tests\renderer_gameplay_benchmark.py --profile smoke --pacing-only --autoexec-delay-ms 2000 --min-pacing-hz 120 --max-p95-ms 12
python tools\tests\renderer_gameplay_benchmark.py --profile required
python tools\tests\renderer_gameplay_benchmark.py --profile tiers
python tools\tests\renderer_gameplay_benchmark.py --profile presentation
python tools\tests\renderer_gameplay_benchmark.py --profile shadows

The runner fails a case when the process times out, no gameplay screenshot is produced, the benchmark/gfxInfo lines are missing, image comparison fails when references are required, or renderer warning markers such as idStr::snPrintf: overflow, WARNING: idStr, shader compile failures, program link failures, or fatal OpenGL startup failures appear in the log.

| Profile | Coverage |
| --- | --- |
| smoke | bounded game/storage1 SP gameplay smoke with screenshot, metrics, frame-pacing snapshot, and zero-warning log gates |
| required | game/storage1, game/airdefense1, game/airdefense2, game/storage2, game/medlabs, game/mcc_landing, and mp/q4dm1 listen server plus local client |
| tiers | forced r_glTier auto, legacy, gl33, gl41, gl43, gl45, and gl46 gameplay probes |
| presentation | r_swapInterval 0/1, com_maxfps 0/120/240, windowed, and fullscreen coverage for uncapped/high-refresh validation |
| shadows | stencil fallback, mapped shadows, CSM, translucent moments, and debug-overlay modes 1..6 over the shadow correctness scenes |

Optional deterministic image comparison uses TGA references:

python tools\tests\renderer_gameplay_benchmark.py --profile smoke --reference-dir .tmp\renderer-references --require-references

Nondeterministic BSE, cinematic, and MP scenes need human review in addition to the automated log/screenshot gates:

| Case | Focus | Checks |
| --- | --- | --- |
| sp-bse-heavy | BSE-heavy effects in game/medlabs | effect sprites/trails animate at the expected cadence, no black quads, no missing additive passes, no warning spam |
| sp-cinematic-subview | cinematic/subview flow in game/mcc_landing | remote-camera/subview content is visible, GUI overlays composite in the right order, cinematic handoff keeps frame pacing stable |
| mp-q4dm1-listen | local MP listen server plus loopback client | client reaches the map, player/world lighting matches host expectations, frame pacing remains uncapped when requested |

Shadow Correctness Matrix

| Case | Mode | Map | Purpose |
| --- | --- | --- | --- |
| shadow-projected-airdefense2 | SP | game/airdefense2 | angled projected-light caster/receiver validation |
| shadow-point-storage2 | SP | game/storage2 | point-light face coverage and local-light receiver validation |
| shadow-csm-airdefense1 | SP | game/airdefense1 | CSM camera sweep readiness and outdoor directional coverage |
| shadow-cutout-storage2 | SP | game/storage2 | hashed-alpha cutout fence/grate caster validation at distance |
| shadow-character-airdefense2 | SP | game/airdefense2 | dynamic character shadow caster and receiver validation |
| shadow-translucent-medlabs | SP | game/medlabs | optional translucent moment caster coverage where the selected tier supports it |

Acceptance

  • Automated safe matrix passes after build and install.
  • Manual gameplay matrix reaches in-game/map gameplay for every required SP/MP case on supported hardware.
  • Logs are inspected after every run.
  • No stock-asset compatibility overrides are added as a validation shortcut.
  • RenderDoc validation remains limited to forced modern/core bring-up paths until the visible renderer no longer depends on ARB2 compatibility features.
  • Benchmark captures report P50/P95/P99 frame pacing, active preset budgets, and threshold pass/fail status before any claim that the modern visible path matches or beats ARB2 on target scenes.
  • rendererDefaultSafetySelfTest and rendererDefaultPromotionSelfTest pass before any default-promotion discussion.
  • r_rendererModernAutoPromote 1 is used only with the complete r_rendererPromotionEvidence token after the default-promotion criteria pass; r_renderer arb2, r_glTier legacy, and the modern-disable cvar set remain documented rollback paths.