ConTSG-Bench Specification (Single Source of Truth)
March 4, 2026
This document is the canonical reference for benchmark scope. Public-facing docs should reference this file instead of duplicating counts.
Last updated: 2026-03-03
Scope Summary
- Benchmark datasets: 10
- Generation models in benchmark suite: 11
- Leaderboard metrics: 15
  - Fidelity: 7
  - Adherence: 4
  - Utility: 4
Benchmark Dataset IDs (10)
- synth-m
- synth-u
- ettm1
- weather_concept
- weather_morphology
- telecomts_segment
- istanbul_traffic
- airquality_beijing
- ptbxl_concept
- ptbxl_morphology
Generation Model IDs (11)
Text-conditioned
- verbalts
- t2s
- bridge
- diffusets
- text2motion
- retrieval
Attribute-conditioned
- timeweaver
- wavestitch
- tedit
Label-conditioned
- timevqvae
- ttscgan
Leaderboard Metric IDs (15)
Fidelity (7)
- acd
- sd
- kd
- mdd
- fid
- prdc_f1.precision
- prdc_f1.recall
Adherence (4)
- jftsd
- joint_prdc_f1.precision
- joint_prdc_f1.recall
- cttp
Utility (4)
- dtw
- crps
- ed
- wape
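The counts in the Scope Summary can be checked mechanically against the ID lists above. The following is a minimal sketch, not part of the benchmark code: the names `DATASETS`, `MODELS`, `METRICS`, `EXPECTED`, `scope_counts`, and `check_scope` are hypothetical, and the IDs are transcribed from this document as read.

```python
# Hypothetical registry mirroring the ID lists in this specification.
DATASETS = [
    "synth-m", "synth-u", "ettm1", "weather_concept", "weather_morphology",
    "telecomts_segment", "istanbul_traffic", "airquality_beijing",
    "ptbxl_concept", "ptbxl_morphology",
]

# Generation models grouped by conditioning type.
MODELS = {
    "text": ["verbalts", "t2s", "bridge", "diffusets", "text2motion", "retrieval"],
    "attribute": ["timeweaver", "wavestitch", "tedit"],
    "label": ["timevqvae", "ttscgan"],
}

# Leaderboard metrics grouped by metric family.
METRICS = {
    "fidelity": ["acd", "sd", "kd", "mdd", "fid",
                 "prdc_f1.precision", "prdc_f1.recall"],
    "adherence": ["jftsd", "joint_prdc_f1.precision",
                  "joint_prdc_f1.recall", "cttp"],
    "utility": ["dtw", "crps", "ed", "wape"],
}

# Counts as stated in the Scope Summary.
EXPECTED = {
    "datasets": 10, "models": 11, "metrics": 15,
    "fidelity": 7, "adherence": 4, "utility": 4,
}


def scope_counts():
    """Count the IDs actually present in the registry."""
    counts = {
        "datasets": len(DATASETS),
        "models": sum(len(ids) for ids in MODELS.values()),
        "metrics": sum(len(ids) for ids in METRICS.values()),
    }
    counts.update({group: len(ids) for group, ids in METRICS.items()})
    return counts


def check_scope():
    """Return {name: (actual, expected)} for every count that disagrees."""
    counts = scope_counts()
    return {
        name: (counts[name], expected)
        for name, expected in EXPECTED.items()
        if counts[name] != expected
    }


if __name__ == "__main__":
    mismatches = check_scope()
    print("scope OK" if not mismatches else f"mismatches: {mismatches}")
```

A check of this kind could run in CI so that public-facing docs never drift from the counts stated here.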
Ranking Policy Notes
- Overall ranking uses the Fidelity and Adherence metric groups (11 of the 15 leaderboard metrics).
- Utility metrics are reported but excluded from overall ranking.
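The ranking policy above amounts to a filter on metric groups. The sketch below shows only that selection step, under assumptions: `METRIC_GROUPS`, `RANKED_GROUPS`, `ranking_metric_ids`, and `split_scores` are hypothetical names, and the rule for aggregating the selected metrics into an overall rank is not stated in this document, so it is deliberately left out.

```python
# Hypothetical grouping of leaderboard metric IDs, transcribed from this spec.
METRIC_GROUPS = {
    "fidelity": ["acd", "sd", "kd", "mdd", "fid",
                 "prdc_f1.precision", "prdc_f1.recall"],
    "adherence": ["jftsd", "joint_prdc_f1.precision",
                  "joint_prdc_f1.recall", "cttp"],
    "utility": ["dtw", "crps", "ed", "wape"],
}

# Groups that count toward overall ranking, per the policy notes.
RANKED_GROUPS = ("fidelity", "adherence")


def ranking_metric_ids():
    """Metric IDs that participate in overall ranking."""
    return [m for group in RANKED_GROUPS for m in METRIC_GROUPS[group]]


def split_scores(scores):
    """Split one model's scores into (ranked, reported_only) dicts.

    `scores` maps metric ID -> value. Utility metrics end up in
    `reported_only`; unknown metric IDs are dropped from both.
    """
    ranked_ids = set(ranking_metric_ids())
    utility_ids = set(METRIC_GROUPS["utility"])
    ranked = {m: v for m, v in scores.items() if m in ranked_ids}
    reported_only = {m: v for m, v in scores.items() if m in utility_ids}
    return ranked, reported_only


if __name__ == "__main__":
    # Illustrative values only, not benchmark results.
    example = {"fid": 1.0, "cttp": 0.5, "dtw": 2.0}
    ranked, reported_only = split_scores(example)
    print("used for ranking:", sorted(ranked))
    print("reported only:", sorted(reported_only))
```

Keeping the ranked groups in one constant means a policy change, such as including Utility, touches a single line.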