Usage of VERSA in the Community
September 10, 2025
| Category | Title / Project | Venue / Year | How VERSA Was Used | Reference |
|---|---|---|---|---|
| Toolkits | ESPnet (TTS recipes) | Toolkit docs | “TTS eval using VERSA” section in official docs | ESPnet TTS Docs |
| | ESPnet-Codec | SLT 2024 | Released together with VERSA for codec evaluation | Paper PDF · GitHub |
| | ESPnet-SpeechLM | NAACL 2025 | Tables report results from VERSA evaluations | Report PDF |
| | ESPnet-SDS: Unified toolkit and demo for spoken dialogue systems | NAACL 2025 | Using VERSA for speech synthesis evaluation | Paper PDF |
| Papers | TITW: Text-To-Speech in the Wild | Interspeech 2025 | Used VERSA for MCD, UTMOS, DNSMOS, WER | Paper PDF |
| | TTSDS2: Benchmark for Human-Quality TTS | SSW 2025 | “We use the VERSA toolkit for all compared objective metrics” | Paper PDF |
| | Scaling Zero-Shot TTS with Speaker-Agnostic Training | OpenReview 2025 | “We use VERSA … to calculate each metric” | OpenReview |
| | Multi-stage Speech-Prompted Singing Voice Conversion | Interspeech 2025 | Objective evals via VERSA (CER, speaker sim., etc.) | Paper PDF |
| | Uni-VERSA: Versatile Speech Assessment with a Unified Network | Interspeech 2025 | Using VERSA to create data | Paper PDF |
| | Preference alignment improves language model-based TTS | ICASSP 2025 | Using VERSA to construct preference pairs | Paper PDF |
| | Discrete Audio Tokens: More Than a Survey! | JMLR (2025) | Using VERSA for codec reconstruction evaluation | Paper PDF |
| | Chain-of-Thought Training for Open E2E Spoken Dialogue Systems | Interspeech 2025 | Using VERSA for speech synthesis evaluation | Paper PDF |
| | MIKU-PAL: An Automated and Standardized Multi-Modal Method for Speech Paralinguistic and Affect Labeling | Interspeech 2025 | Using VERSA for expressive speech synthesis evaluation | Paper PDF |
| | Scalable Spontaneous Speech Dataset (SSSD): Crowdsourcing Data Collection to Promote Dialogue Research | Interspeech 2025 | Using VERSA to analyze data quality and statistics | Paper PDF |
| | OpusLM: A Family of Open Unified Speech Language Models | Interspeech 2025 | Using VERSA for speech synthesis evaluation | Paper PDF |
| Benchmarks / Challenges | Uni-VERSA Benchmark | Interspeech 2025 | Annotated dataset using VERSA (default config) | Paper PDF |
| | URGENT Challenge 2026 | ICASSP 2026 | Track 2 rules: evaluation derived from VERSA | URGENT Rules |
| | CHiME-9 Challenge | CHiME 2025 | Task 2 rules reference a VERSA-style evaluation pipeline | CHiME-9 Rules |
| | LRAC Challenge | ICASSP 2026 | Rules build on VERSA metrics for low-resource audio codec evaluation | LRAC Challenge |
| | SVCC Challenge | 2025/26 | Expected adoption of VERSA for objective eval (not yet officially announced) | SVCC Site |