IndexTTS Evaluation Results

December 26, 2025

Model: IndexTTS2
Evaluation Date: 2025/12/08
Paper/Repo: IndexTeam/IndexTTS-2

Metrics Legend:

  • WER⬇️: Word Error Rate (lower is better)
  • CER⬇️: Character Error Rate (lower is better)
  • SIM⬆️: Speaker Similarity (higher is better)
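As a rough illustration of the two error-rate metrics in the legend, the sketch below computes WER and CER from a reference text and a transcript using plain edit distance. It is not the benchmark's scoring code: the function names are hypothetical, and it assumes whitespace tokenization for WER, space-stripped characters for CER, no text normalization, and scores reported as percentages.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (substitutions, insertions, deletions)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def wer(reference, hypothesis):
    """Word Error Rate in percent (assumes whitespace-separated words)."""
    ref_words = reference.split()
    if not ref_words:
        raise ValueError("reference must contain at least one word")
    return 100.0 * edit_distance(ref_words, hypothesis.split()) / len(ref_words)


def cer(reference, hypothesis):
    """Character Error Rate in percent (assumes spaces are ignored)."""
    ref_chars = list(reference.replace(" ", ""))
    if not ref_chars:
        raise ValueError("reference must contain at least one character")
    hyp_chars = list(hypothesis.replace(" ", ""))
    return 100.0 * edit_distance(ref_chars, hyp_chars) / len(ref_chars)
```

For example, `wer("the cat sat on the mat", "the cat sat on a mat")` counts one substitution over six reference words, and `cer("你好世界", "你好世间")` counts one substitution over four reference characters.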

Seed-TTS-Eval Benchmark

IndexTTS2

| task | dataset | WER/CER⬇️ | SIM⬆️ | eval_cli | note |
| --- | --- | --- | --- | --- | --- |
| tts | seed_tts_eval_en | 2.08(1.52) | 70.40 | [3] | |
| tts | seed_tts_eval_zh | 1.04(1.01) | 76.04 | [4] | |

CV3 Benchmark (Zero-Shot)

IndexTTS2

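| task | dataset | WER/CER⬇️ | SIM⬆️ | eval_cli | note |
| --- | --- | --- | --- | --- | --- |
| tts | cv3_zero_shot_en | 4.32 | 73.60 | [9] | |
| tts | cv3_zero_shot_zh | 3.63 | 78.00 | [10] | |
| tts | cv3_zero_shot_hard_en | 8.59 | 74.01 | [11] | |
| tts | cv3_zero_shot_hard_zh | 8.71 | 77.21 | [12] | |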

Evaluation Commands

IndexTTS2

[3] python audio_evals/main.py --dataset seed_tts_eval_en --model indextts2

[4] python audio_evals/main.py --dataset seed_tts_eval_zh --model indextts2

[9] python audio_evals/main.py --dataset cv3_zero_shot_en --model indextts2

[10] python audio_evals/main.py --dataset cv3_zero_shot_zh --model indextts2

[11] python audio_evals/main.py --dataset cv3_zero_shot_hard_en --model indextts2

[12] python audio_evals/main.py --dataset cv3_zero_shot_hard_zh --model indextts2
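The six commands above differ only in the `--dataset` value, so they can be generated from a list. The following is a minimal sketch of such a driver, assuming it is run from the directory that contains `audio_evals/main.py`; the helper names `build_command` and `run_all` are hypothetical and not part of the evaluation toolkit. By default it only prints the commands; pass `dry_run=False` to actually launch them.

```python
import subprocess

# Dataset names taken from the commands listed above.
DATASETS = [
    "seed_tts_eval_en",
    "seed_tts_eval_zh",
    "cv3_zero_shot_en",
    "cv3_zero_shot_zh",
    "cv3_zero_shot_hard_en",
    "cv3_zero_shot_hard_zh",
]


def build_command(dataset, model="indextts2"):
    """Return the argv list for one evaluation run (hypothetical helper)."""
    return ["python", "audio_evals/main.py", "--dataset", dataset, "--model", model]


def run_all(dry_run=True):
    """Print every command; execute them sequentially only when dry_run is False."""
    commands = [build_command(name) for name in DATASETS]
    for cmd in commands:
        print(" ".join(cmd))
        if not dry_run:
            # check=True stops at the first failing evaluation.
            subprocess.run(cmd, check=True)
    return commands
```

Calling `run_all()` prints the six commands in the same order as the list above without executing anything.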