DiffSinger Dataset Tools

April 26, 2026

DiffSinger dataset processing tools for singing voice synthesis data preparation, including audio slicing, labeling, forced alignment, and audio-to-MIDI transcription.

Applications

| Application | Description |
| --- | --- |
| MinLabel | Audio labeling tool with G2P conversion (Mandarin/Cantonese/Japanese) |
| SlurCutter | DiffSinger sentence/MIDI editor with piano roll F0 visualization |
| AudioSlicer | RMS-based automatic audio slicing with Audacity CSV marker support |
| LyricFA | Lyric forced alignment using FunASR Paraformer (Chinese) |
| HubertFA | HuBERT phoneme forced alignment with Praat TextGrid output |
| GameInfer | GAME audio-to-MIDI transcription (4-model ONNX pipeline) |

Supported Platforms

  • Microsoft Windows (10 ~ 11): primary platform, with DirectML GPU acceleration
  • Apple macOS (11+)
  • Linux (tested on Ubuntu)

Models

AsrModel

Used by LyricFA. Supports Chinese only; a Japanese and English version is available as a beta.

SomeModel

FblModel

FoxBreatheLabeler currently annotates breaths only in TextGrid files produced by SOFA: it overlays new "AP" annotations on intervals already marked as "SP".

GAME Model

Required for GameInfer. Place the model directory (containing config.json, encoder.onnx, segmenter.onnx, bd2dur.onnx, dur2bd.onnx, estimator.onnx) under <app_dir>/model/.
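
A quick way to confirm that a model directory is complete before launching GameInfer is to check for each of the files listed above. The following is a minimal sketch, not part of the repository; the function name `check_game_model` is hypothetical, and the file list is copied from the sentence above.

```shell
# Report which of the required GAME model files are missing from a directory.
# Prints one "missing: <file>" line per absent file and returns non-zero
# if anything is missing. (Hypothetical helper, not shipped with the tools.)
check_game_model() {
    model_dir="$1"
    missing=0
    for f in config.json encoder.onnx segmenter.onnx \
             bd2dur.onnx dur2bd.onnx estimator.onnx; do
        if [ ! -f "$model_dir/$f" ]; then
            echo "missing: $f"
            missing=1
        fi
    done
    return "$missing"
}
```

Call it with the path of the model directory itself, i.e. the directory you placed under `<app_dir>/model/`.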

Build from Source

Requirements

| Component | Requirement | Details |
| --- | --- | --- |
| Qt | >=6.8.0 | Core, Gui, Widgets, Svg, Network |
| Compiler | >=C++17 | MSVC 2022, GCC, Clang |
| CMake | >=3.17 | >=3.20 is recommended |

Tested with Qt 6.8.3 and Qt 6.9.3. CI builds use Qt 6.9.3.

Setup Environment

Install the Qt libraries before running the steps below.

Windows

cd /D src/libs
cmake -Dep=dml -P ../../scripts/setup-onnxruntime.cmake

cd ../../
rem Set QT_DIR to the directory that contains Qt6Config.cmake
set QT_DIR=<dir>
set Qt6_DIR=%QT_DIR%
set VCPKG_KEEP_ENV_VARS=QT_DIR;Qt6_DIR

git clone https://github.com/microsoft/vcpkg.git
cd /D vcpkg
bootstrap-vcpkg.bat

vcpkg install ^
    --x-manifest-root=../scripts/vcpkg-manifest ^
    --x-install-root=./installed ^
    --triplet=x64-windows

Unix

cd src/libs
cmake -Dep=cpu -P ../../scripts/setup-onnxruntime.cmake

cd ../../
export QT_DIR=<dir> # directory that contains `Qt6Config.cmake`
export Qt6_DIR=$QT_DIR
export VCPKG_KEEP_ENV_VARS="QT_DIR;Qt6_DIR"

git clone https://github.com/microsoft/vcpkg.git
cd vcpkg
./bootstrap-vcpkg.sh

./vcpkg install \
    --x-manifest-root=../scripts/vcpkg-manifest \
    --x-install-root=./installed \
    --triplet=<triplet>

# triplet:
#   Mac:   `x64-osx` or `arm64-osx`
#   Linux: `x64-linux` or `arm64-linux`
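
If you would rather not pick the triplet by hand, it can be derived from `uname`. This is a minimal sketch under the assumption that only the four triplets listed above are needed; the function name `pick_triplet` is hypothetical and not part of the repository or of vcpkg.

```shell
# Map an OS name and CPU name (as printed by `uname -s` and `uname -m`)
# to one of the vcpkg triplets listed above. (Hypothetical helper.)
pick_triplet() {
    os="$1"
    arch="$2"
    case "$arch" in
        x86_64|amd64)  cpu=x64 ;;
        arm64|aarch64) cpu=arm64 ;;
        *) echo "unsupported architecture: $arch" >&2; return 1 ;;
    esac
    case "$os" in
        Darwin) echo "${cpu}-osx" ;;
        Linux)  echo "${cpu}-linux" ;;
        *) echo "unsupported OS: $os" >&2; return 1 ;;
    esac
}

# Detect the triplet for the current machine.
TRIPLET="$(pick_triplet "$(uname -s)" "$(uname -m)")"
echo "Using triplet: $TRIPLET"
```

The resulting value can then be passed to the install step as `--triplet="$TRIPLET"`.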

Build & Install

cmake -B build -G Ninja \
    -DCMAKE_INSTALL_PREFIX=<dir> \
    -DCMAKE_PREFIX_PATH=<dir> \
    -DCMAKE_TOOLCHAIN_FILE=vcpkg/scripts/buildsystems/vcpkg.cmake \
    -DCMAKE_BUILD_TYPE=Release

cmake --build build --target all

cmake --build build --target install

CMake Build Options

| Option | Default | Description |
| --- | --- | --- |
| BUILD_TESTS | ON | Build src/tests/ subdirectory (currently empty placeholder) |
| AUDIO_UTIL_BUILD_TESTS | ON | Build TestAudioUtil |
| GAME_INFER_BUILD_TESTS | ON | Build TestGame |
| SOME_INFER_BUILD_TESTS | ON | Build TestSome |
| RMVPE_INFER_BUILD_TESTS | ON | Build TestRmvpe |
| ONNXRUNTIME_ENABLE_DML | ON (Windows) | Enable DirectML GPU acceleration |
| ONNXRUNTIME_ENABLE_CUDA | OFF | Enable CUDA GPU acceleration |
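
Options are passed at configure time with `-D<option>=<value>`. As an illustration, the sketch below assembles a configure command that switches off every test-related option from the table. It assumes the options are independent of one another, which the table does not state, and it only prints the command so you can review it first.

```shell
# Test-related options from the table above, all switched off.
TEST_FLAGS="-DBUILD_TESTS=OFF \
-DAUDIO_UTIL_BUILD_TESTS=OFF \
-DGAME_INFER_BUILD_TESTS=OFF \
-DSOME_INFER_BUILD_TESTS=OFF \
-DRMVPE_INFER_BUILD_TESTS=OFF"

# Print the configure command for review; drop the leading `echo` to run it.
# $TEST_FLAGS is intentionally unquoted so it splits into separate arguments.
echo cmake -B build -G Ninja \
    -DCMAKE_TOOLCHAIN_FILE=vcpkg/scripts/buildsystems/vcpkg.cmake \
    -DCMAKE_BUILD_TYPE=Release \
    $TEST_FLAGS
```

Add `-DCMAKE_INSTALL_PREFIX` and `-DCMAKE_PREFIX_PATH` as in the Build & Install step when running it for real.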

Build Outputs

| Type | Files |
| --- | --- |
| Applications | MinLabel.exe, SlurCutter.exe, AudioSlicer.exe, LyricFA.exe, HubertFA.exe, GameInfer.exe |
| Test executables | TestGame.exe, TestRmvpe.exe, TestSome.exe, TestAudioUtil.exe |
| Shared libraries | game-infer.dll, rmvpe-infer.dll, some-infer.dll, audio-util.dll |

Libraries

Dependencies

License

This repository is licensed under the Apache 2.0 License.