Apollo Training Data Preprocessing
March 7, 2025 · View on GitHub
This repository contains code for preprocessing training data for the Apollo audio processing model. It includes two scripts for handling different datasets: moisesdb_preprocess.py and musdb_preprocess.py.
Overview
The preprocessing code performs the following steps:
- Loads audio files
- Applies Voice Activity Detection (VAD) to filter out silent segments
- Splits the audio into fixed-length sessions
- Stores the processed data in HDF5 format for efficient training
Usage
MoisesDB Preprocessing
To preprocess the MoisesDB dataset:
python moisesdb_preprocess.py
By default, this will:
- Read data from
moisesdb/moisesdb_v0.1 - Output processed data to
musdb18hq-moises-hdf5 - Sample rate is set to 44.1kHz
- Session length is 6 seconds
You can modify the default paths and parameters in the script as needed.
MUSDB Preprocessing
To preprocess the MUSDB18-HQ dataset:
python musdb_preprocess.py
By default, this will:
- Read training data from
musdb18hq/train - Read test data from
musdb18hq/test - Output processed data to
musdb18hq-moises-hdf5 - Sample rate is set to 44.1kHz
- Session length is 6 seconds
You can modify the default paths and parameters in the script as needed.
Implementation Details
Both scripts implement a VAD (Voice Activity Detection) function that:
- Takes in audio data
- Calculates power thresholds
- Identifies segments where audio is present (not silent)
- Extracts valid audio segments
- Returns segments that have significant audio content
The processed data is stored in an HDF5 format, with each track organized in a separate directory.
License

This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.
Contact
If you have any questions or suggestions, please feel free to contact me at
- Email: tsinghua.kaili@gmail.com
Citation
If you find this code useful, please consider citing the following paper:
@inproceedings{li2025apollo,
title={Apollo: Band-sequence Modeling for High-Quality Music Restoration in Compressed Audio},
author={Li, Kai and Luo, Yi},
booktitle={IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)},
year={2025},
organization={IEEE}
}