HPMDubbing_Vocoder

April 3, 2023

This repository contains the vocoder for our model (HPMDubbing). It converts the mel-spectrograms generated by the model into time-domain waveforms.

Pretrained Models

We provide pretrained models. Generator checkpoints (e.g., g_05000000) can be downloaded from the folders listed below.

Folder Name	Sampling Rate	Hop Length	Segment Size	Win Length	Params.	Dataset	Fine-Tuned
HPM_Chem	16000 Hz	160	8000	640	55M	LibriTTS	No
HPM_V2C	22050 Hz	220	9900	880	58M	LibriTTS	No
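
To make the table easier to read, the sketch below converts each row into frame timings. It uses only the numbers listed above; the class and property names are illustrative and are not taken from this repository's code or config files.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VocoderSetting:
    """One row of the table above (the class name is illustrative only)."""
    folder: str
    sampling_rate: int  # Hz
    hop_length: int     # samples between consecutive mel frames
    segment_size: int   # samples per training segment
    win_length: int     # samples per analysis window

    @property
    def hop_ms(self) -> float:
        # Time step between consecutive mel frames, in milliseconds.
        return 1000.0 * self.hop_length / self.sampling_rate

    @property
    def win_ms(self) -> float:
        # Duration of one analysis window, in milliseconds.
        return 1000.0 * self.win_length / self.sampling_rate

    @property
    def frames_per_segment(self) -> int:
        # Number of mel frames covered by one training segment.
        return self.segment_size // self.hop_length


SETTINGS = [
    VocoderSetting("HPM_Chem", 16000, 160, 8000, 640),
    VocoderSetting("HPM_V2C", 22050, 220, 9900, 880),
]

for s in SETTINGS:
    print(f"{s.folder}: hop {s.hop_ms:.2f} ms, window {s.win_ms:.2f} ms, "
          f"{s.frames_per_segment} frames per segment")
```

Both settings use a hop of roughly 10 ms and a window of four hops (roughly 40 ms), and in both cases the segment size is an exact multiple of the hop length.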

Training

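Run one of the following commands, depending on the target configuration. The first trains the 22050 Hz V2C vocoder and the second trains the 16 kHz Chem vocoder.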

python train_V2C_HiFiGAN.py --config config_V2C_22050Hz.json

python train_hifigan_16KHz.py --config config_Chem_16KHz.json

Inference

inference.py: wav -> mel -> wav (extracts the mel-spectrogram from an input waveform, then resynthesizes the waveform from it)

python inference.py --checkpoint_file [Your path of checkpoint_file]

inference_e2e.py: mel -> wav (synthesizes a waveform directly from a mel-spectrogram, such as one generated by HPMDubbing)

python inference_e2e.py --checkpoint_file [Your path of checkpoint_file]
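
As a quick sanity check on mel -> wav output, the sketch below estimates how long the generated waveform should be. It assumes the generator upsamples each mel frame by exactly the hop length, as in standard HiFi-GAN; this has not been verified against this repository's generator, and the function names are hypothetical.

```python
def expected_waveform_length(num_mel_frames: int, hop_length: int) -> int:
    """Number of output samples, assuming one hop of audio per mel frame."""
    if num_mel_frames < 0 or hop_length <= 0:
        raise ValueError("num_mel_frames must be >= 0 and hop_length must be > 0")
    return num_mel_frames * hop_length


def duration_seconds(num_samples: int, sampling_rate: int) -> float:
    """Duration of a waveform with num_samples samples at sampling_rate Hz."""
    if sampling_rate <= 0:
        raise ValueError("sampling_rate must be > 0")
    return num_samples / sampling_rate


# Example: a 500-frame mel-spectrogram under each setting from the table.
for folder, sampling_rate, hop_length in [("HPM_Chem", 16000, 160),
                                          ("HPM_V2C", 22050, 220)]:
    samples = expected_waveform_length(500, hop_length)
    seconds = duration_seconds(samples, sampling_rate)
    print(f"{folder}: {samples} samples, {seconds:.3f} s")
```

If the saved waveform is much shorter or longer than this estimate, the checkpoint and the mel-spectrogram were probably produced with different sampling rate or hop length settings.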

tensorboard