Multiband-WaveRNN
PyTorch implementation of the MultiBand-WaveRNN model, combining WaveRNN from Efficient Neural Audio Synthesis with the multi-band generation scheme from DurIAN: Duration Informed Attention Network for Multimodal Synthesis.
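The multi-band scheme is what makes generation fast: the autoregressive loop emits one sample per subband each step, so it runs at a fraction of the full sample rate, and a synthesis filter bank (PQMF in the DurIAN paper) merges the subband streams back into full-band audio. A back-of-the-envelope illustration (the numbers are assumptions, not this repo's configuration):

```python
# Illustrative arithmetic only -- not this repo's code.
sample_rate = 22050   # assumed full-band rate (LJSpeech's native rate)
num_bands = 4         # the DurIAN paper uses 4 subbands

steps_full = sample_rate                     # RNN steps per second, plain WaveRNN
steps_multiband = sample_rate // num_bands   # RNN steps per second, multi-band

print(f"plain WaveRNN: {steps_full} steps/s")
print(f"multi-band:    {steps_multiband} steps/s ({num_bands}x fewer)")
```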
Issues
RAW mode and unbatched generation are supported. Contributions implementing MOL (mixture of logistics) mode are welcome.
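For context, RAW mode means the output layer is a softmax over quantized sample values, sampled autoregressively; MOL mode would instead parameterize a discretized mixture of logistics. A minimal sketch of RAW-mode sampling (illustrative; the bit depth and tensor shapes are assumptions, not this repo's exact code):

```python
import torch
import torch.nn.functional as F

bits = 9                              # RAW mode commonly uses 9-bit audio
logits = torch.randn(1, 2 ** bits)    # stand-in for the model's output layer
probs = F.softmax(logits, dim=-1)     # categorical distribution over classes
sample = torch.multinomial(probs, 1)  # draw one class index in [0, 2**bits)
audio = 2 * sample.float() / (2 ** bits - 1) - 1  # map back to [-1, 1]
```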
Installation
Ensure you have:
- Python >= 3.6
- PyTorch 1.x with CUDA
Then install the rest with pip:
pip install -r requirements.txt
How to Use
Training your own Models
Download the LJSpeech Dataset.
Edit hparams.py, point wav_path to your dataset and run:
python preprocess.py
or use preprocess.py --path to point directly to the dataset.
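For reference, here is a sketch of the hparams.py fields involved (wav_path is named above; the other names are assumptions modeled on the upstream WaveRNN repo):

```python
# hparams.py -- preprocessing-related fields (only wav_path is confirmed
# by this README; the other names are assumptions)
wav_path = '/path/to/LJSpeech-1.1/wavs/'  # where the dataset's .wav files live
data_path = 'data/'                        # where preprocess.py writes features
sample_rate = 22050                        # LJSpeech's native sample rate
```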
Here’s my recommendation on what order to run things:
1 - Train WaveRNN with:
python train_wavernn.py
2 - Generate sentences with the trained WaveRNN model using:
python gen_wavernn.py
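To sanity-check the result, you can inspect a generated file with a generic audio library (the path below is hypothetical; where gen_wavernn.py actually writes depends on your hparams.py settings):

```python
import soundfile as sf  # pip install soundfile, if not already present

audio, sr = sf.read('model_outputs/generated.wav')  # hypothetical output path
print(f"{len(audio) / sr:.2f}s of audio at {sr} Hz")
```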
Speech
Mandarin
Speaker | Recording | WaveRNN | Parallel WaveGAN | FB MelGAN | SingVocoder
---|---|---|---|---|---
#1 | | | | |
#2 | | | | |
#3 | | | | |
#4 | | | | |
#5 | | | | |
English
Speaker | Recording | WaveRNN | Parallel WaveGAN | FB MelGAN | SingVocoder
---|---|---|---|---|---
#1 | | | | |
#2 | | | | |
#3 | | | | |
#4 | | | | |
#5 | | | | |