
Tacotron 2 (without wavenet)

This implementation includes distributed and FP16 support and uses the LJ Speech dataset.

Distributed and FP16 support relies on work by Christian Sarofeen and NVIDIA's Apex Library.

Pre-requisites

Download and extract the LJ Speech dataset
Clone this repo: git clone https://github.com/NVIDIA/tacotron2.git
CD into this repo: cd tacotron2
Update .wav paths: sed -i -- 's,DUMMY,ljs_dataset_folder/wavs,g' *.txt
Install PyTorch 0.4
Install Python requirements or use Docker container (tbd)
- Install Python requirements: pip install -r requirements.txt
- OR
- Docker container (tbd)

Multi-GPU (distributed) training

python -m multiproc train.py --output_directory=/outdir --log_directory=/logdir --hparams=distributed_run=True
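
When the launch is scripted rather than typed, it can help to assemble the command as an argument list. The sketch below only rebuilds the command shown above; `build_train_command` is a hypothetical helper, and joining several hparams overrides with commas is an assumption, since the source shows a single override.

```python
import shlex


def build_train_command(output_dir, log_dir, hparams):
    """Assemble the argv list for the multiproc training launch."""
    # Assumption: multiple overrides are passed as comma-separated key=value pairs.
    overrides = ",".join(f"{key}={value}" for key, value in hparams.items())
    return [
        "python", "-m", "multiproc", "train.py",
        f"--output_directory={output_dir}",
        f"--log_directory={log_dir}",
        f"--hparams={overrides}",
    ]


command = build_train_command("/outdir", "/logdir", {"distributed_run": True})
print(shlex.join(command))
```

The resulting list can be handed to `subprocess.run(command, check=True)` from the repo root, which avoids shell quoting issues if the output or log paths contain spaces.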

nv-wavenet: Faster than real-time wavenet inference

This implementation is inspired by or uses code from the following repos: Ryuichi Yamamoto, Keith Ito, Prem Seetharaman.

We are thankful to the Tacotron 2 paper authors, especially Jonathan Shen, Yuxuan Wang, and Zongheng Yang.