
Tacotron 2 (without wavenet)

This implementation includes distributed and FP16 (half-precision) training support and uses the LJ Speech dataset.

Distributed and FP16 support relies on work by Christian Sarofeen and NVIDIA's frameworks team.

Pre-requisites

Download and extract the LJ Speech dataset
Clone this repo: git clone https://github.com/NVIDIA/tacotron2.git
CD into this repo: cd tacotron2
Update .wav paths: sed -i -- 's,DUMMY,ljs_dataset_folder/wavs,g' *.txt
Install PyTorch 0.4
Install Python requirements or use the Docker container (tbd)
- Install Python requirements: pip install -r requirements.txt
- OR
- Docker container (tbd)
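
On systems where `sed -i` is unavailable or behaves differently, the path-update step above can be approximated in Python. This is a minimal sketch, not part of the repo: the function name `replace_placeholder` and its parameters are hypothetical, and it assumes the file lists are plain UTF-8 `.txt` files that contain the literal placeholder `DUMMY`.

```python
from pathlib import Path


def replace_placeholder(filelist_dir, wav_dir, placeholder="DUMMY"):
    """Replace `placeholder` with `wav_dir` in every .txt file in `filelist_dir`.

    Hypothetical helper mirroring: sed -i -- 's,DUMMY,<wav_dir>,g' *.txt
    Returns the number of files that were actually modified.
    """
    changed = 0
    for path in sorted(Path(filelist_dir).glob("*.txt")):
        text = path.read_text(encoding="utf-8")
        updated = text.replace(placeholder, wav_dir)
        if updated != text:
            # Only rewrite files that contained the placeholder.
            path.write_text(updated, encoding="utf-8")
            changed += 1
    return changed
```

Calling `replace_placeholder(".", "ljs_dataset_folder/wavs")` from the repo root would have the same effect as the sed command. Running it a second time changes nothing, because the placeholder is already gone.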

python -m multiproc train.py --output_directory=/outdir --log_directory=/logdir --hparams=distributed_run=True
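
The `--hparams` flag in the command above carries a `name=value` override. As an illustration only, the sketch below shows one way such an override string could be applied to a dictionary of defaults. It is not the repo's parser: `parse_overrides`, the comma-separated form, and the `batch_size` entry are assumptions made for the example; only `distributed_run` appears in the command above.

```python
def parse_overrides(spec, defaults):
    """Apply comma-separated 'name=value' overrides to a copy of `defaults`.

    Hypothetical helper for illustration; each value is converted to the
    type of the corresponding default.
    """
    result = dict(defaults)
    if not spec:
        return result
    for item in spec.split(","):
        name, sep, raw = item.partition("=")
        name, raw = name.strip(), raw.strip()
        if not sep or name not in result:
            raise ValueError(f"unknown or malformed override: {item!r}")
        current = result[name]
        if isinstance(current, bool):
            # Check bool before other types: bool("False") would be True.
            if raw.lower() not in ("true", "false"):
                raise ValueError(f"expected True or False for {name!r}")
            result[name] = raw.lower() == "true"
        else:
            result[name] = type(current)(raw)
    return result


# Hypothetical defaults; only distributed_run comes from the command above.
DEFAULTS = {"distributed_run": False, "batch_size": 48}
```

With these defaults, `parse_overrides("distributed_run=True", DEFAULTS)` returns a new dictionary with `distributed_run` set to `True` and leaves `DEFAULTS` itself unchanged. Rejecting unknown names is a deliberate choice so that a misspelled flag fails loudly instead of being ignored.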

nv-wavenet: Faster than real-time wavenet inference

This implementation is inspired by or uses code from repos by the following authors: Ryuichi Yamamoto, Keith Ito, [Prem Seetharaman](https://github.com/pseeth/pytorch-stft).

We are thankful to the Tacotron 2 paper authors, especially Jonathan Shen, Yuxuan Wang and Zongheng Yang.