
Tacotron 2 (without wavenet)

This implementation includes distributed and fp16 support and uses the LJSpeech dataset.

Distributed and FP16 support relies on work by Christian Sarofeen and NVIDIA's Apex Library.

Pre-requisites

Download and extract the LJ Speech dataset
Clone this repo: git clone https://github.com/NVIDIA/tacotron2.git
Change into the repo directory: cd tacotron2
Update .wav paths: sed -i -- 's,DUMMY,ljs_dataset_folder/wavs,g' filelists/*.txt
Install PyTorch 0.4
Install Python requirements or build Docker image
- Install Python requirements: pip install -r requirements.txt
- OR
- Build Docker image: docker build --tag tacotron2 .
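
The .wav path update step can also be done without sed, which may help on systems where sed -i behaves differently. The following is a minimal Python sketch, assuming the files under filelists/ are plain UTF-8 text containing the placeholder DUMMY. The function name replace_placeholder and its parameters are hypothetical and not part of this repo.

```python
from pathlib import Path


def replace_placeholder(filelist_dir, wav_dir, placeholder="DUMMY"):
    """Replace `placeholder` with `wav_dir` in every .txt file in `filelist_dir`.

    Hypothetical helper (not part of the repo). Returns the number of
    files whose contents were changed.
    """
    changed = 0
    for path in sorted(Path(filelist_dir).glob("*.txt")):
        text = path.read_text(encoding="utf-8")
        updated = text.replace(placeholder, wav_dir)
        if updated != text:
            # Only rewrite files that actually contained the placeholder.
            path.write_text(updated, encoding="utf-8")
            changed += 1
    return changed
```

Calling replace_placeholder("filelists", "ljs_dataset_folder/wavs") from the repo root is intended to have the same effect as the sed command in the list above.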

Train with distributed and FP16 support enabled: python -m multiproc train.py --output_directory=outdir --log_directory=logdir --hparams=distributed_run=True,fp16_run=True
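
The --hparams flag in the command above takes comma-separated name=value overrides. To illustrate that format only, here is a simplified sketch of how such a string could map to a dictionary. parse_overrides is a hypothetical helper; it is not the parser this repo uses and may differ from it in edge cases.

```python
import ast


def parse_overrides(spec):
    """Turn 'a=True,b=0.5' into {'a': True, 'b': 0.5}.

    Hypothetical illustration of the name=value override format.
    Values that are not valid Python literals are kept as strings.
    """
    overrides = {}
    for item in spec.split(","):
        if not item.strip():
            continue  # tolerate empty segments such as a trailing comma
        name, sep, raw = item.partition("=")
        if not sep:
            raise ValueError(f"expected name=value, got {item!r}")
        try:
            value = ast.literal_eval(raw.strip())
        except (ValueError, SyntaxError):
            value = raw.strip()
        overrides[name.strip()] = value
    return overrides
```

For the command above, parse_overrides("distributed_run=True,fp16_run=True") yields both names mapped to the boolean True.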

Related repo: nv-wavenet: Faster than real-time wavenet inference

This implementation uses code from repos by Keith Ito and Prem Seetharaman, as described in our code.

We are inspired by Ryuichi Yamamoto's Tacotron PyTorch implementation.

We are thankful to the Tacotron 2 paper authors, especially Jonathan Shen, Yuxuan Wang and Zongheng Yang.