
Tacotron 2 (without wavenet)

This implementation includes distributed and FP16 support and uses the LJ Speech dataset.

Distributed and FP16 support relies on work by Christian Sarofeen and NVIDIA's Apex Library.

Pre-requisites

Download and extract the LJ Speech dataset
Clone this repo: git clone https://github.com/NVIDIA/tacotron2.git
Change into the repo directory: cd tacotron2
Update the .wav paths: sed -i -- 's,DUMMY,ljs_dataset_folder/wavs,g' filelists/*.txt
- Alternatively, set load_mel_from_disk=True in hparams.py and update the mel-spectrogram paths
Install PyTorch 0.4
Install the Python requirements or build the Docker image
- Install the Python requirements: pip install -r requirements.txt
- OR
- Build the Docker image: docker build --tag tacotron2 .
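
On systems without sed -i, the path update in the setup steps can be approximated in Python. This is a minimal sketch, not part of the repo: the function name update_wav_paths is hypothetical, and it assumes the filelists are UTF-8 text files in which the literal placeholder DUMMY stands in for the wav folder, as the sed command implies.

```python
from pathlib import Path


def update_wav_paths(filelist_dir, wav_dir, placeholder="DUMMY"):
    """Replace every occurrence of `placeholder` in <filelist_dir>/*.txt with `wav_dir`.

    Hypothetical helper mirroring: sed -i -- 's,DUMMY,<wav_dir>,g' filelists/*.txt
    Returns the names of the files that were actually rewritten.
    """
    changed = []
    for path in sorted(Path(filelist_dir).glob("*.txt")):
        text = path.read_text(encoding="utf-8")
        updated = text.replace(placeholder, wav_dir)
        if updated != text:
            # Only write back files that contained the placeholder.
            path.write_text(updated, encoding="utf-8")
            changed.append(path.name)
    return changed
```

Calling update_wav_paths("filelists", "ljs_dataset_folder/wavs") from the repo root should have the same effect as the sed command. Running it a second time changes nothing, because the placeholder is already gone.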

To train in distributed mode with FP16 enabled:

python -m multiproc train.py --output_directory=outdir --log_directory=logdir --hparams=distributed_run=True,fp16_run=True
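
The --hparams value is a comma-separated list of name=value overrides. The sketch below illustrates how such a string could be turned into a dictionary of typed values. It is an assumption for illustration only: parse_overrides is a hypothetical helper, not the parser the repo actually uses, and it handles only simple scalar values.

```python
import ast


def parse_overrides(spec):
    """Parse 'a=True,b=0.5,c=name' into {'a': True, 'b': 0.5, 'c': 'name'}.

    Hypothetical helper; values that are not valid Python literals
    are kept as plain strings.
    """
    overrides = {}
    for item in spec.split(","):
        item = item.strip()
        if not item:
            continue  # tolerate empty segments such as a trailing comma
        name, sep, raw = item.partition("=")
        name, raw = name.strip(), raw.strip()
        if not sep or not name:
            raise ValueError(f"expected name=value, got {item!r}")
        try:
            value = ast.literal_eval(raw)
        except (ValueError, SyntaxError):
            value = raw  # not a literal: keep the raw string
        overrides[name] = value
    return overrides
```

For the command above, parse_overrides("distributed_run=True,fp16_run=True") yields a dictionary with both flags set to the boolean True.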

nv-wavenet: Faster than real-time wavenet inference

This implementation uses code from repos by Keith Ito and Prem Seetharaman, as described in our code.

We are inspired by Ryuichi Yamamoto's Tacotron PyTorch implementation.

We are thankful to the Tacotron 2 paper authors, especially Jonathan Shen, Yuxuan Wang and Zongheng Yang.