From 1874b9a08f51bb18956bbfeaf3225ae3c86942b9 Mon Sep 17 00:00:00 2001
From: Rafael Valle
Date: Thu, 3 May 2018 15:10:51 -0700
Subject: [PATCH] adding readme and license

---
 LICENSE   | 25 +++++++++++++++++++++++++
 README.md | 53 +++++++++++++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 78 insertions(+)
 create mode 100644 LICENSE
 create mode 100644 README.md

diff --git a/LICENSE b/LICENSE
new file mode 100644
index 0000000..8d2301c
--- /dev/null
+++ b/LICENSE
@@ -0,0 +1,25 @@
+# Copyright (c) 2018, NVIDIA CORPORATION. All rights reserved.
+#
+# Redistribution and use in source and binary forms, with or without
+# modification, are permitted provided that the following conditions
+# are met:
+#  * Redistributions of source code must retain the above copyright
+#    notice, this list of conditions and the following disclaimer.
+#  * Redistributions in binary form must reproduce the above copyright
+#    notice, this list of conditions and the following disclaimer in the
+#    documentation and/or other materials provided with the distribution.
+#  * Neither the name of NVIDIA CORPORATION nor the names of its
+#    contributors may be used to endorse or promote products derived
+#    from this software without specific prior written permission.
+#
+# THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS ``AS IS'' AND ANY
+# EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
+# IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR
+# PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR
+# CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL,
+# EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO,
+# PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR
+# PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY
+# OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+# (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+# OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
diff --git a/README.md b/README.md
new file mode 100644
index 0000000..269a1e3
--- /dev/null
+++ b/README.md
@@ -0,0 +1,53 @@
+# Tacotron 2 (without WaveNet)
+
+PyTorch implementation of Tacotron 2, as described in [Natural TTS Synthesis by
+Conditioning WaveNet on Mel Spectrogram Predictions](https://arxiv.org/pdf/1712.05884.pdf).
+
+This implementation includes **distributed** and **FP16** support
+and uses the [LJ Speech dataset](https://keithito.com/LJ-Speech-Dataset/).
+
+Distributed and FP16 support relies on work by Christian Sarofeen and NVIDIA's
+frameworks team.
+
+![Alignment, Predicted Mel Spectrogram, Target Mel Spectrogram](tensorboard.png)
+
+
+## Prerequisites
+1. NVIDIA GPU + CUDA + cuDNN
+
+## Setup
+1. Download and extract the [LJ Speech dataset](https://keithito.com/LJ-Speech-Dataset/)
+2. Clone this repo: `git clone https://github.com/NVIDIA/tacotron2.git`
+3. `cd` into this repo: `cd tacotron2`
+4. Update the .wav paths: `sed -i -- 's,DUMMY,ljs_dataset_folder/wavs,g' *.txt`
+   (a quick check of the rewritten paths is sketched below, after the training sections)
+5. Install [PyTorch 0.4](https://github.com/pytorch/pytorch)
+6. Install the Python requirements or use a Docker container (tbd)
+   - Install the Python requirements: `pip install -r requirements.txt`
+   - **OR**
+   - Docker container `(tbd)`
+
+## Training
+1. `python train.py --output_directory=outdir --log_directory=logdir`
+2. (OPTIONAL) `tensorboard --logdir=outdir/logdir`
+
+## Multi-GPU (distributed) and FP16 Training
+1. `python -m multiproc train.py --output_directory=outdir --log_directory=logdir --hparams=distributed_run=True`
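+
+As a sanity check after step 4 of Setup, the short sketch below (not part of
+this repo's code) verifies that the rewritten filelist entries point at real
+audio. It assumes the common `wav_path|transcript` layout, one entry per line,
+and only tests fields ending in `.wav`, so unrelated `.txt` files are ignored:
+
+```python
+from pathlib import Path
+
+# Collect referenced wav paths that do not exist on disk.
+missing = []
+for filelist in Path('.').glob('*.txt'):
+    for line in filelist.read_text(encoding='utf-8').splitlines():
+        wav_path = line.split('|')[0].strip()
+        # Only test fields that look like wav paths, so files such as
+        # requirements.txt are skipped.
+        if wav_path.endswith('.wav') and not Path(wav_path).is_file():
+            missing.append(wav_path)
+print(f'{len(missing)} referenced .wav files were not found')
+```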
+
+## Inference
+1. `jupyter notebook --ip=127.0.0.1 --port=31337`
+2. Load `inference.ipynb` (a programmatic sketch of the same flow appears at
+   the end of this README)
+
+## Related repos
+[nv-wavenet](https://github.com/NVIDIA/nv-wavenet/): Faster-than-real-time
+WaveNet inference
+
+## Acknowledgements
+This implementation uses code from, or was inspired by, the following repos:
+[Ryuichi Yamamoto](https://github.com/r9y9/tacotron_pytorch), [Keith
+Ito](https://github.com/keithito/tacotron/) and [Prem
+Seetharaman](https://github.com/pseeth/pytorch-stft).
+
+We are grateful to the Tacotron 2 paper authors, especially Jonathan Shen,
+Yuxuan Wang and Zongheng Yang.
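+
+## Programmatic inference (sketch)
+For use outside the notebook, the outline below shows one way scripted
+inference could look. It is a sketch, not code from this repo: the model
+object, the `text_to_sequence` helper, the checkpoint `state_dict` key, and
+the `inference()` return values are assumptions that should be checked against
+`inference.ipynb` and the model code.
+
+```python
+import torch
+
+
+def load_weights(model, checkpoint_path):
+    # Assumption: train.py checkpoints store the weights under 'state_dict'.
+    checkpoint = torch.load(checkpoint_path, map_location='cpu')
+    model.load_state_dict(checkpoint['state_dict'])
+    return model.eval()
+
+
+def synthesize(model, text, text_to_sequence):
+    # Assumes a keithito-style text_to_sequence(text, cleaner_names) helper
+    # that maps text to a list of symbol IDs.
+    sequence = torch.LongTensor(text_to_sequence(text, ['english_cleaners']))
+    sequence = sequence.unsqueeze(0)  # 1 x T batch
+    with torch.no_grad():
+        # Assumed to return (mel, mel_postnet, gate, alignment)-style outputs;
+        # the mel spectrogram would then be vocoded, e.g. with nv-wavenet.
+        _, mel_postnet, _, alignments = model.inference(sequence)
+    return mel_postnet, alignments
+```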