Fork of https://github.com/alokprasad/fastspeech_squeezewave to also fix denoising in squeezewave
# fastspeech_squeezewave
Integration of FastSpeech text-to-mel generation with the fast SqueezeWave vocoder (CPU only).
This is one of the fastest TTS solutions.
Code from
https://github.com/xcmyz/FastSpeech
https://github.com/tianrengao/SqueezeWave
Put the model in the SqueezeWave directory, downloaded from
https://drive.google.com/file/d/1RyVMLY2l8JJGq_dCEAAd8rIRIn_k13UB/view?usp=sharing
and rename it Squeezewave.pt (select a model based on the quality and size tradeoff)
```
-rwxrwxrwx 1 root root 312M Jan 17 05:02 L128_large_pretrain
-rwxrwxrwx 1 root root 97M Jan 17 05:02 L128_small_pretrain
-rwxrwxrwx 1 root root 324M Jan 17 05:01 L64_large_pretrain
-rwxrwxrwx 1 root root 106M Jan 17 05:03 L64_small_pretrain
```
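
The placement and renaming step can be sketched in Python. This is a minimal sketch, assuming the checkpoints listed above are already available in a local directory. The `install_checkpoint` helper and the directory names in the usage note are hypothetical and not part of this repository.

```python
import shutil
from pathlib import Path

# Checkpoint names exactly as they appear in the listing above.
CHECKPOINTS = (
    "L128_large_pretrain",
    "L128_small_pretrain",
    "L64_large_pretrain",
    "L64_small_pretrain",
)


def install_checkpoint(download_dir, squeezewave_dir, name="L128_small_pretrain"):
    """Copy the chosen checkpoint into the SqueezeWave directory as Squeezewave.pt.

    Hypothetical helper: the directory layout is an assumption.
    """
    if name not in CHECKPOINTS:
        raise ValueError(f"unknown checkpoint: {name!r}")
    src = Path(download_dir) / name
    if not src.is_file():
        raise FileNotFoundError(src)
    dest = Path(squeezewave_dir) / "Squeezewave.pt"
    shutil.copyfile(src, dest)  # overwrites an existing Squeezewave.pt
    return dest
```

For example, `install_checkpoint("downloads", "SqueezeWave", "L64_small_pretrain")` would select the smaller L64 model, where `downloads` is a placeholder for wherever the files were saved.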
# Running Inference
1. cd FastSpeech ; bash run_inference.sh
2. cd SqueezeWave ; bash run_inference.sh

This generates a wave file.
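
The two steps above can also be chained from a single Python script. This is a minimal sketch, assuming both directories sit under one repository root and that each contains a `run_inference.sh`. The `stage_commands` and `run_pipeline` helpers are hypothetical names introduced for illustration.

```python
import subprocess
from pathlib import Path

# Stage directories and script name as given in the steps above.
STAGES = ("FastSpeech", "SqueezeWave")
SCRIPT = "run_inference.sh"


def stage_commands(repo_root):
    """Return (working_directory, command) pairs in the order they should run."""
    root = Path(repo_root)
    return [(root / stage, ["bash", SCRIPT]) for stage in STAGES]


def run_pipeline(repo_root):
    """Run mel generation, then the vocoder; raise if either stage fails."""
    for cwd, cmd in stage_commands(repo_root):
        subprocess.run(cmd, cwd=cwd, check=True)
```

Calling `run_pipeline(".")` from the repository root would run both stages in order. With `check=True`, the vocoder is not started if mel generation exits with an error.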
# Example Run (Single-Core CPU)
(Timings exclude model loading time.)
Text: "Printing, in the only sense with which we are at present concerned, differs from most if not from all the arts and crafts represented in the Exhibition in being comparatively modern"
Generated 11.5 seconds of audio in around 3.83 seconds
on a single x86 core at 3.6 GHz.
```
07:40:00alok@/mount/data/fastspeech_squeezewave/FastSpeech$ bash run_inference.sh
MEL Calculation:
2.827802896499634
07:40:37alok@/mount/data/fastspeech_squeezewave/SqueezeWave$ bash run_inference.sh
./test_synthesis.wav
Squeezewave vocoder time
1.0016820430755615
```
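
The figure of about 3.83 seconds is the sum of the two timings in the log, and dividing it by the audio duration gives the real-time factor (RTF, processing time per second of audio). The sketch below only redoes that arithmetic. The `real_time_factor` helper is a hypothetical name, and the 11.5 second duration is the value quoted in this README.

```python
def real_time_factor(processing_seconds, audio_seconds):
    """Processing time divided by audio duration; below 1.0 is faster than real time."""
    if audio_seconds <= 0:
        raise ValueError("audio_seconds must be positive")
    return processing_seconds / audio_seconds


# Timings copied from the log above; audio duration as quoted in the text.
mel_seconds = 2.827802896499634
vocoder_seconds = 1.0016820430755615
audio_seconds = 11.5

total_seconds = mel_seconds + vocoder_seconds
rtf = real_time_factor(total_seconds, audio_seconds)

print(f"total: {total_seconds:.2f} s")  # total: 3.83 s
print(f"RTF:   {rtf:.3f}")              # RTF:   0.333
```

An RTF of roughly 0.33 means that synthesis in this run was about three times faster than real time on the single x86 core.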
## On Raspberry Pi (@varungujjar)
```
Raspberry Pi4 4GB
Model : L128_small_pretrain
Fastspeech :
MEL Calculation:
2.8617560863494873
SqueezeWave
Squeezewave vocoder time
14.423999309539795
```
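
To see where the time goes on each machine, the logged timings can be broken down into per-stage shares. This is a minimal sketch. The `stage_shares` helper is a hypothetical name, and the README does not state whether both runs used the same input text or the same SqueezeWave model, so only the shares within each run are computed, not a speed ratio between machines.

```python
def stage_shares(timings):
    """Return each stage's fraction of the total processing time."""
    total = sum(timings.values())
    if total <= 0:
        raise ValueError("total time must be positive")
    return {stage: seconds / total for stage, seconds in timings.items()}


# Timings in seconds, copied from the two logs in this README.
x86 = {"mel": 2.827802896499634, "vocoder": 1.0016820430755615}
pi4 = {"mel": 2.8617560863494873, "vocoder": 14.423999309539795}

for label, timings in (("x86", x86), ("Raspberry Pi 4", pi4)):
    shares = stage_shares(timings)
    print(f"{label}: vocoder share {shares['vocoder']:.0%}")
```

In these two runs the vocoder accounts for roughly a quarter of the processing time on x86 and for most of it on the Raspberry Pi 4.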