pytorch, pre-trained model, speech synthesis, supervised learning

Pytorch POC #2: OpenTTS

Wen-Chieh-Lee · Aug 06, 2020 · 1 min read
Git Repo   | Status | Progress | Comments
OpenTTS    | status | progress | Pytorch POC #2
mozillatts | status | progress | Pytorch POC #3
MaryTTS    | status | progress | Pytorch POC #4

In the previous post on keyword spotting on Chimay, I also mentioned TTS (text-to-speech) and showed some POCs. Here I use OpenTTS to set up an API server that generates speech, and later ultrasound, from the web.

Opentts

On the live OpenTTS demo site, you can compare conventional (non-deep-learning) speech synthesis (marytts, nanotts) with deep-learning synthesis (Mozillatts with Tacotron and Tacotron2). The deep-learning voices provide better speech quality. The public MOS test results below point to the same conclusion.

MOS
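For readers new to the term, MOS (mean opinion score) is the average of listener ratings, commonly on a 1 (bad) to 5 (excellent) scale. The sketch below shows one way such a score and an approximate 95% confidence interval could be computed. The function name and the ratings are made up for illustration and are not taken from the chart above.

```python
import math
import statistics


def mean_opinion_score(ratings, z=1.96):
    """Return (MOS, half-width of an approximate 95% confidence interval).

    `ratings` are listener scores on the 1-5 scale. The interval uses a
    normal approximation, which is an assumption of this sketch.
    """
    if len(ratings) < 2:
        raise ValueError("need at least two ratings")
    if any(r < 1 or r > 5 for r in ratings):
        raise ValueError("ratings must be on the 1-5 scale")
    mos = statistics.mean(ratings)
    half_width = z * statistics.stdev(ratings) / math.sqrt(len(ratings))
    return mos, half_width


# Illustrative ratings only (hypothetical listeners, not real test data).
mos, ci = mean_opinion_score([4, 5, 4, 3, 4])
print(f"MOS = {mos:.2f} +/- {ci:.2f}")
```

When comparing two systems, overlapping intervals suggest the listening test cannot clearly separate them.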

A demo wave file:

Demo wave

The Swagger API documentation is also included:

Opentts swagger
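As a rough idea of how a client might call the server from a script, here is a minimal sketch using only the Python standard library. It assumes the server listens on `http://localhost:5500` and exposes a `GET /api/tts` endpoint taking `voice` and `text` query parameters; the address, the endpoint shape, the helper names and the voice ID are all assumptions here, so check the Swagger page of your own server for the actual routes and voices.

```python
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumed address of a locally running OpenTTS server (not from the post).
BASE_URL = "http://localhost:5500"


def build_tts_url(text, voice, base_url=BASE_URL):
    """Build the request URL for the assumed /api/tts endpoint."""
    query = urlencode({"voice": voice, "text": text})
    return f"{base_url}/api/tts?{query}"


def synthesize(text, voice, out_path, base_url=BASE_URL, timeout=30):
    """Fetch audio for `text` and write the returned bytes to `out_path`.

    Returns the number of bytes written. Needs a running server.
    """
    with urlopen(build_tts_url(text, voice, base_url), timeout=timeout) as resp:
        audio = resp.read()
    with open(out_path, "wb") as f:
        f.write(audio)
    return len(audio)


# "espeak:en" is a placeholder voice ID; use one your server lists.
print(build_tts_url("hello world", "espeak:en"))
```

With a server running, `synthesize("Hello from OpenTTS", "espeak:en", "demo.wav")` would save the response to `demo.wav`. Keeping the URL builder separate from the network call makes the request easy to inspect without a server.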

The following diagram is from the Mozilla project. It shows the whole picture of natural language interaction with end users. Of course, there is still a long way to go.


Written by Wen-Chieh-Lee
Senior Software Architect