Pytorch POC #2: OpenTTS

Git Repo	Status	Progress	Comments
			Pytorch POC #2
			Pytorch POC #3
			Pytorch POC #4

Based on last time keyword spotting topics on Chimay, I even mention items about TTS (text-to-speech) and showed POCs. Here I adopt Opentts to create a API server for speech and later ultrasound generation from Web.

In live opentts demo site, you can check the conventional (non-deep learning) speech synthesis (marytts, nanotts) and deep-learning ones (Mozillatts with Tacotron and Tacotron2). Deep-learing ones provide a beeter speech quality. A public MOS test results as below also show similar conclusions.

MOS

Demo wave file as

Demo wave

Swagger API also includes the following:

The following diagram is from mozzila project. It shows the whole picture of nature lanaugege iteration with end users. But, of course, it will be a long way to go.

References

Chimay Architecture for Keyword spotting
Paper with code- speech synthesis
Diagram of mozillatts- Tacotron and Tacotron2
TSS systems and models
How to build a voice assistant with open source Rasa and Mozilla tools
The CMU Pronouncing Dictionary
Build End-To-End TTS Tacotron: Griffin Lim 信号估计算法
Mozilla 研究：有些機器合成語音已經比真人聲音更悅耳

Pytorch POC #2: OpenTTS

References

Written by Wen-Chieh-Lee