Evaluation of Lithuanian Text-to-Speech Synthesizers

Pijus Kasparaitis


Text-to-speech synthesis for the most popular languages has been widely used for several decades, whereas the breakthrough in Lithuanian text-to-speech synthesis occurred only in recent years: six new Lithuanian synthetic voices appeared in 2013-2015. This created a need to evaluate the newly developed Lithuanian text-to-speech synthesizers. This paper presents a chronological review of current Lithuanian text-to-speech synthesizers. The unit selection algorithm implemented in the recent synthesizers SINT.AS and LIEPA and the procedure for selecting announcers are described in more detail because they were crucial to the quality of the synthesized voice. The main characteristics of a synthetic voice are intelligibility and acceptability; they are assessed with human listeners, and the collected data are processed by statistical methods. The paper investigates nine recent Lithuanian synthetic voices (Regina, Edvardas, Aistė, Vladas, Laima, Marijus, Egidius, Aistis 2, Gintaras) with four aims: to evaluate what intelligibility a synthetic voice actually achieves, to determine which synthesizer performs better when they are compared with each other, to advise potential developers of synthetic voice applications on choosing a synthesizer voice, and to show synthesizer developers the most promising areas for improvement.

DOI: http://dx.doi.org/10.5755/j01.sal.0.28.15130

Print ISSN: 1648-2824
Online ISSN: 2029-7203