July 2018
Beginner to intermediate
312 pages
8h 31m
English
The mean opinion score (MOS), a subjective measure of sound quality, is one of the most commonly used tests for assessing the performance of a TTS algorithm. Usually, several native speakers are asked to rate the naturalness of each audio sample on a scale from 1 (bad quality) to 5 (excellent quality), and the mean of those scores is the MOS. Audio samples recorded by professionals typically have an MOS of around 4.55, as reported in the paper "WaveNet: A Generative Model for Raw Audio" (https://arxiv.org/abs/1609.03499), which is presented later in this chapter.
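As a minimal sketch of the arithmetic described above, the following Python snippet averages a set of listener ratings into an MOS. The function name `mean_opinion_score`, the sample ratings, and the 95% confidence interval (computed with a normal approximation) are illustrative assumptions that do not come from the text; they are included only to show how a mean rating and its uncertainty might be reported together.

```python
from math import sqrt
from statistics import mean, stdev


def mean_opinion_score(ratings):
    """Return (MOS, half-width of an approximate 95% confidence interval).

    `ratings` is a list of naturalness scores between 1 and 5.
    The interval uses a normal approximation, which is an assumption
    made for this sketch and is rough for small listener panels.
    """
    if len(ratings) < 2:
        raise ValueError("need at least two ratings")
    if any(not 1 <= r <= 5 for r in ratings):
        raise ValueError("ratings must lie between 1 and 5")
    mos = mean(ratings)
    half_width = 1.96 * stdev(ratings) / sqrt(len(ratings))
    return mos, half_width


# Hypothetical scores given by eight listeners to a single audio sample.
ratings = [4, 5, 4, 3, 5, 4, 4, 5]
mos, ci = mean_opinion_score(ratings)
print(f"MOS = {mos:.2f} +/- {ci:.2f}")
```

With so few listeners the interval is wide, which illustrates why an MOS obtained from a small panel should be read with caution.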
This way of benchmarking TTS algorithms is not entirely satisfactory, however. For instance, it does not allow for a rigorous comparison of different algorithms ...