DeepFake Cartoon Voices: Fifteen.ai is a text-to-speech tool that generates 44.1 kHz voices of various characters. The voices are generated in real time using multiple audio synthesis algorithms and customized deep neural networks trained on very little data (between 55 seconds and 120 minutes of clean dialogue for each character). The project demonstrates a significant reduction in the amount of audio required to clone a voice realistically while retaining its affective prosody.