July 2018
Beginner to intermediate
312 pages
8h 31m
English
We will use the LJ Speech dataset for this task (https://keithito.com/LJ-Speech-Dataset/). It contains 13,100 .wav recordings, each paired with a transcript. The transcripts are provided in both raw and normalized form; in the normalized form, numbers are spelled out as full words.
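To make the raw/normalized distinction concrete, here is a minimal sketch of reading transcript rows. It assumes a pipe-delimited layout with three fields per row (clip ID, raw transcript, normalized transcript); that layout, the `Transcript` type, the `read_transcripts` helper, and the sample row are all illustrative assumptions rather than something taken from the text above.

```python
import csv
import io
from typing import Iterable, List, NamedTuple


class Transcript(NamedTuple):
    """One transcript row (hypothetical structure for illustration)."""
    clip_id: str
    raw: str
    normalized: str


def read_transcripts(lines: Iterable[str]) -> List[Transcript]:
    """Parse pipe-delimited rows; assumes three fields per row."""
    # QUOTE_NONE keeps quotation marks inside transcripts as literal text.
    reader = csv.reader(lines, delimiter="|", quoting=csv.QUOTE_NONE)
    return [Transcript(*row[:3]) for row in reader if len(row) >= 3]


# Made-up sample row, not taken from the dataset.
sample = io.StringIO(
    "clip-0001|Chapter 3 has 12 pages.|Chapter three has twelve pages.\n"
)
rows = read_transcripts(sample)
print(rows[0].raw)
print(rows[0].normalized)
```

For a speech model, the normalized field is usually the more convenient training target, since every token in it corresponds to something that is actually spoken.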
All recordings were made by a single speaker. The audio totals roughly 24 hours, with individual clips lasting from 1 to 10 seconds. The dataset is in the public domain, so there are no restrictions on its use.
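As a quick sanity check on these figures, the average clip length can be estimated from the clip count and the total duration. This is only a back-of-the-envelope sketch: the 24-hour total is approximate, and the constant names are illustrative.

```python
NUM_CLIPS = 13_100   # number of recordings quoted above
TOTAL_HOURS = 24     # approximate total duration quoted above

# Convert hours to seconds, then divide by the number of clips.
avg_seconds = TOTAL_HOURS * 3600 / NUM_CLIPS
print(f"Average clip length: about {avg_seconds:.1f} seconds")
```

The estimate comes out at about 6.6 seconds per clip, which gives a rough idea of the sequence lengths a model trained on this data has to handle.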
The dataset folder contains a CSV file ...