Collecting data
Data collection for ASR is challenging for many reasons, including privacy. Consequently, open source datasets are limited in number. Moreover, these datasets may be difficult to access, may contain too little data or too few speakers, or may be noisy. In this context, we decided to use different datasets for the two use cases. For the voice-controlled smart light, we use Google’s Speech Commands dataset; for use case two, we can scrape data from one of several popular open data sources: LibriVox, the LibriSpeech ASR corpus, VoxCeleb, and YouTube.
Google's Speech Commands dataset includes 65,000 one-second-long utterances of 30 short words, contributed by thousands of different members of the public through the ...
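Before training on a keyword dataset like this, it can help to check that the local copy matches the description: how many utterances each word has and whether the clips are really about one second long. The following is a minimal sketch using only the Python standard library. It assumes a layout of one folder per word containing WAV clips (`root/<word>/<clip>.wav`); that layout and the function name `summarize_commands` are assumptions made for illustration, not something stated in the text above.

```python
import wave
from collections import Counter
from pathlib import Path


def summarize_commands(root):
    """Count utterances per word and collect clip durations in seconds.

    Assumes (hypothetically) that clips are stored as root/<word>/<clip>.wav.
    """
    counts = Counter()
    durations = []
    for wav_path in sorted(Path(root).glob("*/*.wav")):
        with wave.open(str(wav_path), "rb") as wav_file:
            frames = wav_file.getnframes()
            rate = wav_file.getframerate()
        # The parent folder name is treated as the spoken word (the label).
        counts[wav_path.parent.name] += 1
        durations.append(frames / rate)
    return counts, durations
```

Calling `summarize_commands("speech_commands")` on a local folder (the folder name here is a placeholder) returns the per-word counts and a list of durations, which can be compared against the expected number of words and the one-second clip length.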