ASR system model
An Automatic Speech Recognition (ASR) system needs three main sources of knowledge: an acoustic model, a phonetic lexicon, and a language model [4]. The acoustic model deals with the sounds of the language, including phonemes and non-speech sounds (such as pauses, breathing, and background noise). The phonetic lexicon, or pronunciation dictionary, lists the words that the system can recognize, together with their possible pronunciations. Finally, the language model captures knowledge about which word sequences are likely in the language. In recent years, deep learning (DL) approaches have been used extensively in the acoustic and language models of ASR systems.
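To make the division of labor between the three knowledge sources concrete, the following is a minimal toy sketch and not a real recognizer. All names (`LEXICON`, `ACOUSTIC_LOGP`, `LM_LOGP`, `best_word`) and all probabilities are invented for illustration. It assumes the audio has already been scored against two candidate phoneme sequences, and it picks the word that maximizes the sum of the acoustic log-probability and the language model log-probability.

```python
import math

# Phonetic lexicon: each recognizable word with its possible pronunciations
# (ARPAbet-style phoneme symbols; the entries here are illustrative).
LEXICON = {
    "two": [("T", "UW")],
    "too": [("T", "UW")],
    "tea": [("T", "IY")],
}

# Stand-in for an acoustic model: log P(audio | phoneme sequence).
# The values are made up; a real system would compute them from the signal.
ACOUSTIC_LOGP = {
    ("T", "UW"): math.log(0.6),
    ("T", "IY"): math.log(0.1),
}

# Stand-in for a bigram language model: log P(word | previous word).
# Again, the probabilities are hypothetical.
LM_LOGP = {
    ("have", "two"): math.log(0.30),
    ("have", "too"): math.log(0.05),
    ("have", "tea"): math.log(0.20),
}


def best_word(prev_word):
    """Return (word, score) maximizing acoustic + language model log-probability."""
    best, best_score = None, -math.inf
    for word, pronunciations in LEXICON.items():
        # Use the best-scoring pronunciation the lexicon allows for this word.
        acoustic = max(ACOUSTIC_LOGP.get(p, -math.inf) for p in pronunciations)
        # Unseen word pairs get zero probability in this toy model.
        lm = LM_LOGP.get((prev_word, word), -math.inf)
        score = acoustic + lm
        if score > best_score:
            best, best_score = word, score
    return best, best_score


word, score = best_word("have")
print(word, round(math.exp(score), 2))
```

In this sketch "two" and "too" share a pronunciation, so the acoustic scores alone cannot separate them. The language model score for the preceding word "have" is what breaks the tie.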
The following diagram presents a system model for ...