Chapter 8
Automatic Language Identification
Jiří Navrátil
Automatic Language Identification (LID) is the task of automatically
recognizing language from a spoken utterance. In view of current glob-
alization trends in communication technology, LID plays an essential part
in providing speech applications to a large, multilingual user community.
These may include multilingual spoken dialog systems (e.g., information
kiosks), spoken-document retrieval, and multimedia mining systems, as
well as human-to-human communication systems (call routing, speech-to-
speech translation). Due to the challenge posed by multiple (and possibly
unknown) input languages, interest in automatic LID has increased steadily,
and intensive research efforts by the speech technology community have
resulted in significant progress over the last two decades. This chapter sur-
veys the major approaches to LID, analyzes different solutions in terms
of their practical applicability, and concludes with an overview of current
trends and future research directions.
