Chapter 8
Automatic Language Identification
Jiří Navrátil
Automatic Language Identification (LID) is the task of automatically
recognizing language from a spoken utterance. In view of current glob-
alization trends in communication technology, LID plays an essential part
in providing speech applications to a large, multilingual user community.
These may include multilingual spoken dialog systems (e.g., information
kiosks), spoken-document retrieval, and multimedia mining systems, as
well as human-to-human communication systems (call routing, speech-to-
speech translation). Due to the challenge posed by multiple (and possibly
unknown) input languages, interest in automatic LID has increased steadily,
and intensive research efforts by the speech technology community have
resulted in significant progress over the last two decades. This chapter sur-
veys the major approaches to LID, analyzes different solutions in terms
of their practical applicability, and concludes with an overview of current
trends and future research directions.