Working with Text

by Emma Tonkin, Gregory J.L. Tourte
July 2016
Content level: Intermediate to advanced
344 pages
10h 11m
English
Chandos Publishing
Chapter 8

Automatic Language Identification

M. Zampieri, Saarland University, Saarbrücken, Germany; German Research Center for Artificial Intelligence (DFKI), Saarbrücken, Germany

Abstract

Automatic language identification, or simply language identification, is the task of automatically identifying the language or languages contained in a given document. It is an important part of many text processing pipelines, including text mining applications. This chapter provides a concise overview of language identification research, from early approaches to state-of-the-art methods.

Keywords

Language identification

Text classification

n-grams

Acknowledgements

The author would like to thank Binyam Gebrekidan Gebre and Nikola Ljubešić for commenting on a draft ...

Publisher Resources

ISBN: 9781780634302