Skip to Content
Java: Data Science Made Easy
book

Java: Data Science Made Easy

by Richard M. Reese, Jennifer L. Reese, Alexey Grigorev
July 2017
Beginner to intermediate
715 pages
17h 3m
English
Packt Publishing
Content preview from Java: Data Science Made Easy

Text Analysis

Text analysis is a broad topic and is typically referred to as Natural Language Processing (NLP). It is used for many different tasks, including text searching, language translation, sentiment analysis, speech recognition, and classification, to mention a few. The process of analyzing can be difficult due to the particularities and ambiguity found in natural languages. However, there has been a considerable amount of work in this area and there are several Java APIs supporting this effort.

We will start with an introduction to the basic concepts and tasks used in NLP. These include the following:

  • Tokenization: The process of splitting text into individual tokens or words.
  • Stop words: These are words that are common and may ...
Become an O’Reilly member and get unlimited access to this title plus top books and audiobooks from O’Reilly and nearly 200 top publishers, thousands of courses curated by job role, 150+ live events each month,
and much more.
Start your free trial

You might also like

Java Data Science Cookbook

Java Data Science Cookbook

Rushdi Shams
Java for Data Science

Java for Data Science

Walter Molina, Richard M. Reese, Shilpi Saxena, Jennifer L. Reese

Publisher Resources

ISBN: 9781788475655Supplemental Content