Mastering Java Machine Learning

by Uday Kamath, Krishna Choppella
July 2017
Beginner to intermediate
556 pages
13h 8m
English
Packt Publishing
Text processing components and transformations

In this section, we discuss some common preprocessing and transformation steps used in most text mining processes. The general idea is to convert documents into structured datasets whose features or attributes most Machine Learning algorithms can use to perform different kinds of learning.
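
As a rough illustration of what converting documents into a structured dataset can look like, the following sketch builds a simple bag-of-words table: one row per document and one integer count per vocabulary term. It is a minimal sketch under stated assumptions, not the book's implementation. The class name BagOfWords, its methods, the lowercase letters-only tokenizer, and the two sample sentences are all hypothetical choices made for this example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.TreeSet;

// Hypothetical helper class for illustration only.
public class BagOfWords {

    // Assumed tokenizer: lowercase the text and split on runs of
    // characters that are not the letters a-z.
    public static List<String> tokenize(String text) {
        List<String> tokens = new ArrayList<>();
        for (String t : text.toLowerCase(Locale.ROOT).split("[^a-z]+")) {
            if (!t.isEmpty()) {
                tokens.add(t);
            }
        }
        return tokens;
    }

    // Collect every distinct token across all documents, sorted
    // alphabetically so that feature columns have a stable order.
    public static List<String> vocabulary(List<String> docs) {
        TreeSet<String> vocab = new TreeSet<>();
        for (String doc : docs) {
            vocab.addAll(tokenize(doc));
        }
        return new ArrayList<>(vocab);
    }

    // Turn one document into a row of term counts aligned with the vocabulary.
    public static int[] vectorize(String doc, List<String> vocab) {
        Map<String, Integer> counts = new HashMap<>();
        for (String t : tokenize(doc)) {
            counts.merge(t, 1, Integer::sum);
        }
        int[] row = new int[vocab.size()];
        for (int i = 0; i < vocab.size(); i++) {
            row[i] = counts.getOrDefault(vocab.get(i), 0);
        }
        return row;
    }

    public static void main(String[] args) {
        // Sample documents invented for this example.
        List<String> docs = Arrays.asList("The cat sat on the mat.", "The dog sat.");
        List<String> vocab = vocabulary(docs);
        System.out.println(vocab);
        for (String doc : docs) {
            System.out.println(Arrays.toString(vectorize(doc, vocab)));
        }
    }
}
```

Each row produced this way has the same length and column meaning, which is the property most learning algorithms need from their input. Real pipelines typically add further steps, and a library tokenizer would normally replace the simple regular expression used here.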

We will briefly describe some of the most commonly used techniques in the next section. Different text mining applications might use different pieces or variations of the components shown in the following figure:

Figure 10: Text processing components and their flow

Document collection and standardization ...

ISBN: 9781785880513