Skip to Content
Machine Learning: End-to-End guide for Java developers
book

Machine Learning: End-to-End guide for Java developers

by Richard M. Reese, Jennifer L. Reese, Boštjan Kaluža, Dr. Uday Kamath, Krishna Choppella
October 2017
Intermediate to advanced
1159 pages
26h 10m
English
Packt Publishing
Content preview from Machine Learning: End-to-End guide for Java developers

Issues with mining unstructured data

Humans can read, parse, and understand unstructured text/documents more easily than computer-based programs. Some of the reasons why text mining is more complicated than general supervised or unsupervised learning are given here:

  • Ambiguity in terms and phrases. The word bank has multiple meanings, which a human reader can correctly associate based on context, yet this requires preprocessing steps such as POS tagging and word sense disambiguation, as we have seen. According to the Oxford English Dictionary, the word run has no fewer than 645 different uses in the verb form alone and we can see that such words can indeed present problems in resolving the meaning intended (between them, the words run, put, set, ...
Become an O’Reilly member and get unlimited access to this title plus top books and audiobooks from O’Reilly and nearly 200 top publishers, thousands of courses curated by job role, 150+ live events each month,
and much more.
Start your free trial

You might also like

DevOps Tools for Java Developers

DevOps Tools for Java Developers

Stephen Chin, Melissa McKay, Ixchel Ruiz, Baruch Sadogursky

Publisher Resources

ISBN: 9781788622219Supplemental Content