Java: Data Science Made Easy

by Richard M. Reese, Jennifer L. Reese, Alexey Grigorev
July 2017
Beginner to intermediate
715 pages
17h 3m
English
Packt Publishing
Implementing named entity recognition

Named entity recognition (NER) is sometimes referred to as finding people and things. Given a text segment, we may want to identify all the names of people present. However, this is not always easy, because a name such as Rob may also be used as a verb.

In this section, we will demonstrate how to use OpenNLP's TokenNameFinderModel class to find names and locations in text. While there are other entities we may want to find, this example will demonstrate the basics of the technique. We begin with names.

Most names occur within a single line. We do not want to search across multiple lines, because an entity such as a state might then be identified incorrectly. Consider the following sentences:

Jim headed north. Dakota headed south. ...
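
The concern with these two sentences can be sketched without any NLP library. The following is a minimal illustration, not the book's OpenNLP code: it assumes the risk is that "north" and "Dakota" get read together as a state name once the sentence boundary is ignored. The class name, the tiny `LOCATIONS` list (a stand-in for a trained location model), and the regex sentence splitter are all hypothetical simplifications.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

class SentenceScopedLookup {

    // Hypothetical two-entry gazetteer standing in for a trained location model.
    static final List<String> LOCATIONS = List.of("north dakota", "south dakota");

    // Naive sentence splitter: break after ., ! or ? followed by whitespace.
    // A real pipeline would use a proper sentence detector instead.
    static String[] sentences(String text) {
        return text.trim().split("(?<=[.!?])\\s+");
    }

    // Lowercase the text and replace punctuation with spaces,
    // which erases sentence-ending periods.
    static String normalize(String text) {
        return text.toLowerCase(Locale.US)
                .replaceAll("[^a-z ]", " ")
                .replaceAll("\\s+", " ")
                .trim();
    }

    // Look for whole-word gazetteer matches in one block of text.
    static List<String> findLocations(String text) {
        String padded = " " + normalize(text) + " ";
        List<String> found = new ArrayList<>();
        for (String location : LOCATIONS) {
            if (padded.contains(" " + location + " ")) {
                found.add(location);
            }
        }
        return found;
    }

    // Same lookup, but each sentence is searched on its own,
    // so a match can never span a sentence boundary.
    static List<String> findLocationsPerSentence(String text) {
        List<String> found = new ArrayList<>();
        for (String sentence : sentences(text)) {
            found.addAll(findLocations(sentence));
        }
        return found;
    }

    public static void main(String[] args) {
        String text = "Jim headed north. Dakota headed south.";
        System.out.println(findLocations(text));            // prints [north dakota]
        System.out.println(findLocationsPerSentence(text)); // prints []
    }
}
```

Searching the whole text at once reports a location that neither sentence mentions, while the per-sentence search reports none. A statistical name finder is not a dictionary lookup, but the same scoping idea applies: pass it one sentence's tokens at a time.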


ISBN: 9781788475655