Mastering Data Mining with Python – Find patterns hidden in your data by Megan Squire

Named entity recognition project

In this set of small projects, we will try our NER techniques on several types of text, some that we have already seen in prior chapters and some that are new. For variety, we will look for named entities in e-mail texts, board meeting minutes, IRC chat dialogue, and human-created summaries of IRC chat dialogue. With these different types of data sources, we will be able to see how writing style and content both affect the accuracy of the NER system.

A simple NER tool

Our first step is to write a simple named entity recognition program that will allow us to find and extract named entities from a text sample. We will take this program and point it at several different text samples in turn. The code and text ...
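
The chapter's own program is cut off in this excerpt, so the following is only a minimal stand-in sketch of the "find and extract" step, not the author's code. It assumes a naive heuristic in place of a trained NER model: any run of capitalized words that does not begin a sentence is treated as a candidate entity. The names `CANDIDATE` and `extract_candidates`, and the sample text, are hypothetical and chosen for illustration only.

```python
import re
from collections import Counter

# Hypothetical pattern: one or more capitalized words separated by single spaces.
CANDIDATE = re.compile(r"\b[A-Z][a-zA-Z]+(?: [A-Z][a-zA-Z]+)*")


def extract_candidates(text):
    """Count capitalized word runs that do not start a sentence.

    This is a crude heuristic, not a trained NER model: it misses
    entities at the start of a sentence and cannot assign entity types.
    """
    counts = Counter()
    # Split on whitespace that follows sentence-ending punctuation.
    for sentence in re.split(r"(?<=[.!?])\s+", text.strip()):
        for match in CANDIDATE.finditer(sentence):
            if match.start() == 0:
                continue  # sentence-initial capitals are usually not entities
            counts[match.group()] += 1
    return counts


# Illustrative sample text, not taken from the book's data sets.
sample = (
    "Yesterday the board met. We heard from Megan Squire about the "
    "Apache Software Foundation. Later, Megan Squire left for Boston."
)

for entity, count in extract_candidates(sample).most_common():
    print(entity, count)
```

Because the function takes a plain string and returns counts, we can point the same sketch at several text samples in turn and compare the results, which is the workflow this section describes. A heuristic this simple should be expected to behave very differently on chat dialogue than on formal meeting minutes.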