Similarity measures

Many similarity measures can be used in NLP tasks. NLTK's nltk.metrics package provides a range of evaluation and similarity measures that support these tasks.

To test the performance of taggers, chunkers, and similar NLP components, the standard scores borrowed from information retrieval, such as precision and recall, can be used.

Let's look at how the output of a named entity recognizer can be analyzed by scoring it against the labels obtained from a training file:

>>> from __future__ import print_function
>>> from nltk.metrics import *
>>> training='PERSON OTHER PERSON OTHER OTHER ORGANIZATION'.split()
>>> testing='PERSON OTHER OTHER OTHER OTHER OTHER'.split()
>>> print(accuracy(training,testing))
...
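To make the scores concrete, here is a minimal pure-Python sketch of what accuracy and per-label precision and recall compute on the same two label sequences. It is an illustration under stated assumptions, not NLTK's implementation: the helper names `sequence_accuracy` and `label_precision_recall` are hypothetical, and scoring one label at a time by position is only one reasonable way to apply precision and recall to tagger output.

```python
def sequence_accuracy(reference, test):
    """Fraction of positions where the two label sequences agree (hypothetical helper)."""
    if len(reference) != len(test):
        raise ValueError("Sequences must have the same length.")
    return sum(r == t for r, t in zip(reference, test)) / len(reference)


def label_precision_recall(reference, test, label):
    """Precision and recall for a single label, compared position by position.

    Returns None for a score whose denominator would be zero.
    """
    ref_positions = {i for i, r in enumerate(reference) if r == label}
    test_positions = {i for i, t in enumerate(test) if t == label}
    hits = len(ref_positions & test_positions)
    precision = hits / len(test_positions) if test_positions else None
    recall = hits / len(ref_positions) if ref_positions else None
    return precision, recall


# Same data as in the session above.
training = 'PERSON OTHER PERSON OTHER OTHER ORGANIZATION'.split()
testing = 'PERSON OTHER OTHER OTHER OTHER OTHER'.split()

acc = sequence_accuracy(training, testing)
person_precision, person_recall = label_precision_recall(training, testing, 'PERSON')

print(acc)               # 4 of 6 positions match
print(person_precision)  # every predicted PERSON is correct
print(person_recall)     # one of the two reference PERSON labels is found
```

The recognizer agrees with the reference labels at four of the six positions. For the PERSON label, its single prediction is correct, but it misses one of the two reference occurrences, so precision is high while recall is lower.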


Natural Language Processing: Python and NLTK by Nitin Hardeniya, Jacob Perkins, Deepti Chopra, Nisheeth Joshi, Iti Mathur
