Evaluate a language model through perplexity

The nltk.model.ngram module in NLTK provides a method, perplexity(text), that evaluates the perplexity of a given text. Perplexity is defined as 2 ** cross-entropy for the text. It measures how well a probability model or probability distribution predicts a text: the lower the perplexity, the better the model.
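For example, a model whose cross-entropy on a text is 7 bits per word has a perplexity of 2 ** 7 = 128, meaning that on average the model is as uncertain about each word as if it were choosing uniformly among 128 alternatives.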

The code for evaluating the perplexity of a text, as present in the nltk.model.ngram module, is as follows:

def perplexity(self, text):
    """
    Calculates the perplexity of the given text.
    This is simply 2 ** cross-entropy for the text.

    :param text: words to calculate perplexity of
    :type text: list(str)
    """
    return pow(2.0, self.entropy(text))
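To see this relationship in action, here is a minimal self-contained sketch, not part of NLTK, that estimates a unigram model by maximum likelihood and computes its perplexity as 2 ** cross-entropy; the helper name unigram_perplexity and the toy sentences are illustrative assumptions.

import math
from collections import Counter

def unigram_perplexity(train_words, test_words):
    # Illustrative helper, not an NLTK API: estimate unigram
    # probabilities from the training words (maximum likelihood).
    counts = Counter(train_words)
    total = sum(counts.values())

    # Cross-entropy in bits per word: the average negative log2
    # probability the model assigns to each test word.
    # (A test word unseen in training would give log2(0) and an
    # infinite perplexity; this toy test set avoids that case.)
    log_prob_sum = sum(math.log2(counts[w] / total) for w in test_words)
    cross_entropy = -log_prob_sum / len(test_words)

    # Perplexity is 2 ** cross-entropy, exactly as in the NLTK
    # method shown above.
    return 2.0 ** cross_entropy

train = "the cat sat on the mat".split()
test = "the cat sat".split()
print(unigram_perplexity(train, test))  # roughly 4.76

A perplexity of about 4.76 here means the model is, on average, as uncertain about each test word as if it were choosing uniformly among roughly five words.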
