Storing a frequency distribution in Redis
The nltk.probability.FreqDist class is used in many classes throughout NLTK for storing and managing frequency distributions. It's quite useful, but it's all in-memory, and doesn't provide a way to persist the data. A single FreqDist is also not accessible to multiple processes. We can change all that by building a FreqDist on top of Redis.
Redis is a data structure server that is one of the more popular NoSQL databases. Among other things, it provides a network-accessible database for storing dictionaries (also known as hash maps). Building a FreqDist interface to a Redis hash map will allow us to create a persistent FreqDist that is accessible to multiple local and remote processes at the same time.
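To make the idea concrete, here is a minimal sketch of how sample counts could be kept in a single Redis hash. It is not the book's implementation: the class name RedisHashFreqDist and the inc method are hypothetical, and only N, B, and freq are borrowed from FreqDist's method names. The sketch assumes a client object that exposes the hash commands hincrby, hget, hvals, and hlen, as the redis-py client does.

```python
class RedisHashFreqDist:
    """Hypothetical sketch: a frequency distribution stored in one Redis hash.

    `client` is assumed to expose hincrby, hget, hvals and hlen
    (for example, a redis.Redis instance from the redis-py package).
    `name` is the key of the hash that holds sample -> count.
    """

    def __init__(self, client, name):
        self._r = client
        self._name = name

    def inc(self, sample, count=1):
        # HINCRBY runs atomically on the server, so several processes
        # can count into the same hash without losing updates.
        return self._r.hincrby(self._name, sample, count)

    def __getitem__(self, sample):
        # A missing field comes back as None; treat it as a count of 0.
        value = self._r.hget(self._name, sample)
        return int(value) if value is not None else 0

    def N(self):
        # Total number of sample outcomes recorded (sum of all counts).
        return sum(int(v) for v in self._r.hvals(self._name))

    def B(self):
        # Number of distinct samples (fields in the hash).
        return self._r.hlen(self._name)

    def freq(self, sample):
        # Relative frequency of a sample; 0.0 when nothing is recorded.
        total = self.N()
        return self[sample] / total if total else 0.0


# Usage, assuming redis-py is installed and a server is listening locally:
#
#   import redis
#   fd = RedisHashFreqDist(redis.Redis(), "word_counts")
#   for word in ["the", "cat", "the"]:
#       fd.inc(word)
#   fd["the"], fd.N(), fd.B()
```

Because the counts live on the Redis server rather than in the Python process, they survive a restart of the script and are visible to any other process that connects with the same hash name. Note that N makes a full pass over the hash on every call, which is acceptable for a sketch but would be worth caching or tracking in a separate counter for large distributions.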