Distributed word scoring with Redis and execnet
We can use Redis and execnet together to do distributed word scoring. In the Calculating high information words recipe in Chapter 7, Text Classification, we calculated the information gain of each word in the movie_reviews corpus using a FreqDist and a ConditionalFreqDist. Now that we have Redis, we can do the same thing using a RedisHashFreqDist and a RedisConditionalHashFreqDist, and then store the scores in a RedisOrderedDict. We can use execnet to distribute the counting in order to get better performance out of Redis.
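To make the counting and scoring idea concrete, here is a minimal single-process sketch that uses only the standard library. It is not the recipe's code: Counter objects stand in for RedisHashFreqDist and RedisConditionalHashFreqDist, a plain Counter stands in for RedisOrderedDict, and the score is assumed to be a Pearson chi-square statistic on a 2x2 word/label table, summed over labels. The function names chi_sq and score_words are hypothetical.

```python
from collections import Counter, defaultdict


def chi_sq(n_ii, n_ix, n_xi, n_xx):
    """Pearson chi-square for a 2x2 word/label contingency table.

    n_ii: count of the word under the label
    n_ix: count of the word across all labels
    n_xi: count of all words under the label
    n_xx: count of all words across all labels
    """
    n_io = n_ix - n_ii                 # the word under other labels
    n_oi = n_xi - n_ii                 # other words under this label
    n_oo = n_xx - n_ii - n_io - n_oi   # other words under other labels
    denom = n_ix * n_xi * (n_xx - n_ix) * (n_xx - n_xi)
    if denom == 0:
        return 0.0
    return n_xx * (n_ii * n_oo - n_io * n_oi) ** 2 / denom


def score_words(labelled_words):
    """Count words per label, then score each word.

    labelled_words is an iterable of (label, words) pairs.
    """
    word_fd = Counter()                  # stand-in for RedisHashFreqDist
    label_word_fd = defaultdict(Counter)  # stand-in for RedisConditionalHashFreqDist
    for label, words in labelled_words:
        for word in words:
            word_fd[word] += 1
            label_word_fd[label][word] += 1

    n_xx = sum(word_fd.values())
    scores = Counter()                   # stand-in for RedisOrderedDict
    for label, fd in label_word_fd.items():
        n_xi = sum(fd.values())
        for word, n_ii in fd.items():
            scores[word] += chi_sq(n_ii, word_fd[word], n_xi, n_xx)
    return scores


if __name__ == "__main__":
    sample = [("pos", ["good", "good", "fine"]),
              ("neg", ["bad", "bad", "fine"])]
    for word, score in score_words(sample).most_common():
        print(word, score)
```

In a distributed version, the counting loop is the part that would be handed to execnet workers, with each worker incrementing the shared Redis-backed counters instead of local ones; the scoring pass can then read the totals from Redis.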
Getting ready
Redis, redis-py, and execnet must be installed, and an instance of redis-server must be running on localhost.
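Before starting, it can help to confirm that something is actually answering on localhost. The following is a rough sketch that uses only the standard library; it assumes the default Redis port 6379 and a server that does not require authentication. The helper name redis_is_up is hypothetical.

```python
import socket


def redis_is_up(host="localhost", port=6379, timeout=1.0):
    """Return True if the server at host:port answers a Redis PING with PONG."""
    try:
        with socket.create_connection((host, port), timeout=timeout) as sock:
            # Inline command form of the Redis protocol.
            sock.sendall(b"PING\r\n")
            reply = sock.recv(64)
    except OSError:
        # Connection refused, timed out, or host not resolvable.
        return False
    return reply.startswith(b"+PONG")


if __name__ == "__main__":
    print("redis-server reachable:", redis_is_up())
```

With redis-py installed, calling ping() on a redis.Redis(host="localhost", port=6379) client performs the same check and raises a connection error if the server is not reachable.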
How to do it...
We start by getting a list of (label, words) ...