Now that we have a better idea of what's actually going on, let's take another look at the ratings-counter script as a whole. Go back to your SparkCourse directory and open the script in Canopy again.
Here it is all together. It looks a bit neater in the editor, so let's review what each part does:
from pyspark import SparkConf, SparkContext
import collections

conf = SparkConf().setMaster("local").setAppName("RatingsHistogram")
sc = SparkContext(conf = conf)

lines = sc.textFile("file:///SparkCourse/ml-100k/u.data")
ratings = lines.map(lambda x: x.split()[2])
result = ratings.countByValue()
...
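To get a feel for what the map and countByValue steps in the script produce, it can help to mimic them in plain Python without Spark. The following is only a rough sketch, not part of the course script. The rows in sample_lines are illustrative stand-ins written in the u.data layout (user ID, movie ID, rating, timestamp, separated by tabs), the helper name extract_rating is hypothetical, and collections.Counter is used here as a local analogue of countByValue, which likewise hands back a dictionary-like mapping of each distinct value to its count.

```python
from collections import Counter

# Illustrative rows in the u.data layout (assumed for this sketch):
# user ID, movie ID, rating, timestamp, separated by tabs.
sample_lines = [
    "196\t242\t3\t881250949",
    "186\t302\t3\t891717742",
    "22\t377\t1\t878887116",
    "244\t51\t2\t880606923",
    "166\t346\t1\t886397596",
]

def extract_rating(line):
    # Same logic as the lambda in the script: split on whitespace
    # and keep the third field, which is the rating (still a string).
    return line.split()[2]

# Local stand-in for lines.map(...): apply the function to every line.
ratings = [extract_rating(line) for line in sample_lines]

# Local stand-in for ratings.countByValue(): count each distinct rating.
result = Counter(ratings)

# Print the counts in rating order.
for rating, count in sorted(result.items()):
    print(rating, count)
```

Note that the ratings stay as strings here, just as they do in the script, because split() returns strings and nothing converts them to integers.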