April 2018
238 pages
The program reads every entry in the log file. Once the file is loaded into an RDD (here, textFile), we use standard RDD operations to extract the lines we are interested in. The filter function takes a lambda expression and keeps only those elements for which the expression evaluates to true. In our case, we are looking for lines that contain GET or POST. It was not necessary to store the filtered RDDs in separate variables; the same results could have been produced completely inline, for example:
posts = textFile.filter(lambda line: "POST" in line).count()
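To make the filter-and-count behaviour concrete without a Spark installation, here is a minimal plain-Python sketch of the same predicate. The sample log lines and the helper name count_matching are made up for illustration and do not come from the original program; the helper only mirrors what textFile.filter(lambda line: token in line).count() computes.

```python
# Hypothetical sample lines standing in for the log file; not from the source.
sample_lines = [
    '127.0.0.1 - - [01/Jan/2018:00:00:01] "GET /index.html HTTP/1.1" 200 1043',
    '127.0.0.1 - - [01/Jan/2018:00:00:02] "POST /login HTTP/1.1" 302 0',
    '127.0.0.1 - - [01/Jan/2018:00:00:03] "GET /about.html HTTP/1.1" 200 512',
    '127.0.0.1 - - [01/Jan/2018:00:00:04] "HEAD /index.html HTTP/1.1" 200 0',
]


def count_matching(lines, token):
    """Count the lines that contain token.

    Illustrative helper (an assumption, not part of the original program):
    it applies the same test as the lambda passed to filter, then counts.
    """
    return sum(1 for line in lines if token in line)


gets = count_matching(sample_lines, "GET")
posts = count_matching(sample_lines, "POST")

print("GET lines:", gets)
print("POST lines:", posts)
```

Note that the test is a plain substring match, so a line is counted whenever the token appears anywhere in it, not only in the request-method position.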