Jupyter Cookbook

by Toomey, Nikhil Borkar, Nikhil Akki, Juan Tomás Oliva Ramos
April 2018
Content level: Beginner
238 pages
7h 13m
English
Packt Publishing
Content preview from Jupyter Cookbook

How to do it...

We can slightly modify the previous script to produce a sorted list as follows:

import pyspark

# Reuse the SparkContext if the notebook has already created one
if 'sc' not in globals():
    sc = pyspark.SparkContext()

text_file = sc.textFile("B09656_09_word_count.ipynb")

# Split lines into words, count each word, then sort the (word, count) pairs by key
sorted_counts = text_file.flatMap(lambda line: line.split(" ")) \
    .map(lambda word: (word, 1)) \
    .reduceByKey(lambda a, b: a + b) \
    .sortByKey()

for x in sorted_counts.collect():
    print(x)

Producing the output as follows:

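The list continues for every word found. Note that sortByKey() orders the (word, count) pairs by their key, which here is the word itself, so the list is sorted by word rather than by number of occurrences. Splitting each line on a single space is also a crude way to determine word breaks: punctuation stays attached to words, and runs of spaces produce empty strings that are counted as words. ...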
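
As a rough illustration of ordering by occurrences and of a stricter word-break rule, here is a minimal plain-Python sketch that does not use Spark. The sample lines, the tokenize helper, and the regular expression are assumptions made for this example and do not come from the book.

```python
import re
from collections import Counter

# Hypothetical sample text standing in for the lines of the input file
lines = [
    "the quick brown fox",
    "The lazy dog; the end.",
]

def tokenize(line):
    """Lowercase the line and return runs of letters, digits, and apostrophes."""
    return re.findall(r"[a-z0-9']+", line.lower())

# Count every word across all lines (the job flatMap/map/reduceByKey do in Spark)
counts = Counter(word for line in lines for word in tokenize(line))

# Sort by descending count, then alphabetically among words with equal counts
sorted_counts = sorted(counts.items(), key=lambda pair: (-pair[1], pair[0]))

for pair in sorted_counts:
    print(pair)
```

In the PySpark script, a similar ordering could probably be obtained by replacing .sortByKey() with .sortBy(lambda pair: (-pair[1], pair[0])), and the same tokenize function could be passed to flatMap in place of the split on a single space. That variant has not been run against the book's notebook.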


Publisher Resources

ISBN: 9781788839440