PySpark Cookbook

by Denny Lee, Tomasz Drabas
June 2018
Content level: Intermediate to advanced
330 pages
9h 47m
English
Packt Publishing

How it works...

As with the previous recipes, we will first specify where we are going to download the Spark binaries from and create all the relevant global variables we are going to use later. 

Next, we read in the hosts.txt file:

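function readIPs() {
    input="./hosts.txt"

    # Flags: 1 while the matching section is being read
    driver=0
    executors=0
    _executors=""

    while IFS='' read -r line; do
        # Skip empty lines before they can be stored as addresses
        if [[ -z "${line}" ]]; then
            continue
        fi

        # The line after "driver:" is the driver node
        if [[ "$driver" = "1" ]]; then
            _driverNode="$line"
            driver=0
        fi

        # Every line after "executors:" is an executor node
        if [[ "$executors" = "1" ]]; then
            _executors=$_executors"$line\n"
        fi

        if [[ "$line" = "driver:" ]]; then
            driver=1
            executors=0
        fi

        if [[ "$line" = "executors:" ]]; then
            executors=1
            driver=0
        fi
    done < "$input"
}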

We store the path to the file in the input variable. The driver and the executors variables are flags ...
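To make the parsing step concrete, here is a minimal sketch that runs the same idea against a small inline sample. The file layout (a driver: header followed by one address, then an executors: header followed by one address per line) is inferred from the parser, and the IP addresses are made up for illustration. The parse_hosts function name is hypothetical, and the sketch tracks the current section in a single variable rather than the two flags used in the recipe. It also joins the executors with real newlines instead of the literal \n used in the recipe.

```shell
#!/usr/bin/env bash
# Sketch only: parse_hosts is a hypothetical name, and the sample
# addresses below are invented for illustration.

parse_hosts() {
    # Reads the hosts listing from standard input.
    section=""
    _driverNode=""
    _executors=""
    nl='
'
    while IFS= read -r line || [ -n "$line" ]; do
        case "$line" in
            "")           continue ;;            # skip empty lines
            "driver:")    section="driver" ;;    # next line is the driver
            "executors:") section="executors" ;; # following lines are executors
            *)
                if [ "$section" = "driver" ]; then
                    _driverNode="$line"
                    section=""                   # only one driver line is read
                elif [ "$section" = "executors" ]; then
                    _executors="${_executors}${line}${nl}"
                fi
                ;;
        esac
    done
}

# Assumed hosts.txt layout, supplied inline so the sketch is self-contained.
parse_hosts <<'EOF'
driver:
192.168.1.160

executors:
192.168.1.161
192.168.1.162
EOF

printf 'driver: %s\n' "$_driverNode"
printf 'executors:\n%s' "$_executors"
```

Feeding the sample through a here-document redirect (rather than a pipe) keeps the loop in the current shell, so _driverNode and _executors are still set after parse_hosts returns. To parse a real file, the same function could be called as parse_hosts < ./hosts.txt.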

ISBN: 9781788835367