
Machine Learning with Spark - Second Edition by Nick Pentreath, Manpreet Singh Ghotra, Rajdeep Dua


Exploring the face data

We will use a Spark application to analyze the data. Make sure the data is unzipped into the data folder, as follows:

Chapter_09
|-- 2.0.x
|   |-- python
|   |-- scala
|-- data

The actual code is in the scala folder, except for the code for a few graphs, which is in the python folder:

scala
|-- src
|   |-- main
|   |   |-- java
|   |   |-- resources
|   |   |-- scala
|   |   |   |-- org
|   |   |       |-- sparksamples
|   |   |           |-- ImageProcessing.scala
|   |   |           |-- Util.scala
|   |   |-- scala-2.11
|   |-- test

Now that we've unzipped the data, we face a small challenge. Spark provides us with a way to read text files and custom Hadoop input data sources. However, there is no built-in functionality to allow us to read images.

Spark provides a method called wholeTextFiles, which allows ...
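As a rough illustration of the method mentioned above, the following minimal sketch uses wholeTextFiles only to collect file paths. SparkContext.wholeTextFiles returns an RDD of (file path, file content) pairs; here the content is ignored and only the keys are kept. The object name ListImageFiles, the helper stripScheme, and the default path data/* are assumptions made for this example and are not taken from the book's ImageProcessing.scala.

```scala
import org.apache.spark.{SparkConf, SparkContext}

// Hypothetical example object; not part of the book's code.
object ListImageFiles {

  // Local paths come back as URIs such as "file:/home/user/data/a.jpg";
  // drop the "file:" scheme so the path can be opened with standard Java I/O.
  def stripScheme(uri: String): String = uri.stripPrefix("file:")

  def main(args: Array[String]): Unit = {
    val conf = new SparkConf()
      .setAppName("ListImageFiles")
      .setMaster("local[*]") // assumption: run locally on all cores
    val sc = new SparkContext(conf)
    try {
      // Assumed location of the unzipped data; pass a different path as the first argument.
      val path = if (args.nonEmpty) args(0) else "data/*"

      // Each record is a (path, content) pair; only the path is used here.
      val rdd = sc.wholeTextFiles(path)
      val files = rdd.keys.map(stripScheme)

      println(s"Found ${files.count()} files")
      files.take(5).foreach(println)
    } finally {
      sc.stop()
    }
  }
}
```

Note that this approach still reads each file's content even though only the path is kept, so it is best seen as a convenient way to enumerate files rather than an efficient one.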
