We will use the Spark application to analyze the data. Make sure the data is unzipped into the data folder as follows:
Chapter_09
|-- 2.0.x
|   |-- python
|   |-- scala
|-- data
The actual code is in the scala folder, except for the code for a few graphs, which is in the python folder:
scala
|-- src
|   |-- main
|   |   |-- java
|   |   |-- resources
|   |   |-- scala
|   |   |   |-- org
|   |   |   |-- sparksamples
|   |   |   |-- ImageProcessing.scala
|   |   |   |-- Util.scala
|   |   |-- scala-2.11
|   |-- test
Now that we've unzipped the data, we face a small challenge. Spark provides us with a way to read text files and custom Hadoop input data sources. However, there is no built-in functionality to allow us to read images.
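Because Spark has no image reader of its own, the decoding has to come from somewhere else. The following is a minimal sketch, assuming we fall back on the JVM's standard javax.imageio API to decode a single file. The object name ImageReadSketch and the helper loadImage are hypothetical names chosen for illustration and are not taken from the project's ImageProcessing.scala or Util.scala.

```scala
import java.awt.image.BufferedImage
import java.io.File
import javax.imageio.ImageIO

// Hypothetical helper object, for illustration only.
object ImageReadSketch {
  // Decode one image file using the JVM's built-in ImageIO.
  // ImageIO.read returns null when no registered reader understands the
  // file format, so we wrap the result in Option to avoid passing null on.
  // It throws an IOException if the file cannot be read at all.
  def loadImage(path: String): Option[BufferedImage] =
    Option(ImageIO.read(new File(path)))
}
```

Given a path to an image file, loadImage(path).map(img => (img.getWidth, img.getHeight)) would yield the image dimensions, or None if the format is not recognized. Since this is plain JVM code, a function like this could in principle be called from inside a Spark transformation once we have the file paths in hand.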
Spark provides a method called wholeTextFiles, which allows ...