
Apache Spark for Data Science Cookbook by Padma Priya Chitturi

Stay ahead with the world's most comprehensive technology and business learning platform.

With Safari, you learn the way you learn best. Get unlimited access to videos, live online training, learning paths, books, tutorials, and more.


Visualizing data on HDFS - parameterizing inputs

Once the service has started, we can point our browser to http://localhost:8080 (adjust the port if you have changed it in your configuration) to view the Zeppelin UI. Zeppelin organizes its contents into notes and paragraphs. A note is simply the list of paragraphs shown together on a single web page.

Using data from HDFS simply means that we point to an HDFS location instead of a location on the local file system. Before we consume the file from HDFS, let's quickly check the Spark version that Zeppelin uses by running sc.version in a paragraph. The sc variable is an implicit variable representing the SparkContext inside Zeppelin, which simply means that we need not programmatically create a SparkContext ...
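As a minimal sketch of the two steps described above, the following Zeppelin paragraph first prints the Spark version and then reads a text file from HDFS through the implicit sc. The NameNode address (localhost:9000) and the file path (/data/sample.csv) are hypothetical placeholders and not taken from the recipe; substitute the values that match your cluster's fs.defaultFS setting and your own data.

```scala
// Run inside a Zeppelin paragraph bound to the Spark interpreter.
// `sc` is the SparkContext that Zeppelin creates for us; we do not construct it.

// Print the Spark version that Zeppelin is using.
println(s"Spark version: ${sc.version}")

// Hypothetical HDFS location: the host, port, and path are assumptions for illustration.
val hdfsPath = "hdfs://localhost:9000/data/sample.csv"

// Reading from HDFS uses the same call as a local read; only the URI scheme changes
// (a local file would use a file:// URI instead).
val lines = sc.textFile(hdfsPath)

// Show the first few lines to confirm that the file is readable.
lines.take(5).foreach(println)
```

Because sc.textFile is lazy, an incorrect path only surfaces as an error when an action such as take runs, so it can be worth running a small action like this one before building anything larger on top of the data.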
