Practical Predictive Analytics by Ralph Winters

Exploring Large Datasets Using Spark

"I never guess. It is a capital mistake to theorize before one has data. Insensibly one begins to twist facts to suit theories, instead of theories to suit facts."
- Sir Arthur Conan Doyle

In this chapter, we will begin exploratory data analysis on the Spark dataframe we created in the previous chapter. We will cover some specific Spark commands that assist in this analysis, and discuss several ways to graph and plot the data.

As you go through these examples, remember that data that resides in Spark may be much larger than you are used to, and that it may be impractical to apply some quick analytic techniques without first considering how the data is organized, ...