Apache Spark Deep Learning Cookbook by Amrith Ravindra, Ahmed Sherif

How it works...

The following section explains the techniques used and insights gained from exploratory data analysis.

  1. The date column in the dataframe is really a date-time column whose time values are all 00:00:00. The time component is not needed for modeling, so it can be removed from the dataset. PySpark provides a to_date function that does this directly. The dataframe, df, is transformed using the withColumn() function so that the date column no longer carries the timestamp, as seen in the following screenshot:
  2. For analysis purposes, we want to extract the day, month, and year from ...