July 2018
Intermediate to advanced
474 pages
13h 37m
English
This section walks through the following steps to prepare the dataset for the deep learning pipeline:
mainDF = mainDF.withColumnRenamed('userId_1', 'userid')
mainDF = mainDF.withColumnRenamed('movieId_1', 'movieid')
mainDF = mainDF.withColumnRenamed('rating_1', 'rating')
mainDF = mainDF.withColumnRenamed('timestamp_1', 'timestamp')
mainDF = mainDF.withColumnRenamed('imdbId', 'imdbid')
mainDF = mainDF.withColumnRenamed('tmdbId', 'tmdbid')
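The column renames in this step can also be driven from a dictionary, which keeps the old-to-new mapping in one place. The following is a minimal sketch, not the book's own code: `RENAMES` and `rename_columns` are hypothetical names introduced for illustration, and the helper assumes only that the object passed in exposes `withColumnRenamed`, as a PySpark DataFrame does.

```python
from functools import reduce

# Hypothetical mapping of existing column names to the lowercase names
# used in this step (the pairs themselves come from the renames above).
RENAMES = {
    "userId_1": "userid",
    "movieId_1": "movieid",
    "rating_1": "rating",
    "timestamp_1": "timestamp",
    "imdbId": "imdbid",
    "tmdbId": "tmdbid",
}


def rename_columns(df, renames):
    """Apply withColumnRenamed once per (old, new) pair and return the result.

    Hypothetical helper: df is assumed to be a PySpark DataFrame (or any
    object with a compatible withColumnRenamed method).
    """
    return reduce(
        lambda acc, pair: acc.withColumnRenamed(pair[0], pair[1]),
        renames.items(),
        df,
    )


# Usage, assuming mainDF has already been created earlier in the pipeline:
# mainDF = rename_columns(mainDF, RENAMES)
```

Because `withColumnRenamed` returns a new DataFrame each time, folding over the mapping gives the same result as the six chained assignments, and adding another rename only means adding one dictionary entry.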
import pyspark.sql.functions as F
mainDF = mainDF.withColumn("rating", F.round(mainDF["rating"], ...