- Start a new project in IntelliJ or in an IDE of your choice. Make sure the necessary JAR files are included.
- We define the package information for the Scala program:
package spark.ml.cookbook.chapter7
- Import the necessary packages:
import org.apache.log4j.{Level, Logger}
import org.apache.spark.sql.SparkSession
import org.apache.spark.ml.recommendation.ALS
- We now define two Scala case classes to model the movie and ratings data:
case class Movie(movieId: Int, title: String, year: Int, genre: Seq[String])
case class FullRating(userId: Int, movieId: Int, rating: Float, timestamp: Long)
- In this step, we define functions for parsing a single line of data from the ratings.dat file into the FullRating case class, and for parsing ...
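To illustrate the ratings half of this step, here is a minimal sketch of a line parser. It assumes that ratings.dat uses the MovieLens-style layout `userId::movieId::rating::timestamp`, which the text above does not confirm. The `RatingParsing` object and the `parseRating` name are hypothetical and chosen only for illustration; the recipe's own functions may differ.

```scala
// Mirrors the FullRating case class defined earlier in the recipe.
case class FullRating(userId: Int, movieId: Int, rating: Float, timestamp: Long)

object RatingParsing {
  // Hypothetical helper. Assumes one rating per line in the form
  // "userId::movieId::rating::timestamp" (an assumption about the file format).
  def parseRating(line: String): FullRating = {
    val fields = line.split("::")
    // Fail fast on malformed lines instead of building a partial record.
    require(fields.length == 4, s"Expected 4 fields but got ${fields.length}: $line")
    FullRating(fields(0).toInt, fields(1).toInt, fields(2).toFloat, fields(3).toLong)
  }
}
```

A call such as `RatingParsing.parseRating("1::1193::5::978300760")` would then yield a single `FullRating` value. Keeping the parser as a plain function on `String` means it can be checked without a running SparkSession and later passed to a `map` over the lines of the file.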