Scala Machine Learning Projects by Md. Rezaul Karim

Step 3 - Explore and query for related statistics

Let's check the ratings-related statistics using the following lines of code:

val numRatings = ratingsDF.count()
val numUsers = ratingsDF.select(ratingsDF.col("userId")).distinct().count()
val numMovies = ratingsDF.select(ratingsDF.col("movieId")).distinct().count()
println("Got " + numRatings + " ratings from " + numUsers + " users on " + numMovies + " movies.")
>>>
Got 105339 ratings from 668 users on 10325 movies.

You should find 105,339 ratings from 668 users on 10,325 movies. Now, let's get the maximum and minimum ratings along with the count of users who have rated a movie. To do so, you need to run an SQL query against the rating table we created in memory in the previous step. Making ...
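
As a rough illustration of the kind of query described here, the following is a minimal sketch, not necessarily the book's own code. It assumes the ratings DataFrame has been registered as a temporary view named ratings and has a rating column next to userId and movieId. The view name, the rating column, the toy rows, and the names RatingStatsSketch and movieStats are all hypothetical, chosen only for this example.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}

object RatingStatsSketch {

  // For each movie: highest rating, lowest rating, and number of distinct
  // users who rated it, with the most-rated movies first.
  def movieStats(spark: SparkSession, ratingsDF: DataFrame): DataFrame = {
    // Assumption: the view is called "ratings"; use whatever name was
    // registered in the previous step.
    ratingsDF.createOrReplaceTempView("ratings")
    spark.sql(
      """SELECT movieId,
        |       MAX(rating) AS maxRating,
        |       MIN(rating) AS minRating,
        |       COUNT(DISTINCT userId) AS numUsers
        |FROM ratings
        |GROUP BY movieId
        |ORDER BY numUsers DESC, movieId""".stripMargin)
  }

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("RatingStatsSketch")
      .getOrCreate()
    import spark.implicits._

    // Hypothetical toy data standing in for the real ratings file.
    val ratingsDF = Seq(
      (1, 10, 4.0), (2, 10, 2.5), (3, 10, 5.0),
      (1, 20, 3.0), (2, 20, 3.5)
    ).toDF("userId", "movieId", "rating")

    movieStats(spark, ratingsDF).show()
    spark.stop()
  }
}
```

Running the aggregation as SQL against a temporary view keeps it declarative. The same result could also be obtained with the DataFrame API through groupBy and agg.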
