Singular value decomposition and principal components analysis

It is quite common to have a dataset where the number of users and items number in the millions. Even if the rating matrix is not that large, it may be beneficial to reduce the dimensionality by creating a smaller (lower-rank) matrix that captures most of the information in the higher-dimension matrix. This may potentially allow you to capture important latent factors and their corresponding weights in the data. Such factors could lead to important insights, such as the movie genre or book topics in the rating matrix. Even if you are unable to discern meaningful factors, the techniques may filter out the noise in the data.

One issue with large datasets is that you will likely ...

Get Mastering Machine Learning with R - Second Edition now with O’Reilly online learning.

O’Reilly members experience live online training, plus books, videos, and digital content from 200+ publishers.