Case Study: Training a Recommender System in PySpark

To close this chapter, let us look at an example of how we might generate a large-scale recommendation system using dimensionality reduction. The dataset we will work with comes from a set of user transactions from an online store (Chen, Daqing, Sai Laing Sain, and Kun Guo. Data mining for the online retail industry: A case study of RFM model-based customer segmentation using data mining. Journal of Database Marketing & Customer Strategy Management 19.3 (2012): 197-208). In this model, we will input a matrix in which the rows are users and the columns represent items in the catalog of an e-commerce site. Items purchased by a user are indicated by a 1. Our goal is to factorize this matrix into ...

Get Mastering Predictive Analytics with Python now with the O’Reilly learning platform.

O’Reilly members experience books, live events, courses curated by job role, and more from O’Reilly and nearly 200 top publishers.