Chapter 4. System Design for Recommending
Now that you have a foundational understanding of how recommendation systems work, let’s look more closely at the components they require and at how to design a system capable of serving recommendations at industrial scale. In our context, industrial scale primarily means reasonable scale (a term introduced by Ciro Greco, Andrea Polonioli, and Jacopo Tagliabue in “ML and MLOps at a Reasonable Scale”): production applications for companies with tens to hundreds of engineers working on the product, not thousands.
In theory, a recommendation system is a collection of math formulas that take historical data about user-item interactions and return a probability estimate for the affinity of a user-item pair. In practice, a recommendation system is 5, 10, or maybe 20 software systems communicating in real time and working with limited information, restricted item availability, and perpetually out-of-sample behavior, all to ensure that the user sees something.
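To make the "in theory" view concrete, here is a minimal sketch of a function that turns historical interactions into a probability-like affinity estimate for a user-item pair. It is not the method this book develops. It assumes a smoothed engagement rate, and the names `fit_affinity`, `prior_rate`, and `prior_strength`, along with the `(user, item, engaged)` record layout, are hypothetical choices made only for illustration.

```python
from collections import Counter


def fit_affinity(interactions, prior_rate=0.1, prior_strength=5.0):
    """Build an affinity estimator from (user, item, engaged) records.

    `engaged` is 1 if the user interacted with the item when it was shown,
    else 0. All names and defaults here are illustrative assumptions.
    """
    pair_shown, pair_engaged = Counter(), Counter()
    item_shown, item_engaged = Counter(), Counter()
    for user, item, engaged in interactions:
        pair_shown[(user, item)] += 1
        pair_engaged[(user, item)] += engaged
        item_shown[item] += 1
        item_engaged[item] += engaged

    def affinity(user, item):
        # Item-level engagement rate, shrunk toward a global prior rate.
        item_rate = (item_engaged.get(item, 0) + prior_rate * prior_strength) / (
            item_shown.get(item, 0) + prior_strength
        )
        # Pair-level rate, shrunk toward the item-level rate so that
        # unseen pairs fall back to what is known about the item.
        pair = (user, item)
        return (pair_engaged.get(pair, 0) + item_rate * prior_strength) / (
            pair_shown.get(pair, 0) + prior_strength
        )

    return affinity


# Toy history: u1 engaged with item "a" twice, u2 ignored it once.
history = [("u1", "a", 1), ("u1", "a", 1), ("u2", "a", 0), ("u1", "b", 0)]
affinity = fit_affinity(history)
print(affinity("u1", "a"), affinity("u2", "a"), affinity("u9", "z"))
```

Even this toy version hints at the "in practice" problems: a user or item that never appears in the history gets only the prior, which is one small instance of the out-of-sample behavior that the surrounding systems have to handle.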
This chapter is heavily influenced by “System Design for Recommendations and Search” by Eugene Yan and “Recommender Systems, Not Just Recommender Models” by Even Oldridge and Karl Byleen-Higley.
Online Versus Offline
ML systems consist of the stuff that you do in advance and the stuff that you do on the fly. This division, between online and offline, is a practical consideration about the information necessary to perform tasks of various types. To observe and learn large-scale patterns, ...