Part II. Retrieval
How do we get all the data in the right place to train a recommendation system? How do we build and deploy systems for real-time inference?
Reading research papers about recommendation systems will often give the impression that they’re built from a handful of math equations, and that the really hard work lies in connecting those equations to the features of your problem. More realistically, the first several steps of building a production recommendation system fall under systems engineering. Understanding how your data will make it into your system, be manipulated into the correct structure, and then be available at each relevant step of the training flow often constitutes the bulk of the initial work on a recommendation system. Even beyond this initial phase, ensuring that all the necessary components are fast and robust enough for production environments requires another significant investment in platform infrastructure.
Often you’ll build a component responsible for processing the various types of data and storing them in a convenient format. Next, you’ll construct a model that takes that data and encodes it in a latent space or another representation. Finally, you’ll need to transform an incoming request into a query in that same representation space. These steps usually take the form of jobs in a workflow management platform or services deployed as endpoints. The next few chapters will walk you through ...
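
To make the three steps above concrete, here is a minimal sketch in Python with NumPy. It is an illustration under stated assumptions, not the approach the following chapters necessarily take: the event log, the function names (`build_matrix`, `encode_items`, `query_vector`, `retrieve`), and the choice of a truncated SVD as the encoder are all hypothetical stand-ins. In production, each function would more likely be a scheduled job or a deployed service than an in-process call.

```python
import numpy as np

# Hypothetical raw interaction log: (user_id, item_id, rating).
RAW_EVENTS = [
    ("u1", "a", 5.0), ("u1", "b", 4.0),
    ("u2", "a", 4.0), ("u2", "b", 5.0),
    ("u3", "c", 3.0), ("u3", "d", 2.0),
    ("u4", "c", 2.0), ("u4", "d", 3.0),
]


def build_matrix(events):
    """Step 1: process raw data into a convenient format (user-item matrix)."""
    users = sorted({u for u, _, _ in events})
    items = sorted({i for _, i, _ in events})
    u_idx = {u: k for k, u in enumerate(users)}
    i_idx = {i: k for k, i in enumerate(items)}
    matrix = np.zeros((len(users), len(items)))
    for u, i, r in events:
        matrix[u_idx[u], i_idx[i]] = r
    return matrix, users, items


def encode_items(matrix, dim):
    """Step 2: encode items in a latent space (here, via a truncated SVD)."""
    _, s, vt = np.linalg.svd(matrix, full_matrices=False)
    # Keep the top `dim` right singular vectors, scaled by singular value.
    return (vt[:dim] * s[:dim, None]).T  # shape: (n_items, dim)


def query_vector(seen_item_ids, items, item_vecs):
    """Step 3: turn a request (recently seen items) into a latent-space query."""
    idx = [items.index(i) for i in seen_item_ids]
    return item_vecs[idx].mean(axis=0)


def retrieve(query, item_vecs, items, k, exclude=()):
    """Score every item against the query and return the top-k item IDs."""
    scores = item_vecs @ query
    ranked = [items[j] for j in np.argsort(-scores) if items[j] not in exclude]
    return ranked[:k]


matrix, users, items = build_matrix(RAW_EVENTS)
item_vecs = encode_items(matrix, dim=2)
query = query_vector(["a"], items, item_vecs)
recommended = retrieve(query, item_vecs, items, k=1, exclude={"a"})
print(recommended)
```

The scoring step here is a brute-force dot product over every item, which is only workable for a toy catalog; the point of the sketch is the shape of the data flow from raw events, to latent representation, to query.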