Many real-life problems can be solved on a single machine, provided it has enough computational power; in some cases, however, the dataset is too large to fit in memory, and in-memory operations become impossible. We saw an example of such a scenario in Chapter 12, Introducing Recommendation Systems, when discussing the Alternating Least Squares (ALS) strategy using Spark. In that case, the user-product matrix can become extremely large, and the only practical way to solve the problem is to employ a distributed architecture. A generic schema is shown in the following diagram:
A brief introduction to distributed architectures
Generic distributed architecture ...
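To make the idea more concrete, the following is a minimal sketch of the pattern that such architectures commonly rely on: the data is split into partitions, each worker computes a partial result on its own partition, and a coordinator combines the partial results. It assumes a simple coordinator/worker layout, which may differ from the one in the diagram, and it uses local threads as stand-ins for cluster nodes. The function names (`split_into_partitions`, `partial_stats`, `distributed_mean`) are hypothetical and are not part of Spark or any other library.

```python
from concurrent.futures import ThreadPoolExecutor


def split_into_partitions(data, n_partitions):
    """Split a sequence into n_partitions nearly equal contiguous chunks."""
    size, remainder = divmod(len(data), n_partitions)
    partitions, start = [], 0
    for i in range(n_partitions):
        # The first `remainder` partitions receive one extra element
        end = start + size + (1 if i < remainder else 0)
        partitions.append(data[start:end])
        start = end
    return partitions


def partial_stats(partition):
    """Worker task: return (sum, count) computed on a single partition."""
    return sum(partition), len(partition)


def distributed_mean(data, n_workers=4):
    """Coordinator: dispatch the partitions and combine the partial results."""
    partitions = split_into_partitions(data, n_workers)

    # Threads simulate the workers; in a real cluster each partition
    # would live on a different node and never be gathered in one place
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        partials = list(pool.map(partial_stats, partitions))

    total = sum(s for s, _ in partials)
    count = sum(c for _, c in partials)
    return total / count


if __name__ == "__main__":
    print(distributed_mean(list(range(1, 101))))
```

The key point is that only the small partial results (a sum and a count per partition) travel back to the coordinator, so no single machine ever needs the whole dataset in memory. Frameworks such as Spark apply the same principle to much more complex computations, including the ALS factorization mentioned above.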