CHAPTER 4 Big Data Tasks

After defining the sources of big data in Chapter 2, we introduce the most important and challenging tasks and problems we want to solve. From search, through social media analysis, to smart grid control, today's real-life systems require new approaches to handling big data. Those who neglect to implement big data techniques will either fail to solve problems that grow day by day or have to give ground to competitors capable of disrupting the status quo.

In the following sections, we go through some of the most challenging big data tasks in various branches of industry and science. This prepares us for Chapter 6, where we dive into architectures capable of tackling those tasks. The selection of tasks is subjective and does not cover all branches of business and science, but I believe it gives a good overview of the problems of handling huge data sets in practice.

4.1 Recommender Systems

Recommender systems are one of the key e-commerce tools for increasing revenue by providing personalized offers to users. In 2006, Netflix was already willing to pay US$1 000 000 in a competition to predict user ratings for films (Bennett et al. [2007]). Since then, several algorithms and systems have been developed, and today one can choose among multiple solutions, libraries, and even Recommendation as a Service offerings.

At the time of the Netflix Prize, the most important group of recommender methods was collaborative filtering ...
