Appendix B Recipes

B.1 Activity Tracking Recipe

B.1.1 Problem Statement

Activity tracking is an indispensable component for organizations. It provides the ability to learn about users and deliver a personalized experience to them. User activity tracking can also help optimize key performance indicators (KPIs) for the organization. Activity tracking involves capturing user activities while the user is browsing the website or mobile app. Each activity results in one or more signals that are pushed down to the big data infrastructure. The response to these signals can be computed as they arrive or through offline processing. The response can be as simple as a score or a Boolean value that determines what the user experiences.

B.1.2 Design Approach

Computing different scores and rates for the user might require a mix of stream and batch processing. Some values can be computed on the fly. Others might require extensive batch pipelines. Considering both requirements, we divide the problem into two subproblems. The first is data ingestion and persistence. The second is computation.
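As a minimal sketch of this split, assume the value of interest is a per-user click-through rate. The same signals can drive an incremental update as each one arrives and a full recomputation over stored history. The names `Signal`, `StreamingCtr`, and `batch_ctr`, and the "view"/"click" signal kinds, are hypothetical and chosen only for illustration.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Signal:
    """One user activity signal (hypothetical shape for this sketch)."""
    user_id: str
    kind: str  # assumed to be "view" or "click"


class StreamingCtr:
    """Keeps per-user counters and updates the rate as each signal arrives."""

    def __init__(self):
        self.views = Counter()
        self.clicks = Counter()

    def update(self, signal):
        # Only the counters for one user change per signal.
        if signal.kind == "view":
            self.views[signal.user_id] += 1
        elif signal.kind == "click":
            self.clicks[signal.user_id] += 1
        return self.rate(signal.user_id)

    def rate(self, user_id):
        views = self.views[user_id]
        return self.clicks[user_id] / views if views else 0.0


def batch_ctr(signals):
    """Recomputes every user's rate from the full stored history."""
    views, clicks = Counter(), Counter()
    for signal in signals:
        if signal.kind == "view":
            views[signal.user_id] += 1
        elif signal.kind == "click":
            clicks[signal.user_id] += 1
    return {user: clicks[user] / count for user, count in views.items()}
```

The streaming path touches only a small amount of state per signal, while the batch path scans all stored signals. Values that need the full history, or joins with other data sets, would typically fall on the batch side.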

B.1.2.1 Data Ingestion

Data ingestion should support both streaming and batch processing needs. The signals should be fed into persistent storage with appropriate retention. Our proposed approach uses Kafka as the messaging layer, which stores signals for a certain amount of time. At the same time, Kafka feeds object storage for long‐term persistence. ...
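The two storage tiers can be illustrated with a toy in-memory model. This is not the Kafka API: `RetainedLog` is a hypothetical stand-in for a topic with time-based retention, a plain list stands in for object storage, and `archive_new_records` plays the role of whatever process copies records from the messaging layer into long-term storage.

```python
from collections import deque


class RetainedLog:
    """Toy stand-in for a messaging-layer topic with time-based retention."""

    def __init__(self, retention_seconds):
        self.retention_seconds = retention_seconds
        self._records = deque()  # (offset, timestamp, payload) in arrival order
        self._next_offset = 0

    def append(self, timestamp, payload):
        offset = self._next_offset
        self._records.append((offset, timestamp, payload))
        self._next_offset += 1
        return offset

    def expire(self, now):
        # Drop records that have aged out of the retention window.
        while self._records and now - self._records[0][1] > self.retention_seconds:
            self._records.popleft()

    def read_from(self, offset):
        # Return (offset, payload) pairs still retained at or after `offset`.
        return [(o, p) for o, _, p in self._records if o >= offset]


def archive_new_records(log, archive, committed_offset):
    """Copies records not yet archived into long-term storage.

    Returns the new committed offset so the next call resumes from there.
    """
    for offset, payload in log.read_from(committed_offset):
        archive.append(payload)
        committed_offset = offset + 1
    return committed_offset
```

In this model, a record that expires from the log remains available in the archive only if it was copied before the retention window closed, so the copy process has to keep up with the retention period.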
