© Zubair Nabi 2016

Zubair Nabi, Pro Spark Streaming, 10.1007/978-1-4842-1479-4_8

8. Real-Time ETL and Analytics Magic

Zubair Nabi

(1) Lahore, Pakistan

When Jeff has trouble sleeping, he MapReduces sheep.

—Jeff Dean Facts

Data (big or otherwise) has been woven into the fabric of most businesses. The world is at a stage where Big Data directly drives corporate strategy. To maintain a competitive edge, most businesses try to run their analytics pipeline in near real-time. Although this captures the behavior of a large class of applications that rely on unstructured data, it is not exhaustive: a significant chunk of data sources are structured, and their analysis applications require data-warehousing capabilities. One way to handle these requirements ...
