Chapter 6. Big Data Processing Concepts

Parallel Data Processing

Distributed Data Processing

Hadoop

Processing Workloads

Cluster

Processing in Batch Mode

Processing in Realtime Mode

The need to process large volumes of data is not new. The relationship between a data warehouse and its associated data marts shows that partitioning a large dataset into smaller ones can speed up processing. Big Data datasets stored on distributed file systems or within a distributed database are already partitioned into smaller datasets. The key to understanding Big Data processing is the realization that, unlike centralized processing, ...
