Chapter 6. Big Data Processing Concepts

Parallel Data Processing

Distributed Data Processing

Hadoop

Processing Workloads

Cluster

Processing in Batch Mode

Processing in Realtime Mode

The need to process large volumes of data is not new. The relationship between a data warehouse and its associated data marts shows that partitioning a large dataset into smaller ones can speed up processing. Big Data datasets stored on distributed file systems or within a distributed database are already partitioned into smaller datasets. The key to understanding Big Data processing is the realization that unlike centralized processing, ...
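The idea that a partitioned dataset can be processed piece by piece, with the partial results combined afterwards, can be illustrated with a minimal Python sketch. Everything here is an assumption made for illustration and does not come from the book: the helper names `partition`, `process_partition`, and `process_in_partitions` are hypothetical, the "processing" is a simple sum, and a thread pool stands in for what would be separate processes or cluster nodes in a real system.

```python
from concurrent.futures import ThreadPoolExecutor


def partition(records, num_partitions):
    """Split records into num_partitions contiguous, roughly equal chunks."""
    size, remainder = divmod(len(records), num_partitions)
    chunks, start = [], 0
    for i in range(num_partitions):
        # The first `remainder` chunks each take one extra record.
        end = start + size + (1 if i < remainder else 0)
        chunks.append(records[start:end])
        start = end
    return chunks


def process_partition(chunk):
    """Hypothetical per-partition work: sum the values in one chunk."""
    return sum(chunk)


def process_in_partitions(records, num_partitions=4):
    """Process each partition independently, then combine the partial results."""
    chunks = partition(records, num_partitions)
    # Threads are used only to keep the sketch simple; they show the structure,
    # not a realistic speedup for CPU-bound work in Python.
    with ThreadPoolExecutor(max_workers=num_partitions) as pool:
        partial_sums = list(pool.map(process_partition, chunks))
    return sum(partial_sums)


dataset = list(range(1, 101))
total = process_in_partitions(dataset)
print(total)
```

Because each partition is handled without reference to the others, the same structure would carry over to workers on different machines, which is the situation the paragraph above describes for data already partitioned across a distributed file system or database.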
