Big Data Fundamentals: Concepts, Drivers & Techniques by Paul Buhler, Wajid Khattak, Thomas Erl

Chapter 5. Big Data Storage Concepts

Clusters

File Systems and Distributed File Systems

NoSQL

Sharding

Replication

Sharding and Replication

CAP Theorem

ACID

BASE

Data acquired from external sources is often not in a format or structure that can be directly processed. To overcome these incompatibilities and prepare data for storage and processing, data wrangling is necessary. Data wrangling includes steps to filter, cleanse and otherwise prepare the data for downstream analysis. From a storage perspective, a copy of the data is first stored in its acquired format, and, after wrangling, the prepared data needs to be stored again. Typically, storage ...
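
The storage pattern described above (keep a copy of the data in its acquired format, wrangle it, then store the prepared result) can be sketched roughly as follows. This is a minimal illustration rather than the book's own method: the CSV layout, the column names `name` and `reading`, the helper names `wrangle` and `store_both`, and the use of an in-memory dict as the "storage" are all assumptions made for the example.

```python
import csv
import io


def wrangle(raw_text):
    """Filter and cleanse raw CSV text; return a list of prepared rows.

    The column names "name" and "reading" are hypothetical.
    """
    prepared = []
    for row in csv.DictReader(io.StringIO(raw_text)):
        name = (row.get("name") or "").strip()
        reading = (row.get("reading") or "").strip()
        # Filter: drop rows with a missing field.
        if not name or not reading:
            continue
        # Cleanse: drop rows whose reading is not numeric.
        try:
            value = float(reading)
        except ValueError:
            continue
        # Normalize the name so downstream analysis sees one spelling.
        prepared.append({"name": name.lower(), "reading": value})
    return prepared


def store_both(raw_text, store):
    """Store the acquired copy first, then the wrangled copy.

    `store` is a plain dict standing in for a real storage layer.
    """
    store["raw"] = raw_text                 # copy in its acquired format
    store["prepared"] = wrangle(raw_text)   # copy after wrangling
    return store


# Hypothetical sample input with one blank name and one bad reading.
RAW = "name,reading\n Sensor-A ,21.5\n,19.0\nsensor-b,n/a\nSENSOR-C,18\n"
store = store_both(RAW, {})
```

Keeping the raw copy untouched means the wrangling step can be rerun with different rules later without going back to the external source.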
