Big Data: Principles and best practices of scalable realtime data systems by Nathan Marz with James Warren

Chapter 2. Data model for Big Data

This chapter covers

  • Properties of data
  • The fact-based data model
  • Benefits of a fact-based model for Big Data
  • Graph schemas

In the last chapter you saw what can go wrong when using traditional tools for building data systems, and we went back to first principles to derive a better design. You saw that every data system can be formulated as computing functions on data, and you learned the basics of the Lambda Architecture, which provides a practical way to implement an arbitrary function on arbitrary data in real time.

At the core of the Lambda Architecture is the master dataset, which is highlighted in figure 2.1. The master dataset is the source of truth in the Lambda Architecture. Even if you were ...
