Chapter 21

Big Data Analytics

In this chapter, we introduce big data in all its glory and show how it expands the mission of the DW/BI system. We conclude with a comprehensive list of big data best practices.

Chapter 21 discusses the following concepts:

  • Comparison of two architectural approaches for tackling big data analytics
  • Management, architecture, modeling, and governance best practices for dealing with big data

Big Data Overview

What is big data? Its bigness is actually not the most interesting characteristic. Big data is structured, semistructured, unstructured, and raw data in many different formats, in some cases looking totally different from the clean scalar numbers and text you have stored in your data warehouses for the last 30 years. Much big data cannot be analyzed with anything that looks like SQL. But most important, big data is a paradigm shift in how you think about data assets, where you collect them, how you analyze them, and how you monetize the insights from the analysis.

The big data movement has gathered momentum as a growing number of use cases have been recognized as big data analytics. These use cases include:

  • Search ranking
  • Ad tracking
  • Location and proximity tracking
  • Causal factor discovery
  • Social CRM
  • Document similarity testing
  • Genomics analysis
  • Cohort group discovery
  • In-flight aircraft status
  • Smart utility meters
  • Building sensors
  • Satellite image comparison
  • CAT scan comparison
  • Financial account fraud detection and intervention ...
