CHAPTER 2

Demystifying Big Data

I know it when I see it.

—Supreme Court Justice Potter Stewart on his threshold test for possible obscenity. Concurring opinion, Jacobellis v. Ohio (1964)

The preceding quote is perhaps one of the most famous lines from any U.S. Supreme Court opinion. Stewart’s line “I know it when I see it” long ago entered the vernacular, at least in the United States, and it has been applied to myriad scenarios. Those seven words illustrate a number of things, not the least of which is the difficulty that even really smart people have in defining ostensibly simple terms.

Fast forward nearly fifty years, and many learned folks are having the same issue with respect to Big Data. Just what the heck is it, anyway? As with the term cloud computing, you can search in vain for weeks for the “perfect” definition of Big Data. Douglas Laney (then with the META Group, now with Gartner) fired the first shot in early 2001. Laney wrote about the growth challenges and opportunities facing organizations with respect to increasing amounts of data.1 Years before the term Big Data was de rigueur, Laney defined three primary dimensions of the Data Deluge:

  • Volume: the increasing amount of data
  • Variety: the increasing range of data types and sources
  • Velocity: the increasing speed at which data is created and must be processed

Laney’s three v’s stuck, and today most people familiar with Big Data have heard of them. That’s a far cry from saying, however, that everyone agrees on the proper definition of Big Data. Just about ...
