Chapter 2. Data and Data Infrastructure

A Brief History of Data

The nature of data has changed dramatically over the past three decades. In the 1990s, the data that most enterprises used for business intelligence was transactional: it was generated by business processes and business applications such as Enterprise Resource Planning (ERP) and Customer Relationship Management (CRM) systems. This structured data lived in data warehouses, Online Transaction Processing (OLTP) systems, Oracle and Teradata databases, and other conventional data repositories.

The need to manage transactional data dictated the way we built data infrastructures until the advent of the internet, when we started to see interaction data: data generated by interactions between people or between machines. This semi-structured or unstructured data included web pages and the various types of social media, which were generated and consumed by people rather than machines. Music, video, pictures, social media comments, and similar content fall into this category.

Then sensors joined the interaction game, with machines interacting with other machines or with people. This type of interaction data was created primarily by machines monitoring various aspects of their environments: servers, networks, thermostats, lights, fitness devices, and so forth.

If we think back again to Gartner’s Three Vs of big data—volume, velocity, ...
