Chapter 2. Data and Data Infrastructure
A Brief History of Data
The nature of data has changed dramatically over the past three decades. In the 1990s, the data that most enterprises used for business intelligence was transactional, generated by business processes and business applications such as Enterprise Resource Planning (ERP) and Customer Relationship Management (CRM) systems. This structured data was stored in data warehouses, Online Transaction Processing (OLTP) systems, Oracle and Teradata databases, and other conventional data repositories.
The need to manage transactional data dictated the way we built data infrastructures until the advent of the internet. At that point we started to see interaction data: data generated by interactions between people or between machines. This semi-structured or unstructured data included web pages and the various types of social media, which were generated and consumed by people rather than machines. Music, video, pictures, and social media comments all fall into this category.
Then sensors began to play the interaction game, and machines started interacting with other machines and with people. This type of interaction data was created primarily by machines monitoring various aspects of their environments: servers, networks, thermostats, lights, fitness devices, and so forth.
If we think back again to Gartner’s Three Vs of big data—volume, velocity, ...