Chapter 4

Information Management

Information management is both the foundation and the plumbing of a house. Who wants to think about information management? Not many people. But as with a house, if the foundation isn’t solid, you’ll have all sorts of structural problems that can bring it crashing down. At the very least, you’ll run into annoying problems that limit your ability to live comfortably. And just as you would when contracting to have a house built, you want to invest in experienced builders who know how to build a foundation that meets today’s needs and anticipates future ones, so you can expand easily.

The Big Data Foundation

As in the house analogy, the Big Data foundation is composed of two major systems: the first stores the data, and the second processes it.

Big Data storage is often treated as synonymous with the Hadoop Distributed File System (HDFS), but traditional data warehouses can also house Big Data. HDFS is distributed data storage that has become the de facto standard because it lets you store any type of data, with no restrictions on its type or amount. One reason HDFS has become so popular is that you don’t have to do any “set up” before storing data. Traditional databases, by contrast, require quite a bit of “set up”: you have to understand the data that will be housed in the database and prepare the database by creating a schema. The schema is the blueprint for how you’ll place data into tables with columns. ...
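
The contrast between the two storage styles can be sketched in a few lines of Python. This is only an illustration under stated assumptions: an in-memory SQLite database stands in for a traditional database, a local temporary directory stands in for HDFS (no Hadoop API is used), and the table, file names, and records are all made up for the example.

```python
import json
import sqlite3
import tempfile
from pathlib import Path

# Traditional database: the schema (the "blueprint") must exist
# before a single row can be stored.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT NOT NULL, city TEXT)"
)
conn.execute(
    "INSERT INTO customers (id, name, city) VALUES (?, ?, ?)",
    (1, "Ada", "London"),
)

# A record carrying a field the schema does not define is rejected.
try:
    conn.execute(
        "INSERT INTO customers (id, name, twitter) VALUES (?, ?, ?)",
        (2, "Bob", "@bob"),
    )
    rejected = False
except sqlite3.OperationalError:
    rejected = True

# File-style storage (a local temp directory standing in for HDFS):
# no "set up" step, and files of any shape can sit side by side.
store = Path(tempfile.mkdtemp())
(store / "customers.json").write_text(
    json.dumps({"id": 2, "name": "Bob", "twitter": "@bob"})
)
(store / "clickstream.log").write_text("2013-01-01 GET /index.html\n")

# Structure is applied only when the data is read back.
record = json.loads((store / "customers.json").read_text())
row_count = conn.execute("SELECT COUNT(*) FROM customers").fetchone()[0]

print("insert rejected by schema:", rejected)
print("rows in table:", row_count)
print("files in store:", sorted(p.name for p in store.iterdir()))
```

The database refuses the second record because its shape was never declared, while the file store accepts both a JSON document and a raw log line without any preparation. The trade-off is that the work of interpreting the data moves from write time to read time.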
