CHAPTER 4

Big Data Solutions

Data is not information, information is not knowledge, knowledge is not understanding, understanding is not wisdom.

—Clifford Stoll

From the last chapter, we know that Big Data is a bit of a catchall term. So where do we go from here? This chapter provides an overview of the most prevalent Big Data tools and resources at a reasonably high level. It will not delve into highly technical details, so don’t expect complicated schematics. This is a book about the business case for Big Data, not an implementation guide for any one application. As discussed in Chapter 1, old tools like relational database management systems (RDBMSs) just can’t efficiently handle Big Data. Different times call for different solutions, and it’s time to get familiar with Hadoop, NoSQL, columnar databases, and other emerging Big Data tools.

Note that this is the closest thing to a technical chapter in the entire book. Here I endeavor to keep things at a relatively high level so as not to inundate the reader with needless complexity. Yes, database schemas, nodes, clusters, in-memory databases, data compression, parallel processing, and other technical concepts are essential ideas that underpin Big Data. However, here they are intentionally kept to a bare minimum. This isn’t that type of book. The main point of this chapter is that Big Data encompasses a variety of new data sources and types, as well as increased data volumes and velocity. As such, to effectively utilize ...
