8. Putting It Together: MapReduce Data Pipelines
It’s kind of fun to do the impossible.
Human brains aren’t very good at keeping track of millions of separate data points, but we know that there is lots of data out there, just waiting to be collected, analyzed, and visualized. To cope with the complexity, we create metaphors to wrap our heads around the problem. Do we need to store millions of records until we figure out what to do with them? Let’s file them away in a data warehouse. Do we need to analyze a billion data points? Let’s crunch them down into something more manageable.
No longer should we be satisfied with just storing data and chipping away little bits of it to study. Now that distributed computational tools are becoming ...