Part II. Tactics: Analytic Patterns
Now that you’ve met the fundamental analytic machinery (in both its MapReduce and table-operation form), it’s time to put it to work.
This part of the book will equip you to think tactically (i.e., in terms of the changes you would like to make to the data). Each chapter introduces a repeatedly useful data transformation pattern, demonstrated in Pig (and, where we’d like to reinforce the record-by-record action, in Python as well).
One of this book’s principles is to center demonstrations on an interesting and realistic problem from some domain. And whenever possible, we endeavor to indicate how the approach would extend to other domains, especially ones with an obvious business focus. The tactical patterns, however, are exactly those tools that crop up in nearly every domain: think of them as the screwdriver, torque wrench, lathe, and so forth of your toolkit. Now, if this book were called Big Mechanics for Chimps, we might introduce those tools by repairing and rebuilding a Volkswagen Beetle engine, or by building another lathe from scratch. Those lessons would carry over to anywhere machine tools apply: air conditioner repair, fixing your kid’s bike, or building a rocketship to Mars.
So we will focus this part of the book on the dataset we just introduced, what Nate Silver calls “the perfect dataset”: the sea of numbers surrounding the sport of baseball. The members of the Retrosheet and Baseball Databank projects have provided ...