We are up to our ears in data, but how much can this raw material really tell us? What actually makes it predictive? What are the most bizarre discoveries from data? When we find an interesting insight, why are we often better off not asking why? In what way is bigger data more dangerous? How do we avoid being fooled by random noise and ensure scientific discoveries are trustworthy?
Spotting the big data tsunami, analytics enthusiasts exclaim, “Surf's up!”
We've entered the golden age of predictive discoveries. A frenzy of number crunching churns out a bonanza of colorful, valuable, and sometimes surprising insights:¹
- People who “like” curly fries on Facebook are more intelligent.
- Typing with proper capitalization indicates creditworthiness.
- Users of the Chrome and Firefox browsers make better employees.
- Men who skip breakfast are at greater risk for coronary heart disease.
- The demand for Pop-Tarts spikes before a hurricane.
- Female-named hurricanes are more deadly.
- High-crime neighborhoods demand more Uber rides.
A Cautionary Tale: Orange Lemons
Look like fun? Before you dive in, be warned: This spree of data exploration must be tamed with strict quality control. It's easy to get it wrong and end up with egg on your face.
In 2012, a Seattle Times article led with an eye-catching predictive discovery: “An orange used ...