Chapter 2. Data Issues

Why the term “data science” is flawed but useful

Counterpoints to four common data science criticisms.

by Pete Warden

Mention “data science” to many of the high-profile people you might think practice it and you’re likely to see rolling eyes and shaking heads. It has taken me a while, but I’ve learned to love the term, despite my doubts. The key reason is that the rest of the world understands roughly what I mean when I use it. After years of stumbling through long-winded explanations of what I do, I can now say “I’m a data scientist” and move on. The definition is still incredibly hazy, but my former descriptions left people confused as well, so this approach is no worse and at least saves time.

With that in mind, here are the arguments I’ve heard against the term, and why I don’t think they should stop its adoption.

It’s not a real science

I just finished reading “The Philosophical Breakfast Club,” the story of four Victorian friends who created the modern structure of science and invented the word “scientist.” I grew up with the idea that physics, chemistry and biology were the only real sciences and that every other subject using the term was just stealing their clothes (“Anything that needs science in the name is not a real science”). The book shows that, from the beginning, the label was never restricted to just the hard experimental sciences. It was chosen to promote a disciplined approach to reasoning that relied on data rather than the poorly supported ...
