Developing Analytic Talent: Becoming a Data Scientist
by Vincent Granville
April 2014
336 pages
English
Wiley

CHAPTER 2

Big Data Is Different

In Chapter 1, you considered what data science is and is not, and saw how data science is more than data analysis, computer science, or statistics. This chapter further explores data science as a new discipline.

The chapter begins by considering two of the most important issues associated with big data. Then it works through some real-life examples of big data techniques, and considers some of the communication issues involved in an effective big data team environment. Finally, it considers how statistics is and will be part of data science, and touches on the elements of the big data ecosystem.

Two Big Data Issues

Two issues associated with big data must be understood: the “curse” of big data and rapid data flow. The following sections discuss each in turn.

The Curse of Big Data

The “curse” of big data is the danger involved in recklessly applying and scaling data science techniques that have worked well for small, medium, and large data sets, but don't necessarily work well for big data. This problem is well illustrated by the flaws found in big data trading (for which solutions are proposed in this chapter).

In short, the curse of big data is that when you search for patterns in large data sets with billions or trillions of data points and thousands of metrics, you are bound to identify coincidences that have no predictive power. Even worse, the strongest patterns might

  • Be caused entirely by chance (like ...
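
The coincidence effect described above can be sketched with a small simulation. This is an illustration under stated assumptions, not the author's method: every metric is pure Gaussian noise, independent of the target, and the helper names `pearson` and `strongest_spurious_correlation` are hypothetical. The more metrics you scan, the stronger the best "pattern" looks, even though none of them has any predictive power.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def strongest_spurious_correlation(n_metrics, n_points, seed=0):
    """Largest |correlation| between a random target and n_metrics unrelated random metrics.

    Assumption for illustration: target and metrics are independent
    standard-normal noise, so any correlation found is pure chance.
    """
    rng = random.Random(seed)
    target = [rng.gauss(0, 1) for _ in range(n_points)]
    best = 0.0
    for _ in range(n_metrics):
        metric = [rng.gauss(0, 1) for _ in range(n_points)]
        best = max(best, abs(pearson(metric, target)))
    return best


if __name__ == "__main__":
    # Scan more and more noise metrics against the same 20-point target.
    for n_metrics in (10, 100, 1000):
        score = strongest_spurious_correlation(n_metrics, n_points=20)
        print(n_metrics, round(score, 3))
```

Because the seed is fixed, each larger scan contains the smaller one as a prefix, so the reported maximum can only grow as metrics are added. A correlation threshold that seems demanding for a handful of metrics is routinely exceeded by chance once thousands are tested.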
ISBN: 9781118810088