Developing Analytic Talent: Becoming a Data Scientist

by Vincent Granville
April 2014
Beginner
336 pages
English
Wiley

CHAPTER 5

Data Science Craftsmanship, Part II

In the previous chapter, you discovered a number of data science techniques and recipes, including visualizing data with data videos, new types of metrics, computer science topics, and questions to ask when choosing a vendor, as well as a comparison between data scientists, statisticians, and data engineers.

In this chapter, you consider material that is less focused on metrics and more focused on applications. It includes discussions on how to create a data dictionary, hidden decision trees, hash joins in the context of NoSQL databases, and the first Analyticbridge Theorem, which provides a simple, model-free, nonparametric way to compute confidence intervals without statistical theory or knowledge.

This chapter is less oriented toward statistical theory than the previous one. Its topics are typically classified as data analysis rather than statistical or computer analysis. Most of the material has not been published before. Case studies, applications, and success stories are discussed in the next chapter.

The topics discussed in this chapter, such as hidden decision trees, data dictionaries, and hash joins, are important subjects for data scientists because they are at the intersection of statistics and computer science, and are designed to handle big data. Traditional statisticians typically don't learn or use these techniques, but data scientists do.

Data Dictionary

One of the most valuable tools when ...


ISBN: 9781118810088