Developing Analytic Talent: Becoming a Data Scientist

Author: Vincent Granville
ISBN: 9781118810088

April 2014

Beginner

336 pages

8h 49m

English

Wiley

Cover Page
Title Page
Copyright
Dedication
About the Author
About the Technical Editor
Credits
Acknowledgments
CHAPTER 1: What Is Data Science?
  Real Versus Fake Data Science
  The Data Scientist
  Data Science Applications in 13 Real-World Scenarios
  Data Science History, Pioneers, and Modern Trends
  Summary
CHAPTER 2: Big Data Is Different
  Two Big Data Issues
  Examples of Big Data Techniques
  What MapReduce Can't Do
  Communication Issues
  Data Science: The End of Statistics?
  The Big Data Ecosystem
  Summary

CHAPTER 3: Becoming a Data Scientist
  Key Features of Data Scientists
  Types of Data Scientists
  Data Scientist Demographics
  Training for Data Science
  Data Scientist Career Paths
  Summary
CHAPTER 4: Data Science Craftsmanship, Part I
  New Types of Metrics
  Choosing Proper Analytics Tools
  Visualization
  Statistical Modeling Without Models
  Three Classes of Metrics: Centrality, Volatility, Bumpiness
  Statistical Clustering for Big Data
  Correlation and R-Squared for Big Data
  Computational Complexity
  Structured Coefficient
  Identifying the Number of Clusters
  Internet Topology Mapping
  Securing Communications: Data Encoding
  Summary
CHAPTER 5: Data Science Craftsmanship, Part II
  Data Dictionary
  Hidden Decision Trees
  Model-Free Confidence Intervals
  Random Numbers
  Four Ways to Solve a Problem
  Causation Versus Correlation
  How Do You Detect Causes?
  Life Cycle of Data Science Projects
  Predictive Modeling Mistakes
  Logistic-Related Regressions
  Experimental Design
  Analytics as a Service and APIs
  Miscellaneous Topics
  New Synthetic Variance for Hadoop and Big Data
  Summary
CHAPTER 6: Data Science Application Case Studies
  Stock Market
  Encryption
  Fraud Detection
  Digital Analytics
  Miscellaneous
  Summary
CHAPTER 7: Launching Your New Data Science Career
  Job Interview Questions
  Testing Your Own Visual and Analytic Thinking
  From Statistician to Data Scientist
  Taxonomy of a Data Scientist
  400 Data Scientist Job Titles
  Salary Surveys
  Summary
CHAPTER 8: Data Science Resources
  Professional Resources
  Career-Building Resources
  Summary

Content preview from Developing Analytic Talent: Becoming a Data Scientist

CHAPTER 5

Data Science Craftsmanship, Part II

In the previous chapter, you discovered a number of data science techniques and recipes, including visualizing data with data videos, new types of metrics, computer science topics, and questions to ask when choosing a vendor, as well as a comparison between data scientists, statisticians, and data engineers.

In this chapter, you consider material that is less focused on metrics and more focused on applications. It includes discussions on how to create a data dictionary, hidden decision trees, hash joins in the context of NoSQL databases, and the first Analyticbridge Theorem, which provides a simple, model-free, nonparametric way to compute confidence intervals without statistical theory or knowledge.

This chapter is less oriented toward statistical theory than the previous one. Its topics are typically classified as data analysis rather than statistical or computer analysis. Most of the material has not been published before. Case studies, applications, and success stories are discussed in the next chapter.

The topics discussed in this chapter, such as hidden decision trees, data dictionaries, and hash joins, are important subjects for data scientists because they are at the intersection of statistics and computer science, and are designed to handle big data. Traditional statisticians typically don't learn or use these techniques, but data scientists do.

Data Dictionary

One of the most valuable tools when ...
