Data Science: The Hard Parts

by Daniel Vaughan
November 2023
Beginner to intermediate
254 pages
6h 43m
English
O'Reilly Media, Inc.

Chapter 13. Storytelling in Machine Learning

In Chapter 7, I argued that data scientists ought to become better storytellers. This holds true in general, but it takes on special importance with regard to machine learning (ML).

This chapter walks you through the main aspects of storytelling in ML, starting with feature engineering and finishing with the problem of interpretability.

A Holistic View of Storytelling in ML

Storytelling plays two related but distinct roles in ML (Figure 13-1). The better-known role is that of a salesperson, where you need to engage with an audience, possibly to gain or maintain stakeholder buy-in; this usually takes place after you’ve developed a model. The lesser-known role is that of a scientist, where you need to find hypotheses that will guide you throughout the process of developing the model.

Figure 13-1. Storytelling in ML

Since the salesperson role comes into play after you have developed your model, I call it ex post storytelling; your scientist persona is mostly invoked before (ex ante) and during (interim) the process of training the model.

Ex Ante and Interim Storytelling

Ex ante storytelling has four main steps: defining the problem, creating hypotheses, feature engineering, and training the model (Figure 13-2). While they usually flow in that direction, there’s a feedback loop between all of them, so it’s not uncommon that after you train a first model, ...

Publisher Resources

ISBN: 9781098146467