Data Science: The Hard Parts

by Daniel Vaughan
Released February 2024
Publisher(s): O'Reilly Media, Inc.
ISBN: 9781098146450

Book description

This practical guide provides a collection of techniques and best practices that are generally overlooked in most data engineering and data science pedagogy. A common misconception is that great data scientists are experts in the "big themes" of the discipline—machine learning and programming. But most of the time, these tools can only take us so far. In practice, the smaller tools and skills really separate a great data scientist from a not-so-great one.

Taken as a whole, the lessons in this book make the difference between an average data scientist candidate and a qualified data scientist working in the field. Author Daniel Vaughan has collected, extended, and used these skills to create value and train data scientists from different companies and industries.

With this book, you will:

  • Understand how data science creates value
  • Deliver compelling narratives to sell your data science project
  • Build a business case using unit economics principles
  • Create new features for an ML model using storytelling
  • Learn how to decompose KPIs
  • Perform growth decompositions to find root causes for changes in a metric

Daniel Vaughan is head of data at Clip, the leading paytech company in Mexico. He's the author of Analytical Skills for AI and Data Science (O'Reilly).

Table of contents

  1. 1. So What? Creating Value With Data Science
    1. What Is Value?
    2. What: Understanding the Business
    3. So What: the Gist of Value Creation in DS
    4. Now What: Be a Go-getter
    5. Measuring Value
    6. Key Takeaways
    7. Further Reading
  2. 2. Metrics Design
    1. Desirable Properties That Metrics Should Have
      1. Measurable
      2. Actionable
      3. Relevance
      4. Timeliness
    2. Metrics Decomposition
      1. Funnel Analytics
      2. Stock-flow Decompositions
      3. PxQ-type Decompositions
    3. Example: Another Revenue Decomposition
    4. Example: Marketplaces
    5. Key Takeaways
    6. Further Reading
  3. 3. Growth Decompositions: Understanding Tail and Headwinds
    1. Why Growth Decompositions
    2. Additive Decomposition
      1. Example
      2. Interpretation and Use Cases
    3. Multiplicative Decomposition
      1. Example
      2. Interpretation
    4. Mix-rate Decompositions
      1. Example
      2. Interpretation
    5. Mathematical Derivations
      1. Additive Decomposition
      2. Multiplicative Decomposition
      3. Mix-Rate Decomposition
    6. Key Takeaways
    7. Further Reading
  4. 4. 2x2 Designs
    1. The Case for Simplification
    2. What’s a 2x2 Design
    3. Example: Test a Model and a New Feature
    4. Example: Understanding User Behavior
    5. Example: Credit Origination and Acceptance
    6. Example: Prioritizing Your Workflow
    7. Key Takeaways
    8. Further Reading
  5. 5. Building Business Cases
    1. Some Principles to Construct Business Cases
    2. Example: Proactive Retention Strategy
    3. Fraud Prevention
    4. Purchasing External Datasets
    5. Working on a Data Science Project
    6. Key Takeaways
    7. Further Reading
  6. 6. What’s In a Lift?
    1. Lifts Defined
    2. Example: Classifier Model
    3. Self-selection and Survivorship Biases
    4. Other Examples
    5. Key Takeaways
    6. Further Reading
  7. 7. Narratives
    1. What’s In a Narrative: Telling a Story With Your Data
      1. Clear and to the Point
      2. Credible
      3. Memorable
      4. Actionable
    2. Building a Narrative
      1. Science as Storytelling
      2. What, So What and Now What
    3. The Last Mile
      1. Writing TL;DRs
      2. Tips to Write Memorable TL;DRs
      3. Delivering Powerful Elevator Pitches
      4. Presenting Your Narrative
    4. Key Takeaways
    5. Further Reading