Data Smart: Using Data Science to Transform Information into Insight

by John W. Foreman

November 2013

Beginner to intermediate

432 pages

10h 39m

English

Wiley

Audiobook available

What Am I Doing Here? | A Workable Definition of Data Science | But Wait, What about Big Data? | Who Am I? | Who Are You? | No Regrets. Spreadsheets Forever | Conventions | Let's Get Going
Some Sample Data | Moving Quickly with the Control Button | Copying Formulas and Data Quickly | Formatting Cells | Paste Special Values | Inserting Charts | Locating the Find and Replace Menus | Formulas for Locating and Pulling Values | Using VLOOKUP to Merge Data | Filtering and Sorting | Using PivotTables | Using Array Formulas | Solving Stuff with Solver | OpenSolver: I Wish We Didn't Need This, but We Do | Wrapping Up

Girls Dance with Girls, Boys Scratch Their Elbows | Getting Real: K-Means Clustering Subscribers in E-mail Marketing | K-Medians Clustering and Asymmetric Distance Measurements | Wrapping Up
When You Name a Product Mandrill, You're Going to Get Some Signal and Some Noise | The World's Fastest Intro to Probability Theory | Using Bayes Rule to Create an AI Model | Let's Get This Excel Party Started | Wrapping Up
Why Should Data Scientists Know Optimization? | Starting with a Simple Trade-Off | Fresh from the Grove to Your Glass...with a Pit Stop through a Blending Model | Modeling Risk | Wrapping Up
What Is a Network Graph? | Visualizing a Simple Graph | Brief Introduction to Gephi | Building a Graph from the Wholesale Wine Data | How Much Is an Edge Worth? Points and Penalties in Graph Modularity | Let's Get Clustering! | There and Back Again: A Gephi Tale | Wrapping Up
Wait, What? You're Pregnant? | Don't Kid Yourself | Predicting Pregnant Customers at RetailMart Using Linear Regression | Predicting Pregnant Customers at RetailMart Using Logistic Regression | For More Information | Wrapping Up
Using the Data from Chapter 6 | Bagging: Randomize, Train, Repeat | Boosting: If You Get It Wrong, Just Boost and Try Again | Wrapping Up
The Sword Trade Is Hopping | Getting Acquainted with Time Series Data | Starting Slow with Simple Exponential Smoothing | You Might Have a Trend | Holt's Trend-Corrected Exponential Smoothing | Multiplicative Holt-Winters Exponential Smoothing | Wrapping Up
Outliers Are (Bad?) People, Too | The Fascinating Case of Hadlum v. Hadlum | Terrible at Nothing, Bad at Everything | Wrapping Up
Getting Up and Running with R | Doing Some Actual Data Science | Wrapping Up
Where Am I? What Just Happened? | Before You Go-Go | Get Creative and Keep in Touch!

Content preview from Data Smart: Using Data Science to Transform Information into Insight

Chapter 2. Cluster Analysis Part I: Using K-Means to Segment Your Customer Base

I work in the e-mail marketing industry for a website called MailChimp.com. We help customers send e-mail newsletters to their audience, and every time someone uses the term “e-mail blast,” a little part of me dies.

Why? Because e-mail addresses are no longer black boxes that you lob “blasts” at like flash grenades. No, in e-mail marketing (as with many other forms of online engagement, including tweets, Facebook posts, and Pinterest campaigns), a business receives feedback on how their audience is engaging at the individual level through click tracking, online purchases, social sharing, and so on. This data is not noise. It characterizes your audience. But to the uninitiated, it might as well be Greek. Or Esperanto.

How do you take a bunch of transactional data from your customers (or audience, users, subscribers, citizens, and so on) and use it to understand them? When you're dealing with lots of people, it's hard to understand each customer personally, especially if they all have their own different ways in which they've engaged with you. Even if you could understand everyone at a personal level, that can be tough to act on.

You need to take this customer base and find a happy medium between “blasting” everyone as if they were the same faceless entity and understanding everything about everyone to create personalized marketing for each individual recipient. One way to strike this balance is to use clustering ...
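To make the idea of clustering a customer base concrete, here is a minimal sketch of k-means in plain Python. It is an illustration only, not the book's method (the book works in spreadsheets). The subscriber rows, the function names, and the farthest-first way of picking starting centroids are all assumptions made for this example.

```python
def sq_dist(p, q):
    """Squared Euclidean distance between two equal-length tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def farthest_first(points, k):
    """Deterministic starting centroids (an assumption for this sketch):
    begin with the first row, then repeatedly add the row farthest from
    the centroids chosen so far."""
    centroids = [points[0]]
    while len(centroids) < k:
        centroids.append(
            max(points, key=lambda p: min(sq_dist(p, c) for c in centroids))
        )
    return centroids


def kmeans(points, k, iterations=10):
    """Alternate between assigning rows to the nearest centroid and
    moving each centroid to the mean of its assigned rows."""
    centroids = farthest_first(points, k)
    assignments = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: nearest centroid for every row.
        for i, p in enumerate(points):
            distances = [sq_dist(p, c) for c in centroids]
            assignments[i] = distances.index(min(distances))
        # Update step: each centroid becomes the column-wise mean of its members.
        for j in range(k):
            members = [p for p, a in zip(points, assignments) if a == j]
            if members:  # an empty cluster keeps its previous centroid
                centroids[j] = tuple(sum(col) / len(members) for col in zip(*members))
    return assignments, centroids


# Hypothetical data: one row per subscriber, one 0/1 column per campaign clicked.
subscribers = [
    (1, 1, 0, 0),
    (1, 1, 0, 0),
    (1, 0, 0, 0),
    (0, 0, 1, 1),
    (0, 0, 1, 1),
    (0, 0, 0, 1),
]
assignments, centroids = kmeans(subscribers, k=2)
print(assignments)
```

Each centroid can be read as a rough profile of a segment: a column value near 1 means most subscribers in that group clicked that campaign. That sits between treating everyone identically and modeling each person separately.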

Publisher Resources

ISBN: 9781118661468
