Data Science for Marketing Analytics

Book description

Explore new and more sophisticated tools that reduce the effort of marketing analytics and give you precise results

Key Features

  • Study new techniques for marketing analytics
  • Explore uses of machine learning to power your marketing analyses
  • Work through each stage of data analytics with the help of multiple examples and exercises

Book Description

Data Science for Marketing Analytics covers every stage of data analytics, from working with a raw dataset to segmenting a population and modeling different parts of the population based on the segments.

The book starts by teaching you how to use Python libraries, such as pandas and Matplotlib, to read data into Python, manipulate it, and create plots of both categorical and continuous variables. Then, you'll learn how to segment a population into groups using different clustering techniques. As you make your way through the chapters, you'll explore ways to evaluate and select the best segmentation approach, and go on to create a linear regression model on customer value data to predict lifetime value. In the concluding chapters, you'll gain an understanding of other regression techniques and tools for evaluating regression models, then build a churn model using classification algorithms. Finally, you'll apply these techniques to model customer product choices.

By the end of this book, you will be able to build your own marketing reporting and interactive dashboard solutions.

What you will learn

  • Analyze and visualize data in Python using pandas and Matplotlib
  • Study clustering techniques, such as hierarchical and k-means clustering
  • Create customer segments based on manipulated data
  • Predict customer lifetime value using linear regression
  • Use classification algorithms to understand customer choice
  • Optimize classification algorithms to extract maximal information
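
As a taste of the groundwork behind the clustering topics listed above, here is a minimal sketch (not taken from the book) of why customer attributes are usually standardized before distances are compared. The three customers and the helper names `standardize` and `scaled_distance` are hypothetical, and only the Python standard library is used.

```python
import math
import statistics

# Hypothetical customers as (age in years, income in dollars); illustrative data only.
customers = {
    "a": (25, 40_000),
    "b": (27, 90_000),
    "c": (60, 42_000),
}


def standardize(values):
    """Rescale values to zero mean and unit (population) standard deviation."""
    mean = statistics.mean(values)
    std = statistics.pstdev(values)
    return [(v - mean) / std for v in values]


names = list(customers)
scaled_ages = standardize([customers[n][0] for n in names])
scaled_incomes = standardize([customers[n][1] for n in names])
scaled = {n: (age, inc) for n, age, inc in zip(names, scaled_ages, scaled_incomes)}


def scaled_distance(x, y):
    """Euclidean distance between two customers on the standardized features."""
    return math.dist(scaled[x], scaled[y])


# On the raw data, income (tens of thousands) swamps age (tens),
# so customer "a" looks far closer to "c" than to "b".
raw_ab = math.dist(customers["a"], customers["b"])
raw_ac = math.dist(customers["a"], customers["c"])
print(f"raw:          a-b = {raw_ab:.1f}, a-c = {raw_ac:.1f}")

# After standardizing, age and income contribute on a comparable scale.
print(f"standardized: a-b = {scaled_distance('a', 'b'):.3f}, "
      f"a-c = {scaled_distance('a', 'c'):.3f}")
```

Distance-based methods such as k-means group customers by exactly this kind of distance, so the choice of scaling can change which customers end up in the same segment.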

Who this book is for

Data Science for Marketing Analytics is designed for developers and marketing analysts looking to use new, more sophisticated tools in their marketing analytics efforts. It'll help if you have prior experience coding in Python and knowledge of high-school-level mathematics. Some experience with databases, Excel, statistics, or Tableau is useful but not necessary.

Table of contents

  1. Preface
    1. About the Book
      1. About the Authors
      2. Objectives
      3. Audience
      4. Approach
      5. Minimum Hardware Requirements
      6. Software Requirements
      7. Conventions
      8. Installation and Setup
      9. Installing the Code Bundle
      10. Additional Resources
  2. Chapter 1
  3. Data Preparation and Cleaning
    1. Introduction
    2. Data Models and Structured Data
    3. pandas
      1. Importing and Exporting Data With pandas DataFrames
      2. Viewing and Inspecting Data in DataFrames
      3. Exercise 1: Importing JSON Files into pandas
      4. Exercise 2: Identifying Semi-Structured and Unstructured Data
      5. Structure of a pandas Series
    4. Data Manipulation
      1. Selecting and Filtering in pandas
      2. Creating Test DataFrames in Python
      3. Adding and Removing Attributes and Observations
      4. Exercise 3: Creating and Modifying Test DataFrames
      5. Combining Data
      6. Handling Missing Data
      7. Exercise 4: Combining DataFrames and Handling Missing Values
      8. Applying Functions and Operations on DataFrames
      9. Grouping Data
      10. Exercise 5: Applying Data Transformations
      11. Activity 1: Addressing Data Spilling
    5. Summary
  4. Chapter 2
  5. Data Exploration and Visualization
    1. Introduction
    2. Identifying the Right Attributes
      1. Exercise 6: Exploring the Attributes in Sales Data
    3. Generating Targeted Insights
      1. Selecting and Renaming Attributes
      2. Transforming Values
      3. Exercise 7: Targeting Insights for Specific Use Cases
      4. Reshaping the Data
      5. Exercise 8: Understanding Stacking and Unstacking
      6. Pivot Tables
    4. Visualizing Data
      1. Exercise 9: Visualizing Data With pandas
      2. Visualization through Seaborn
      3. Visualization with Matplotlib
      4. Activity 2: Analyzing Advertisements
    5. Summary
  6. Chapter 3
  7. Unsupervised Learning: Customer Segmentation
    1. Introduction
    2. Customer Segmentation Methods
      1. Traditional Segmentation Methods
      2. Unsupervised Learning (Clustering) for Customer Segmentation
    3. Similarity and Data Standardization
      1. Determining Similarity
      2. Standardizing Data
      3. Exercise 10: Standardizing Age and Income Data of Customers
      4. Calculating Distance
      5. Exercise 11: Calculating Distance Between Three Customers
      6. Activity 3: Loading, Standardizing, and Calculating Distance with a Dataset
    4. k-means Clustering
      1. Understanding k-means Clustering
      2. Exercise 12: k-means Clustering on Income/Age Data
      3. High-Dimensional Data
      4. Exercise 13: Dealing with High-Dimensional Data
      5. Activity 4: Using k-means Clustering on Customer Behavior Data
    5. Summary
  8. Chapter 4
  9. Choosing the Best Segmentation Approach
    1. Introduction
    2. Choosing the Number of Clusters
      1. Simple Visual Inspection
      2. Exercise 14: Choosing the Number of Clusters Based on Visual Inspection
      3. The Elbow Method with Sum of Squared Errors
      4. Exercise 15: Determining the Number of Clusters Using the Elbow Method
      5. Activity 5: Determining Clusters for High-End Clothing Customer Data Using the Elbow Method with the Sum of Squared Errors
    3. Different Methods of Clustering
      1. Mean-Shift Clustering
      2. Exercise 16: Performing Mean-Shift Clustering to Cluster Data
      3. k-modes and k-prototypes Clustering
      4. Exercise 17: Clustering Data Using the k-prototypes Method
      5. Activity 6: Using Different Clustering Techniques on Customer Behavior Data
    4. Evaluating Clustering
      1. Silhouette Score
      2. Exercise 18: Calculating Silhouette Score to Pick the Best k for k-means and Comparing to the Mean-Shift Algorithm
      3. Train and Test Split
      4. Exercise 19: Using a Train-Test Split to Evaluate Clustering Performance
      5. Activity 7: Evaluating Clustering on Customer Behavior Data
    5. Summary
  10. Chapter 5
  11. Predicting Customer Revenue Using Linear Regression
    1. Introduction
    2. Understanding Regression
    3. Feature Engineering for Regression
      1. Feature Creation
      2. Data Cleaning
      3. Exercise 20: Creating Features for Transaction Data
      4. Assessing Features Using Visualizations and Correlations
      5. Exercise 21: Examining Relationships between Predictors and Outcome
      6. Activity 8: Examining Relationships Between Storefront Locations and Features about Their Area
    4. Performing and Interpreting Linear Regression
      1. Exercise 22: Building a Linear Model Predicting Customer Spend
      2. Activity 9: Building a Regression Model to Predict Storefront Location Revenue
    5. Summary
  12. Chapter 6
  13. Other Regression Techniques and Tools for Evaluation
    1. Introduction
    2. Evaluating the Accuracy of a Regression Model
      1. Residuals and Errors
      2. Mean Absolute Error
      3. Root Mean Squared Error
      4. Exercise 23: Evaluating Regression Models of Location Revenue Using MAE and RMSE
      5. Activity 10: Testing Which Variables are Important for Predicting Responses to a Marketing Offer
    3. Using Regularization for Feature Selection
      1. Exercise 24: Using Lasso Regression for Feature Selection
      2. Activity 11: Using Lasso Regression to Choose Features for Predicting Customer Spend
    4. Tree-Based Regression Models
      1. Random Forests
      2. Exercise 25: Using Tree-Based Regression Models to Capture Non-Linear Trends
      3. Activity 12: Building the Best Regression Model for Customer Spend Based on Demographic Data
    5. Summary
  14. Chapter 7
  15. Supervised Learning: Predicting Customer Churn
    1. Introduction
    2. Classification Problems
    3. Understanding Logistic Regression
      1. Revisiting Linear Regression
      2. Logistic Regression
      3. Exercise 26: Plotting the Sigmoid Function
      4. Cost Function for Logistic Regression
      5. Assumptions of Logistic Regression
      6. Exercise 27: Loading, Splitting, and Applying Linear and Logistic Regression to Data
    4. Creating a Data Science Pipeline
      1. Obtaining the Data
      2. Exercise 28: Obtaining the Data
      3. Scrubbing the Data
      4. Exercise 29: Imputing Missing Values
      5. Exercise 30: Renaming Columns and Changing the Data Type
      6. Exploring the Data
      7. Statistical Overview
      8. Correlation
      9. Exercise 31: Obtaining the Statistical Overview and Correlation Plot
      10. Visualizing the Data
      11. Exercise 32: Performing Exploratory Data Analysis (EDA)
      12. Activity 13: Performing OSE of OSEMN
    5. Modeling the Data
      1. Feature Selection
      2. Exercise 33: Performing Feature Selection
      3. Model Building
      4. Exercise 34: Building a Logistic Regression Model
      5. Interpreting the Data
      6. Activity 14: Performing MN of OSEMN
    6. Summary
  16. Chapter 8
  17. Fine-Tuning Classification Algorithms
    1. Introduction
    2. Support Vector Machines
      1. Intuition Behind Maximum Margin
      2. Linearly Inseparable Cases
      3. Linearly Inseparable Cases Using Kernel
      4. Exercise 35: Training an SVM Algorithm Over a Dataset
    3. Decision Trees
      1. Exercise 36: Implementing a Decision Tree Algorithm Over a Dataset
      2. Important Terminology of Decision Trees
      3. Decision Tree Algorithm Formulation
    4. Random Forest
      1. Exercise 37: Implementing a Random Forest Model Over a Dataset
      2. Activity 15: Implementing Different Classification Algorithms
    5. Preprocessing Data for Machine Learning Models
      1. Standardization
      2. Exercise 38: Standardizing Data
      3. Scaling
      4. Exercise 39: Scaling Data After Feature Selection
      5. Normalization
      6. Exercise 40: Performing Normalization on Data
    6. Model Evaluation
      1. Exercise 41: Implementing Stratified k-fold
      2. Fine-Tuning of the Model
      3. Exercise 42: Fine-Tuning a Model
      4. Activity 16: Tuning and Optimizing the Model
    7. Performance Metrics
      1. Precision
      2. Recall
      3. F1 Score
      4. Exercise 43: Evaluating the Performance Metrics for a Model
      5. ROC Curve
      6. Exercise 44: Plotting the ROC Curve
      7. Activity 17: Comparison of the Models
    8. Summary
  18. Chapter 9
  19. Modeling Customer Choice
    1. Introduction
    2. Understanding Multiclass Classification
      1. Classifiers in Multiclass Classification
      2. Exercise 45: Implementing a Multiclass Classification Algorithm on a Dataset
      3. Performance Metrics
      4. Exercise 46: Evaluating Performance Using Multiclass Performance Metrics
      5. Activity 18: Performing Multiclass Classification and Evaluating Performance
    3. Class-Imbalanced Data
      1. Exercise 47: Performing Classification on Imbalanced Data
      2. Dealing with Class-Imbalanced Data
      3. Exercise 48: Visualizing Sampling Techniques
      4. Exercise 49: Fitting a Random Forest Classifier Using SMOTE and Building the Confusion Matrix
      5. Activity 19: Dealing with Imbalanced Data
    4. Summary
  20. Appendix
    1. Chapter 1: Data Preparation and Cleaning
      1. Activity 1: Addressing Data Spilling
    2. Chapter 2: Data Exploration and Visualization
      1. Activity 2: Analyzing Advertisements
    3. Chapter 3: Unsupervised Learning: Customer Segmentation
      1. Activity 3: Loading, Standardizing, and Calculating Distance with a Dataset
      2. Activity 4: Using k-means Clustering on Customer Behavior Data
    4. Chapter 4: Choosing the Best Segmentation Approach
      1. Activity 5: Determining Clusters for High-End Clothing Customer Data Using the Elbow Method with the Sum of Squared Errors
      2. Activity 6: Using Different Clustering Techniques on Customer Behavior Data
      3. Activity 7: Evaluating Clustering on Customer Behavior Data
    5. Chapter 5: Predicting Customer Revenue Using Linear Regression
      1. Activity 8: Examining Relationships Between Storefront Locations and Features about Their Area
      2. Activity 9: Building a Regression Model to Predict Storefront Location Revenue
    6. Chapter 6: Other Regression Techniques and Tools for Evaluation
      1. Activity 10: Testing Which Variables are Important for Predicting Responses to a Marketing Offer
      2. Activity 11: Using Lasso Regression to Choose Features for Predicting Customer Spend
      3. Activity 12: Building the Best Regression Model for Customer Spend Based on Demographic Data
    7. Chapter 7: Supervised Learning: Predicting Customer Churn
      1. Activity 13: Performing OSE of OSEMN
      2. Activity 14: Performing MN of OSEMN
    8. Chapter 8: Fine-Tuning Classification Algorithms
      1. Activity 15: Implementing Different Classification Algorithms
      2. Activity 16: Tuning and Optimizing the Model
      3. Activity 17: Comparison of the Models
    9. Chapter 9: Modeling Customer Choice
      1. Activity 18: Performing Multiclass Classification and Evaluating Performance
      2. Activity 19: Dealing with Imbalanced Data

Product information

  • Title: Data Science for Marketing Analytics
  • Author(s): Tommy Blanchard, Debasish Behera, Pranshu Bhatnagar
  • Release date: March 2019
  • Publisher(s): Packt Publishing
  • ISBN: 9781789959413