Data Science for Marketing Analytics

Book description

Turbocharge your marketing plans by making the leap from simple descriptive statistics in Excel to sophisticated predictive analytics with the Python programming language

Key Features

  • Use data analytics and machine learning in a sales and marketing context
  • Gain insights from data to make better business decisions
  • Build your experience and confidence with realistic hands-on practice

Unleash the power of data to reach your marketing goals with this practical guide to data science for business.

This book will help you get started on your journey to becoming a master of marketing analytics with Python. You'll work with relevant datasets and build your practical skills by tackling engaging exercises and activities that simulate real-world market analysis projects.

You'll learn to think like a data scientist, build your problem-solving skills, and discover how to look at data in new ways to deliver business insights and make intelligent data-driven decisions.

As well as learning how to clean, explore, and visualize data, you'll implement machine learning algorithms and build models to make predictions. As you work through the book, you'll use Python tools to analyze sales, visualize advertising data, predict revenue, address customer churn, and implement customer segmentation to understand behavior.

By the end of this book, you'll have the knowledge, skills, and confidence to implement data science and machine learning techniques to better understand your marketing data and improve your decision-making.

What you will learn

  • Load, clean, and explore sales and marketing data using pandas
  • Form and test hypotheses using real datasets and analytics tools
  • Visualize patterns in customer behavior using Matplotlib
  • Use advanced machine learning models like random forest and SVM
  • Use various unsupervised learning algorithms for customer segmentation
  • Use supervised learning techniques for sales prediction
  • Evaluate and compare different models to get the best outcomes
  • Optimize models with hyperparameter tuning and SMOTE

Who this book is for

This marketing book is for anyone who wants to learn how to use Python for cutting-edge marketing analytics. Whether you're a developer who wants to move into marketing, or a marketing analyst who wants to learn more sophisticated tools and techniques, this book will get you on the right path.

Basic prior knowledge of Python and experience working with data will help you access this book more easily.

Table of contents

  1. Data Science for Marketing Analytics
  2. Second Edition
  3. Preface
    1. About the Book
      1. About the Authors
      2. Who This Book Is For
      3. About the Chapters
      4. Conventions
      5. Code Presentation
      6. Minimum Hardware Requirements
      7. Downloading the Code Bundle
      8. Setting Up Your Environment
        1. Installing Anaconda on Your System
        2. Launching Jupyter Notebook
        3. Installing the ds-marketing Virtual Environment
      9. Running the Code Online Using Binder
      10. Get in Touch
      11. Please Leave a Review
  4. 1. Data Preparation and Cleaning
    1. Introduction
    2. Data Models and Structured Data
    3. pandas
      1. Importing and Exporting Data with pandas DataFrames
      2. Viewing and Inspecting Data in DataFrames
      3. Exercise 1.01: Loading Data Stored in a JSON File
      4. Exercise 1.02: Loading Data from Multiple Sources
      5. Structure of a pandas DataFrame and Series
    4. Data Manipulation
      1. Selecting and Filtering in pandas
      2. Creating DataFrames in Python
      3. Adding and Removing Attributes and Observations
      4. Combining Data
      5. Handling Missing Data
      6. Exercise 1.03: Combining DataFrames and Handling Missing Values
      7. Applying Functions and Operations on DataFrames
      8. Grouping Data
      9. Exercise 1.04: Applying Data Transformations
      10. Activity 1.01: Addressing Data Spilling
    5. Summary
  5. 2. Data Exploration and Visualization
    1. Introduction
    2. Identifying and Focusing on the Right Attributes
      1. The groupby() Function
      2. The unique() Function
      3. The value_counts() Function
      4. Exercise 2.01: Exploring the Attributes in Sales Data
    3. Fine-Tuning Generated Insights
      1. Selecting and Renaming Attributes
      2. Reshaping the Data
      3. Exercise 2.02: Calculating Conversion Ratios for Website Ads
      4. Pivot Tables
    4. Visualizing Data
      1. Exercise 2.03: Visualizing Data with pandas
      2. Visualization through Seaborn
      3. Visualization with Matplotlib
      4. Activity 2.01: Analyzing Advertisements
    5. Summary
  6. 3. Unsupervised Learning and Customer Segmentation
    1. Introduction
    2. Segmentation
      1. Exercise 3.01: Mall Customer Segmentation – Understanding the Data
    3. Approaches to Segmentation
      1. Traditional Segmentation Methods
      2. Exercise 3.02: Traditional Segmentation of Mall Customers
      3. Unsupervised Learning (Clustering) for Customer Segmentation
    4. Choosing Relevant Attributes (Segmentation Criteria)
      1. Standardizing Data
      2. Exercise 3.03: Standardizing Customer Data
      3. Calculating Distance
      4. Exercise 3.04: Calculating the Distance between Customers
    5. K-Means Clustering
      1. Exercise 3.05: K-Means Clustering on Mall Customers
      2. Understanding and Describing the Clusters
      3. Activity 3.01: Bank Customer Segmentation for Loan Campaign
      4. Clustering with High-Dimensional Data
      5. Exercise 3.06: Dealing with High-Dimensional Data
      6. Activity 3.02: Bank Customer Segmentation with Multiple Features
    6. Summary
  7. 4. Evaluating and Choosing the Best Segmentation Approach
    1. Introduction
    2. Choosing the Number of Clusters
      1. Exercise 4.01: Data Staging and Visualization
      2. Simple Visual Inspection to Choose the Optimal Number of Clusters
      3. Exercise 4.02: Choosing the Number of Clusters Based on Visual Inspection
      4. The Elbow Method with Sum of Squared Errors
      5. Exercise 4.03: Determining the Number of Clusters Using the Elbow Method
      6. Activity 4.01: Optimizing a Luxury Clothing Brand's Marketing Campaign Using Clustering
    3. More Clustering Techniques
      1. Mean-Shift Clustering
      2. Exercise 4.04: Mean-Shift Clustering on Mall Customers
      3. Benefits and Drawbacks of the Mean-Shift Technique
      4. k-modes and k-prototypes Clustering
      5. Exercise 4.05: Clustering Data Using the k-prototypes Method
    4. Evaluating Clustering
      1. Silhouette Score
      2. Exercise 4.06: Using Silhouette Score to Pick Optimal Number of Clusters
      3. Train and Test Split
      4. Exercise 4.07: Using a Train-Test Split to Evaluate Clustering Performance
      5. Activity 4.02: Evaluating Clustering on Customer Data
      6. The Role of Business in Cluster Evaluation
    5. Summary
  8. 5. Predicting Customer Revenue Using Linear Regression
    1. Introduction
    2. Regression Problems
      1. Exercise 5.01: Predicting Sales from Advertising Spend Using Linear Regression
    3. Feature Engineering for Regression
      1. Feature Creation
      2. Data Cleaning
      3. Exercise 5.02: Creating Features for Customer Revenue Prediction
      4. Assessing Features Using Visualizations and Correlations
      5. Exercise 5.03: Examining Relationships between Predictors and the Outcome
      6. Activity 5.01: Examining the Relationship between Store Location and Revenue
    4. Performing and Interpreting Linear Regression
      1. Exercise 5.04: Building a Linear Model Predicting Customer Spend
      2. Activity 5.02: Predicting Store Revenue Using Linear Regression
    5. Summary
  9. 6. More Tools and Techniques for Evaluating Regression Models
    1. Introduction
    2. Evaluating the Accuracy of a Regression Model
      1. Residuals and Errors
      2. Mean Absolute Error
      3. Root Mean Squared Error
      4. Exercise 6.01: Evaluating Regression Models of Location Revenue Using the MAE and RMSE
      5. Activity 6.01: Finding Important Variables for Predicting Responses to a Marketing Offer
    3. Using Recursive Feature Selection for Feature Elimination
      1. Exercise 6.02: Using RFE for Feature Selection
      2. Activity 6.02: Using RFE to Choose Features for Predicting Customer Spend
    4. Tree-Based Regression Models
      1. Random Forests
      2. Exercise 6.03: Using Tree-Based Regression Models to Capture Non-Linear Trends
      3. Activity 6.03: Building the Best Regression Model for Customer Spend Based on Demographic Data
    5. Summary
  10. 7. Supervised Learning: Predicting Customer Churn
    1. Introduction
    2. Classification Problems
    3. Understanding Logistic Regression
      1. Revisiting Linear Regression
    4. Logistic Regression
      1. Cost Function for Logistic Regression
      2. Assumptions of Logistic Regression
      3. Exercise 7.01: Comparing Predictions by Linear and Logistic Regression on the Shill Bidding Dataset
    5. Creating a Data Science Pipeline
    6. Churn Prediction Case Study
      1. Obtaining the Data
      2. Exercise 7.02: Obtaining the Data
      3. Scrubbing the Data
      4. Exercise 7.03: Imputing Missing Values
      5. Exercise 7.04: Renaming Columns and Changing the Data Type
      6. Exploring the Data
      7. Exercise 7.05: Obtaining the Statistical Overview and Correlation Plot
      8. Visualizing the Data
      9. Exercise 7.06: Performing Exploratory Data Analysis (EDA)
      10. Activity 7.01: Performing the OSE technique from OSEMN
    7. Modeling the Data
      1. Feature Selection
      2. Exercise 7.07: Performing Feature Selection
      3. Model Building
      4. Exercise 7.08: Building a Logistic Regression Model
      5. Interpreting the Data
      6. Activity 7.02: Performing the MN technique from OSEMN
    8. Summary
  11. 8. Fine-Tuning Classification Algorithms
    1. Introduction
    2. Support Vector Machines
      1. Intuition behind Maximum Margin
      2. Linearly Inseparable Cases
      3. Linearly Inseparable Cases Using the Kernel
      4. Exercise 8.01: Training an SVM Algorithm Over a Dataset
    3. Decision Trees
      1. Exercise 8.02: Implementing a Decision Tree Algorithm over a Dataset
      2. Important Terminology for Decision Trees
      3. Decision Tree Algorithm Formulation
    4. Random Forest
      1. Exercise 8.03: Implementing a Random Forest Model over a Dataset
      2. Classical Algorithms – Accuracy Compared
      3. Activity 8.01: Implementing Different Classification Algorithms
    5. Preprocessing Data for Machine Learning Models
      1. Standardization
      2. Exercise 8.04: Standardizing Data
      3. Scaling
      4. Exercise 8.05: Scaling Data After Feature Selection
      5. Normalization
      6. Exercise 8.06: Performing Normalization on Data
    6. Model Evaluation
      1. Exercise 8.07: Stratified K-fold
      2. Fine-Tuning of the Model
      3. Exercise 8.08: Fine-Tuning a Model
      4. Activity 8.02: Tuning and Optimizing the Model
    7. Performance Metrics
      1. Precision
      2. Recall
      3. F1 Score
      4. Exercise 8.09: Evaluating the Performance Metrics for a Model
      5. ROC Curve
      6. Exercise 8.10: Plotting the ROC Curve
      7. Activity 8.03: Comparison of the Models
    8. Summary
  12. 9. Multiclass Classification Algorithms
    1. Introduction
    2. Understanding Multiclass Classification
    3. Classifiers in Multiclass Classification
      1. Exercise 9.01: Implementing a Multiclass Classification Algorithm on a Dataset
    4. Performance Metrics
      1. Exercise 9.02: Evaluating Performance Using Multiclass Performance Metrics
      2. Activity 9.01: Performing Multiclass Classification and Evaluating Performance
    5. Class-Imbalanced Data
      1. Exercise 9.03: Performing Classification on Imbalanced Data
      2. Dealing with Class-Imbalanced Data
      3. Exercise 9.04: Fixing the Imbalance of a Dataset Using SMOTE
      4. Activity 9.02: Dealing with Imbalanced Data Using scikit-learn
    6. Summary
  13. Appendix
    1. 1. Data Preparation and Cleaning
      1. Activity 1.01: Addressing Data Spilling
    2. 2. Data Exploration and Visualization
      1. Activity 2.01: Analyzing Advertisements
    3. 3. Unsupervised Learning and Customer Segmentation
      1. Activity 3.01: Bank Customer Segmentation for Loan Campaign
      2. Activity 3.02: Bank Customer Segmentation with Multiple Features
    4. 4. Evaluating and Choosing the Best Segmentation Approach
      1. Activity 4.01: Optimizing a Luxury Clothing Brand's Marketing Campaign Using Clustering
      2. Activity 4.02: Evaluating Clustering on Customer Data
    5. 5. Predicting Customer Revenue Using Linear Regression
      1. Activity 5.01: Examining the Relationship between Store Location and Revenue
      2. Activity 5.02: Predicting Store Revenue Using Linear Regression
    6. 6. More Tools and Techniques for Evaluating Regression Models
      1. Activity 6.01: Finding Important Variables for Predicting Responses to a Marketing Offer
      2. Activity 6.02: Using RFE to Choose Features for Predicting Customer Spend
      3. Activity 6.03: Building the Best Regression Model for Customer Spend Based on Demographic Data
    7. 7. Supervised Learning: Predicting Customer Churn
      1. Activity 7.01: Performing the OSE technique from OSEMN
      2. Activity 7.02: Performing the MN technique from OSEMN
    8. 8. Fine-Tuning Classification Algorithms
      1. Activity 8.01: Implementing Different Classification Algorithms
      2. Activity 8.02: Tuning and Optimizing the Model
      3. Activity 8.03: Comparison of the Models
    9. 9. Multiclass Classification Algorithms
      1. Activity 9.01: Performing Multiclass Classification and Evaluating Performance
      2. Activity 9.02: Dealing with Imbalanced Data Using scikit-learn

Product information

  • Title: Data Science for Marketing Analytics
  • Author(s): Mirza Rahim Baig, Gururajan Govindan, Vishwesh Ravi Shrimali
  • Release date: September 2021
  • Publisher(s): Packt Publishing
  • ISBN: 9781800560475