Data Science and Machine Learning with Python – Hands-On!

Video description

This course starts with a Python crash course and then shows you how to get set up on Microsoft Windows-based PCs, Linux desktops, and Macs. After setup, we will cover the machine learning, AI, and data mining techniques real employers are looking for, including deep learning / neural networks with TensorFlow and Keras; generative models with variational auto-encoders and generative adversarial networks; data visualization in Python with Matplotlib and Seaborn; transfer learning, sentiment analysis, image recognition, and classification; regression analysis, K-Means Clustering, Principal Component Analysis, train/test and cross-validation, Bayesian methods, decision trees and random forests.

We will also cover multiple regression, multi-level models, support vector machines, reinforcement learning, collaborative filtering, K-Nearest Neighbors, the bias/variance tradeoff, ensemble learning, term frequency / inverse document frequency, experimental design and A/B tests, feature engineering, hyperparameter tuning, and much more! There’s also an entire section on machine learning with Apache Spark, which lets you scale up these techniques to “big data” analyzed on a computing cluster.

By the end of this course, you will have the practical skills needed to work as a professional data scientist.

What You Will Learn

  • Implement machine learning on a massive scale with Apache Spark’s MLlib
  • Visualize data with Matplotlib and Seaborn
  • Understand reinforcement learning and how to build a Pac-Man bot
  • Use train/test and K-Fold cross-validation to choose and tune models
  • Build artificial neural networks with TensorFlow and Keras
  • Design and evaluate A/B tests using T-Tests and P-Values

Audience

Software developers or programmers who want to transition into the lucrative data science career path will learn a lot from this course. Data analysts in finance or other non-tech industries who want to transition into the tech industry can use this course to learn how to analyze data using code instead of tools.

You will need some prior experience in coding or scripting to be successful. The early sections include a Python crash course, but it is not a substitute for that experience; if you have never coded or scripted before, this course is not the right starting point.

About The Author

Frank Kane spent nine years at Amazon and IMDb, developing and managing the technology that automatically delivers product and movie recommendations to hundreds of millions of customers. He holds 17 issued patents in the fields of distributed computing, data mining, and machine learning. In 2012, Frank left to start his own successful company, Sundog Software, which focuses on virtual reality environment technology and teaches others about big data analysis.

Table of contents

  1. Chapter 1 : Getting Started
    1. Introduction
    2. [Activity] Windows: Installing and Using Anaconda and Course Materials
    3. [Activity] Mac: Installing and Using Anaconda and Course Materials
    4. [Activity] Linux: Installing and Using Anaconda and Course Materials
    5. Python Basics, Part 1 [Optional]
    6. [Activity] Python Basics, Part 2 [Optional]
    7. [Activity] Python Basics, Part 3 [Optional]
    8. [Activity] Python Basics, Part 4 [Optional]
    9. Introducing the Pandas Library [Optional]
  2. Chapter 2 : Statistics and Probability Refresher, and Python Practice
    1. Types of Data (Numerical, Categorical, Ordinal)
    2. Mean, Median, Mode
    3. [Activity] Using Mean, Median, and Mode in Python
    4. [Activity] Variation and Standard Deviation
    5. Probability Density Function; Probability Mass Function
    6. Common Data Distributions (Normal, Binomial, Poisson, and So On)
    7. [Activity] Percentiles and Moments
    8. [Activity] A Crash Course in matplotlib
    9. [Activity] Advanced Visualization with Seaborn
    10. [Activity] Covariance and Correlation
    11. [Exercise] Conditional Probability
    12. Exercise Solution: Conditional Probability of Purchase by Age
    13. Bayes' Theorem
  3. Chapter 3 : Predictive Models
    1. [Activity] Linear Regression
    2. [Activity] Polynomial Regression
    3. [Activity] Multiple Regression and Predicting Car Prices
    4. Multi-Level Models
  4. Chapter 4 : Machine Learning with Python
    1. Supervised Versus Unsupervised Learning, and Train/Test
    2. [Activity] Using Train/Test to Prevent Overfitting a Polynomial Regression
    3. Bayesian Methods: Concepts
    4. [Activity] Implementing a Spam Classifier with Naive Bayes
    5. K-Means Clustering
    6. [Activity] Clustering People Based on Income and Age
    7. Measuring Entropy
    8. [Activity] Windows: Installing GraphViz
    9. [Activity] Mac: Installing GraphViz
    10. [Activity] Linux: Installing GraphViz
    11. Decision Trees: Concepts
    12. [Activity] Decision Trees: Predicting Hiring Decisions
    13. Ensemble Learning
    14. [Activity] XGBoost
    15. Support Vector Machines (SVM) Overview
    16. [Activity] Using SVM to Cluster People Using Scikit-Learn
  5. Chapter 5 : Recommender Systems
    1. User-Based Collaborative Filtering
    2. Item-Based Collaborative Filtering
    3. [Activity] Finding Movie Similarities Using Cosine Similarity
    4. [Activity] Improving the Results of Movie Similarities
    5. [Activity] Making Movie Recommendations with Item-Based Collaborative Filtering
    6. [Exercise] Improve the Recommender's Results
  6. Chapter 6 : More Data Mining and Machine Learning Techniques
    1. K-Nearest-Neighbors: Concepts
    2. [Activity] Using KNN to Predict a Rating for a Movie
    3. Dimensionality Reduction; Principal Component Analysis (PCA)
    4. [Activity] PCA Example with the Iris Dataset
    5. Data Warehousing Overview: ETL and ELT
    6. Reinforcement Learning
    7. [Activity] Reinforcement Learning and Q-Learning with Gym
    8. Understanding a Confusion Matrix
    9. Measuring Classifiers (Precision, Recall, F1, ROC, AUC)
  7. Chapter 7 : Dealing with Real-World Data
    1. Bias/Variance Tradeoff
    2. [Activity] K-Fold Cross-Validation to Avoid Overfitting
    3. Data Cleaning and Normalization
    4. [Activity] Cleaning Web Log Data
    5. Normalizing Numerical Data
    6. [Activity] Detecting Outliers
    7. Feature Engineering and the Curse of Dimensionality
    8. Imputation Techniques for Missing Data
    9. Handling Unbalanced Data: Oversampling, Undersampling, and SMOTE
    10. Binning, Transforming, Encoding, Scaling, and Shuffling
  8. Chapter 8 : Apache Spark: Machine Learning on Big Data
    1. [Activity] Installing Spark - Part 1
    2. [Activity] Installing Spark - Part 2
    3. Spark Introduction
    4. Spark and the Resilient Distributed Dataset (RDD)
    5. Introducing MLlib
    6. Introduction to Decision Trees in Spark
    7. [Activity] K-Means Clustering in Spark
    8. TF / IDF
    9. [Activity] Searching Wikipedia with Spark
    10. [Activity] Using the Spark DataFrame API for MLlib
  9. Chapter 9 : Experimental Design / ML in the Real World
    1. Deploying Models to Real-Time Systems
    2. A/B Testing Concepts
    3. T-Tests and P-Values
    4. [Activity] Hands-On with T-Tests
    5. Determining How Long to Run an Experiment
    6. A/B Test Gotchas
  10. Chapter 10 : Deep Learning and Neural Networks
    1. Deep Learning Prerequisites
    2. The History of Artificial Neural Networks
    3. [Activity] Deep Learning in the TensorFlow Playground
    4. Deep Learning Details
    5. Introducing TensorFlow
    6. [Activity] Using TensorFlow, Part 1
    7. [Activity] Using TensorFlow, Part 2
    8. [Activity] Introducing Keras
    9. [Activity] Using Keras to Predict Political Affiliations
    10. Convolutional Neural Networks (CNNs)
    11. [Activity] Using CNNs for Handwriting Recognition
    12. Recurrent Neural Networks (RNNs)
    13. [Activity] Using an RNN for Sentiment Analysis
    14. [Activity] Transfer Learning
    15. Tuning Neural Networks: Learning Rate and Batch Size Hyperparameters
    16. Deep Learning Regularization with Dropout and Early Stopping
    17. The Ethics of Deep Learning
  11. Chapter 11 : Generative Models
    1. Variational Auto-Encoders (VAEs) - How They Work
    2. Variational Auto-Encoders (VAEs) - Hands-On with Fashion MNIST
    3. Generative Adversarial Networks (GANs) - How They Work
    4. Generative Adversarial Networks (GANs) - Playing with Some Demos
    5. Generative Adversarial Networks (GANs) - Hands-On with Fashion MNIST
    6. Learning More about Deep Learning
  12. Chapter 12 : Final Project
    1. Your Final Project Assignment: Mammogram Classification
    2. Final Project Review
  13. Chapter 13 : You Made It!
    1. More to Explore

Product information

  • Title: Data Science and Machine Learning with Python – Hands-On!
  • Author(s): Frank Kane
  • Release date: August 2022
  • Publisher(s): Packt Publishing
  • ISBN: 9781787127081