Video description
This course starts with a Python crash course and then shows you how to get set up on Microsoft Windows-based PCs, Linux desktops, and Macs. After setup, we will cover the machine learning, AI, and data mining techniques real employers are looking for, including deep learning / neural networks with TensorFlow and Keras; generative models with variational auto-encoders and generative adversarial networks; data visualization in Python with Matplotlib and Seaborn; transfer learning, sentiment analysis, image recognition, and classification; regression analysis, K-Means Clustering, Principal Component Analysis, train/test and cross-validation, Bayesian methods, decision trees and random forests.
We will also cover multiple regression, multi-level models, support vector machines, reinforcement learning, collaborative filtering, K-Nearest Neighbors, bias/variance tradeoff, ensemble learning, term frequency / inverse document frequency, experimental design and A/B tests, feature engineering, hyperparameter tuning, and much more! There’s also an entire section on machine learning with Apache Spark, which lets you scale these techniques up to “big data” analyzed on a computing cluster.
By the end of this course, you will have the foundation you need to work as a professional data scientist.
What You Will Learn
- Implement machine learning on a massive scale with Apache Spark’s MLlib
- Visualize data with Matplotlib and Seaborn
- Understand reinforcement learning and how to build a Pac-Man bot
- Use train/test and K-Fold cross-validation to choose and tune models
- Build artificial neural networks with TensorFlow and Keras
- Design and evaluate A/B tests using T-Tests and P-Values
Audience
Software developers or programmers who want to transition into the lucrative data science career path will learn a lot from this course. Data analysts in finance or other non-tech industries who want to transition into the tech industry can use this course to learn how to analyze data using code instead of tools.
You will need some prior coding or scripting experience to be successful. The early sections include an optional Python crash course, but it is a refresher rather than a first introduction to programming; if you have no prior coding or scripting experience, this course is not the right place to start.
About The Author
Frank Kane spent nine years at Amazon and IMDb, developing and managing the technology that automatically delivers product and movie recommendations to hundreds of millions of customers. He holds 17 issued patents in the fields of distributed computing, data mining, and machine learning. In 2012, Frank left to start his own successful company, Sundog Software, which focuses on virtual reality environment technology and teaches others about big data analysis.
Table of contents
- Chapter 1 : Getting Started
- Introduction
- [Activity] Windows: Installing and Using Anaconda and Course Materials
- [Activity] MAC: Installing and Using Anaconda and Course Materials
- [Activity] Linux: Installing and Using Anaconda and Course Materials
- Python Basics, Part 1 [Optional]
- [Activity] Python Basics, Part 2 [Optional]
- [Activity] Python Basics, Part 3 [Optional]
- [Activity] Python Basics, Part 4 [Optional]
- Introducing the Pandas Library [Optional]
- Chapter 2 : Statistics and Probability Refresher, and Python Practice
- Types of Data (Numerical, Categorical, Ordinal)
- Mean, Median, Mode
- [Activity] Using Mean, Median, and Mode in Python
- [Activity] Variation and Standard Deviation
- Probability Density Function; Probability Mass Function
- Common Data Distributions (Normal, Binomial, Poisson, and So On)
- [Activity] Percentiles and Moments
- [Activity] A Crash Course in matplotlib
- [Activity] Advanced Visualization with Seaborn
- [Activity] Covariance and Correlation
- [Exercise] Conditional Probability
- Exercise Solution: Conditional Probability of Purchase by Age
- Bayes' Theorem
- Chapter 3 : Predictive Models
- Chapter 4 : Machine Learning with Python
- Supervised Versus Unsupervised Learning, and Train/Test
- [Activity] Using Train/Test to Prevent Overfitting a Polynomial Regression
- Bayesian Methods: Concepts
- [Activity] Implementing a Spam Classifier with Naive Bayes
- K-Means Clustering
- [Activity] Clustering People Based on Income and Age
- Measuring Entropy
- [Activity] Windows: Installing GraphViz
- [Activity] MAC: Installing GraphViz
- [Activity] Linux: Installing GraphViz
- Decision Trees: Concepts
- [Activity] Decision Trees: Predicting Hiring Decisions
- Ensemble Learning
- [Activity] XGBoost
- Support Vector Machines (SVM) Overview
- [Activity] Using SVM to Cluster People Using Scikit-Learn
- Chapter 5 : Recommender Systems
- User-Based Collaborative Filtering
- Item-Based Collaborative Filtering
- [Activity] Finding Movie Similarities Using Cosine Similarity
- [Activity] Improving the Results of Movie Similarities
- [Activity] Making Movie Recommendations with Item-Based Collaborative Filtering
- [Exercise] Improve the Recommender's Results
- Chapter 6 : More Data Mining and Machine Learning Techniques
- K-Nearest-Neighbors: Concepts
- [Activity] Using KNN to Predict a Rating for a Movie
- Dimensionality Reduction; Principal Component Analysis (PCA)
- [Activity] PCA Example with the Iris Dataset
- Data Warehousing Overview: ETL and ELT
- Reinforcement Learning
- [Activity] Reinforcement Learning and Q-Learning with Gym
- Understanding a Confusion Matrix
- Measuring Classifiers (Precision, Recall, F1, ROC, AUC)
- Chapter 7 : Dealing with Real-World Data
- Bias/Variance Tradeoff
- [Activity] K-Fold Cross-Validation to Avoid Overfitting
- Data Cleaning and Normalization
- [Activity] Cleaning Web Log Data
- Normalizing Numerical Data
- [Activity] Detecting Outliers
- Feature Engineering and the Curse of Dimensionality
- Imputation Techniques for Missing Data
- Handling Unbalanced Data: Oversampling, Undersampling, and SMOTE
- Binning, Transforming, Encoding, Scaling, and Shuffling
- Chapter 8 : Apache Spark: Machine Learning on Big Data
- [Activity] Installing Spark - Part 1
- [Activity] Installing Spark - Part 2
- Spark Introduction
- Spark and the Resilient Distributed Dataset (RDD)
- Introducing MLLib
- Introduction to Decision Trees in Spark
- [Activity] K-Means Clustering in Spark
- TF / IDF
- [Activity] Searching Wikipedia with Spark
- [Activity] Using the Spark DataFrame API for MLLib
- Chapter 9 : Experimental Design / ML in the Real World
- Chapter 10 : Deep Learning and Neural Networks
- Deep Learning Prerequisites
- The History of Artificial Neural Networks
- [Activity] Deep Learning in the TensorFlow Playground
- Deep Learning Details
- Introducing TensorFlow
- [Activity] Using TensorFlow, Part 1
- [Activity] Using TensorFlow, Part 2
- [Activity] Introducing Keras
- [Activity] Using Keras to Predict Political Affiliations
- Convolutional Neural Networks (CNNs)
- [Activity] Using CNNs for Handwriting Recognition
- Recurrent Neural Networks (RNNs)
- [Activity] Using a RNN for Sentiment Analysis
- [Activity] Transfer Learning
- Tuning Neural Networks: Learning Rate and Batch Size Hyperparameters
- Deep Learning Regularization with Dropout and Early Stopping
- The Ethics of Deep Learning
- Chapter 11 : Generative Models
- Variational Auto-Encoders (VAEs) - How They Work
- Variational Auto-Encoders (VAE) - Hands-On with Fashion MNIST
- Generative Adversarial Networks (GANs) - How They Work
- Generative Adversarial Networks (GANs) - Playing with Some Demos
- Generative Adversarial Networks (GANs) - Hands-On with Fashion MNIST
- Learning More about Deep Learning
- Chapter 12 : Final Project
- Chapter 13 : You Made It!
Product information
- Title: Data Science and Machine Learning with Python – Hands-On!
- Author(s): Frank Kane
- Release date: August 2022
- Publisher(s): Packt Publishing
- ISBN: 9781787127081