Getting Started with Machine Learning in Python

Video description

Machine Learning is a hot topic. And you want to get involved! From developers to analysts, this course aims to bring Machine Learning to those with coding experience and numerical skills.

In this course, we introduce, via intuition rather than theory, the core of what makes Machine Learning work. Learn how to use labeled datasets to classify objects or predict future values, so that you can provide more accurate and valuable analysis. Use unlabeled datasets to perform segmentation and clustering, so that you can separate a large dataset into sensible groups.

You will learn to understand and estimate the value of your dataset. We guide you through creating the best performance metric for the task at hand and show how that metric leads you to the right model for your problem. Understand how to clean data for your application, and how to recognize which Machine Learning task you are dealing with.

If you want to move past Excel and if-then-else into automatically learned ML solutions, this course is for you!

This course uses Python 3.6. While that is not the latest version available, the content remains relevant and informative for users of legacy Python versions.

What You Will Learn

  • Core concepts of Machine Learning so you can understand fellow data scientists.
  • Clean your data to optimize how it feeds into your Machine Learning models.
  • Perform regression in a supervised learning setting, so that you can predict numbers, prices, and conversion rates.
  • Perform classification in a supervised learning setting, teaching the model to distinguish between different plants, discussion topics, and objects.
  • Use decision tree models and random forests, creating models that are explainable but powerful.
  • Go past linear models with SVMs and polynomial regression, tackling relationships that are non-linear.
  • Measure and evaluate your Machine Learning pipeline, so that you can improve your solution over time.


This course is for anyone with a little coding experience and basic numerical skills who wants to go beyond hardcoded, rule-based programming and use their datasets to automatically learn new algorithms that solve problems. From developers to analysts, this course aims to bring Machine Learning to everyone. It uses intuition as a base from which to explain the theory behind Machine Learning and its algorithms. Basic Python skills are assumed.

About The Author

Colibri Digital: Colibri is a technology consultancy company founded in 2015 by James Cross and Ingrid Funie. The company works to help its clients navigate the rapidly changing and complex world of emerging technologies, with deep expertise in areas like big data, data science, machine learning, and cloud computing. Over the past few years, they have worked with some of the world's largest and most prestigious companies, including a tier 1 investment bank, a leading management consultancy group, and one of the world's most popular soft drinks companies, helping each of them to make better sense of its data, and process it in more intelligent ways. The company lives by its motto: Data -> Intelligence -> Action.

James Cross is a Big Data Engineer and certified AWS Solutions Architect with a passion for data-driven applications. He's spent the last 3-5 years helping his clients to design and implement huge-scale, streaming big data platforms, cloud-based analytics stacks, and serverless architectures.

He started his professional career in Investment Banking, working with well-established technologies such as Java and SQL Server, before moving into the Big Data space. Since then he's worked with a huge range of big data tools, including most of the Hadoop ecosystem, Spark, and many NoSQL technologies such as Cassandra, MongoDB, Redis, and DynamoDB. More recently his focus has been on cloud technologies and how they can be applied to data analytics, culminating in his work as CTO at Scout Solutions and, later, with McKinsey.

James is an AWS certified solutions architect with several years' experience designing and implementing solutions on this cloud platform. As CTO of Scout Solutions Ltd, he built a fully serverless set of APIs and an analytics stack based around Lambda and Redshift.

Table of contents

  1. Chapter 1 : Launching a Python Environment to Create Machine Learning Models
    1. The Course Overview
    2. Machine Learning versus Rule-Based Programming
    3. Understanding What Machine Learning Can Do Using the Tasks Framework
    4. Creating Machine-Learned Models with Python and scikit-learn
    5. Supervised versus Unsupervised Learning
  2. Chapter 2 : Prepare Your Datasets for Machine Learning with Data Cleaning
    1. Fixing Your Machine Learning Models by Understanding Your Data Source
    2. Dealing with Missing Values – An Example
    3. Standardization and Normalization to Deal with Variables with Different Scales
    4. Eliminating Duplicate Entries
  3. Chapter 3 : Put Data into Their Right Categories with Classification
    1. How Do We Learn Rules to Classify Objects?
    2. Understanding Logistic Regression – Your First Classifier
    3. Applying Logistic Regression to the Iris Classification Task
    4. Closing Our First Machine Learning Pipeline with a Simple Model Evaluator
  4. Chapter 4 : Predict Numbers in the Future with Regression
    1. Creating Formulas That Predict the Future – A House Price Example
    2. Understanding Linear Regression – Your First Regressor
    3. Applying Linear Regression to the Boston House Price Task
    4. Evaluating Numerical Predictions with Least Squares
  5. Chapter 5 : Unsupervised Learning: Segmenting Groups and Detecting Outliers
    1. Exploring Unsupervised Learning and Its Usefulness
    2. Finding Groups Automatically with K-means Clustering
    3. Reducing the Number of Variables in Your Data with PCA
    4. Smooth out Your Histograms with Kernel Density Estimation
  6. Chapter 6 : Modeling Complex Relationships with Nonlinear Models
    1. Create Explainable Models with Decision Trees
    2. Automatic Feature Engineering with Support Vector Machines
    3. Deal with Nonlinear Relationships with Polynomial Regression
    4. Reduce the Number of Learned Rules with Regularization

Product information

  • Title: Getting Started with Machine Learning in Python
  • Author(s): Rudy Lai
  • Release date: September 2018
  • Publisher(s): Packt Publishing
  • ISBN: 9781788477437