Live Online Training

Essential Machine Learning and Exploratory Data Analysis with Python and Jupyter Notebook

Learn just the essentials of Python-based Machine Learning on AWS with Jupyter Notebook

Noah Gift

There is overwhelming demand to learn business-focused, Python-based Machine Learning. This training is about learning how to apply Machine Learning techniques in Python to common business applications. Examples range from classifying the types of users registered on a shopping site to using regression to predict next month's sales.

The live training shows how to get started with the basics of Python via Jupyter notebooks, then dives into the nuts and bolts of Data Science libraries in Python. EDA, or exploratory data analysis, is at the heart of the Machine Learning feedback loop, and this series will highlight how to perform it in Python and Jupyter Notebook.

Finally, AWS will be used to extend the Machine Learning concepts to real-world environments in the cloud. The AWS portion covers batch-based job workflows for Machine Learning pipelines, as well as use of the boto library.

What you'll learn and how you can apply it

  • Python fundamentals
  • Jupyter notebook fundamentals with Pandas, scikit-learn, and seaborn
  • AWS fundamentals for Python and Machine Learning
  • Machine Learning concepts and applications

This training course is for you because...

  • You are a business and analytics professional with some SQL experience and are looking to move to the next generation of Data Science.
  • You are a Junior Data Scientist who is looking to expand into cloud-based Machine Learning concepts on AWS.
  • You’re a software developer who wants to understand how to get more deeply involved in the Data Science movement.
  • You’re a technical leader who wants to understand Machine Learning in Python to effectively manage teams that perform these actions.
  • You’re currently involved in Data Science, Analytics, or Machine Learning training and are looking for additional material to supplement your learning.


Prerequisites:

  • Some previous programming experience
  • Basic understanding of statistics and probability

Recommended preparation:

Students should go through the tutorial on this page: https://github.com/noahgift/functional_intro_to_python

Modern Python LiveLessons: Big Ideas and Little Code in Python (video)

Data Science Fundamentals Part 1: Learning Basic Concepts, Data Wrangling, and Databases with Python (video)

Pandas Data Cleaning and Modeling with Python (video)

Python: Essential Reference (book)

Course Set-up:

  • Jupyter notebook
  • Python 3.6
  • (Optional) AWS account

Resources List:


About your instructor

  • Noah Gift is a lecturer and consultant at both the UC Davis Graduate School of Management MSBA program and the Graduate Data Science program, MSDS, at Northwestern. He teaches and designs graduate machine learning, AI, and Data Science courses and consults on Machine Learning and Cloud Architecture for students and faculty. These responsibilities include leading a multi-cloud certification initiative for students. He has published close to 100 technical publications, including two books, on subjects ranging from Cloud Machine Learning to DevOps. Gift received an MBA from UC Davis, an M.S. in Computer Information Systems from Cal State Los Angeles, and a B.S. in Nutritional Science from Cal Poly San Luis Obispo.

    Professionally, Noah has approximately 20 years’ experience programming in Python. He is a Python Software Foundation Fellow, AWS Subject Matter Expert (SME) on Machine Learning, AWS Certified Solutions Architect and AWS Academy Accredited Instructor, Google Certified Professional Cloud Architect, and Microsoft MTA on Python. He has worked in roles including CTO, General Manager, Consulting CTO, and Cloud Architect. This experience has been with a wide variety of companies including ABC, Caltech, Sony Imageworks, Disney Feature Animation, Weta Digital, AT&T, Turner Studios, and Linden Lab. In the last ten years, he has been responsible for shipping many new products at multiple companies that generated millions of dollars of revenue and had global scale. Currently he consults for startups and other companies.


The timeframes are only estimates and may vary according to how the class is progressing.

Day 1

Part One: Introductory Concepts in Python and Functions Using Jupyter Notebook (180 minutes)

  • Introductory Concepts
  • IPython and Python REPL
  • Procedural statements
  • Strings and String formatting
  • Numbers and arithmetic operations
  • Data Structures: Lists, Dictionaries, Sets and operations on them.
  • Writing and Running Scripts
  • Functions
  • Writing Functions
  • Function arguments: positional, keyword
  • Functional Currying: Passing uncalled functions
  • Functions that Yield
  • Decorators: Functions that wrap other functions
  • Lambdas
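
As a rough illustration of the function topics listed above, here is a minimal sketch using only the standard library. The names `scale`, `apply_to_all`, `countdown`, `logged`, and `add` are made up for this example and are not taken from the course material.

```python
import functools


def scale(value, factor=2):
    """`value` is passed positionally; `factor` can be passed by keyword."""
    return value * factor


def apply_to_all(func, items):
    """Accept an uncalled function and call it on each item."""
    return [func(item) for item in items]


def countdown(start):
    """A function that yields values one at a time instead of returning a list."""
    while start > 0:
        yield start
        start -= 1


def logged(func):
    """Decorator: wrap another function and record the arguments of each call."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        wrapper.calls.append((args, kwargs))
        return func(*args, **kwargs)
    wrapper.calls = []
    return wrapper


@logged
def add(x, y):
    return x + y


# Pass `scale` itself (no parentheses), then a lambda that fixes `factor`.
doubled = apply_to_all(scale, [1, 2, 3])
tripled = apply_to_all(lambda v: scale(v, factor=3), [1, 2, 3])

remaining = list(countdown(3))
total = add(2, y=5)
```

After this runs, `add.calls` holds one entry per call, which is one simple way to see that the decorator wraps the original function without changing its result.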

Q&A: 15 Minutes

Break: 15 Minutes

Part Two: Intermediate Topics (1 Hour + 15 Minutes)

  • Intermediate Topics
  • Modules
  • Writing a library in Python
  • Importing a library in Python and using namespaces
  • Using other libraries with pip install
  • Mixing third party libraries with your code.
  • Classes
  • Making simple objects and interacting with them
  • Writing classes basics
  • Differences between classes and functions, and schools of thought on functional vs. object-oriented programming
  • Control Structures
  • For loops
  • While loops
  • If/else statements
  • Try/except
  • Generator expressions
  • List Comprehensions
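
The control structures listed above can be sketched in a few lines. This is only an illustrative example with made-up data (`readings`); it is not drawn from the course notebooks.

```python
readings = ["12", "7", "n/a", "30"]

# For loop with try/except: skip values that cannot be parsed as integers.
parsed = []
for raw in readings:
    try:
        parsed.append(int(raw))
    except ValueError:
        continue

# List comprehension with an if/else expression.
labels = ["high" if n >= 10 else "low" for n in parsed]

# Generator expression: values are produced lazily and consumed by sum().
total_squares = sum(n * n for n in parsed)

# While loop: count how many integer halvings bring the last value down to 1.
value = parsed[-1]
halvings = 0
while value > 1:
    value //= 2
    halvings += 1
```

The generator expression avoids building an intermediate list, which matters more as the input grows.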

Q&A: 15 Minutes

Day 2

Applied Python for AWS for Data Science and ML (180 minutes)

  • Part One: IO Operations in Python and Pandas: 1.5 Hours
  • Writing a file
  • Reading a file
  • Using subprocess and multiprocessing
  • Reading and Writing YAML Files
  • Reading and Writing DataFrames in Pandas
  • Joining, Merging and Querying DataFrames in Pandas
  • Walkthrough: Social Power NBA EDA and ML project
  • Importing and merging DataFrames in Pandas
  • Creating correlation heatmaps
  • Using seaborn lmplot
  • Using linear regression in Python
  • Using ggplot
  • Doing KMeans clustering
  • Using Plotly for interactive Data Visualization
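
For the basic I/O items at the top of this list (writing a file, reading a file, and running a subprocess), a minimal standard-library sketch might look like the following. The file name `notes.txt` and the child command are assumptions chosen for illustration; the Pandas and plotting topics are not shown here.

```python
import subprocess
import sys
import tempfile
from pathlib import Path

# Write a small text file into a temporary directory, then read it back.
with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "notes.txt"  # hypothetical file name
    path.write_text("first line\nsecond line\n", encoding="utf-8")
    lines = path.read_text(encoding="utf-8").splitlines()

# Run a child Python process and capture what it prints.
result = subprocess.run(
    [sys.executable, "-c", "print(2 + 3)"],
    capture_output=True,
    text=True,
    check=True,  # raise CalledProcessError if the child exits non-zero
)
child_output = result.stdout.strip()
```

Using `sys.executable` keeps the child process on the same interpreter as the notebook or script that launches it.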

Q&A: 15 Minutes

Break: 15 Minutes

Part Two: (1 Hour + 15 Minutes)

  • Applied Python and Cloud Basics
  • Introduction to Amazon Web Services: creating accounts, creating users, and using Amazon S3
  • Brief overview of AWS Python Lambda development with Chalice
  • Overview of AWS Step Functions
  • Overview of AWS Batch for ML Jobs
  • Software Carpentry
  • Using Git and GitHub to manage changes
  • Using CircleCI to build and test a project sourced from GitHub
  • Using static analysis and testing tools: Pylint and Pytest
  • Testing Jupyter Notebook

Q&A: 15 Minutes