Beginning Machine Learning with PyTorch
Becoming familiar with fast neural networks
PyTorch is one of the premier libraries for programming deep neural networks in Python, or indeed in any language. Like its main open source competitor, TensorFlow, PyTorch takes advantage of GPUs and distributed clusters. For many developers and data scientists, the paradigms used in PyTorch are a more natural fit for Python and data analysis than the more graph-oriented abstractions seen elsewhere. In this first course, we introduce general concepts of machine learning and delve into the general design of neural network layers of different types. We implement several practical applications of neural networks, exploring the necessary PyTorch code.
What you'll learn, and how you can apply it
- What is machine learning?
- What are neural networks and PyTorch?
- Distributing and accelerating computation with GPUs and clusters.
- Classification, regression, and clustering with NNs.
- Overview of advanced applications of NNs.
This training course is for you because...
- You are an aspiring or beginning data scientist.
- You have a comfortable intermediate-level knowledge of Python and a very basic familiarity with statistics and linear algebra.
- You are a working programmer or student who is motivated to expand your skills to include machine learning with Python.
- You have heard about the enormous promise and power of deep neural networks.
Prerequisites
- A first course in Python and/or working experience as a programmer
- College-level basic mathematics
Students should have a system with Jupyter notebooks installed, a recent version of scikit-learn, along with Pandas, NumPy, and matplotlib, and the general scientific Python tool stack. The training materials will be made available as notebooks, here: https://github.com/DavidMertz/PyTorch-webinar.
PyTorch often works vastly faster when utilizing a CUDA GPU to perform training. Students who wish to follow along by running the material on their own machines in real time are advised to obtain access to a GPU machine while attending this webinar.
Numerous cloud services provide access to rented GPU instances at reasonable hourly costs. AWS EC2 instances are very well known, and can be leased with good GPU configurations. The author is very fond of a service called vast.ai (https://vast.ai/) that he will use during the presentation of the webinar. Of course, if you have any moderately recent CUDA-enabled GPU on your home or work machine, you will be fine also.
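To confirm that a machine is ready before the session, something like the following sketch may help. It is not part of the official course materials. It uses only the standard library to look for the packages named above, and asks PyTorch about CUDA only if PyTorch is actually installed. The helper names `missing_packages` and `describe_device` are hypothetical, and the import name `jupyter_core` is an assumption about how Jupyter is installed on your system.

```python
import importlib.util

# Import names for the packages mentioned above; "torch" is PyTorch itself.
# "jupyter_core" is an assumed import name for a Jupyter installation.
REQUIRED = ["jupyter_core", "sklearn", "pandas", "numpy", "matplotlib", "torch"]


def missing_packages(names):
    """Return the names that cannot be found on the import path."""
    return [name for name in names if importlib.util.find_spec(name) is None]


def describe_device():
    """Report whether PyTorch can see a CUDA GPU, without failing if torch is absent."""
    if importlib.util.find_spec("torch") is None:
        return "PyTorch is not installed"
    import torch

    if torch.cuda.is_available():
        return f"CUDA GPU available: {torch.cuda.get_device_name(0)}"
    return "No CUDA GPU found; PyTorch will run on the CPU"


if __name__ == "__main__":
    print("Missing packages:", missing_packages(REQUIRED) or "none")
    print(describe_device())
```

If the script reports no CUDA GPU, the notebooks should still run on the CPU, only more slowly.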
These resources are optional, but helpful if you need a refresher on Python, Jupyter Notebooks, or Pandas:
- (video) Python Programming Language LiveLessons by David Beazley
- (video) Modern Python LiveLessons: Big Ideas and Little Code in Python by Raymond Hettinger
- (video) Using Jupyter Notebooks for Data Science Analysis in Python LiveLessons by Jamie Whitacre
- (video) Pandas Data Analysis with Python Fundamentals by Daniel Y. Chen
- (Live Online Training) Beginner Machine Learning with scikit-learn by David Mertz - dates vary; search Safari to register
- (Live Online Training) Intermediate Machine Learning with scikit-learn by David Mertz - dates vary; search Safari to register
- (book) Hands-On Machine Learning with Scikit-Learn and TensorFlow: Concepts, Tools, and Techniques to Build Intelligent Systems by Aurélien Géron
- (book) Introduction to Machine Learning with Python by Sarah Guido, Andreas C. Müller
About your instructor
David Mertz was most recently a Senior Trainer and Senior Software Developer for Anaconda, Inc., in which role he created and structured its training program. He was a Director of the Python Software Foundation (PSF) for six years and remains co-chair of its Trademarks Committee and of the PSF Scientific Python Working Group. David worked for nine years with D. E. Shaw Research, some folks who built the world's fastest, highly-specialized (down to the ASICs and network layer) supercomputer for performing molecular dynamics.
David wrote the widely read columns Charming Python and XML Matters for IBM developerWorks, short books for O'Reilly, and the Addison-Wesley book Text Processing in Python. He has spoken at multiple OSCons, PyCons, and AnacondaCon, and was invited to be a keynote speaker at PyCon-India, PyCon-UK, PyCon-ZA, PyCon Belarus, PyCon Cuba, and PyData SF.
David is pleased to find Python becoming the default high-level language for most scientific computing projects.
The timeframes are only estimates and may vary according to how the class is progressing.
Lesson 1: What is Machine Learning? What is Deep Learning? (1.5 hours)
- Understand the difference between "deep learning" and other ML techniques
- Describe the techniques used in machine learning
- Understand classification versus regression versus clustering
- Perform dimensionality reduction
- Explain feature engineering
- Utilize feature selection
- Distinguish categorical versus ordinal versus continuous variables
- Perform one-hot encoding
- Types of network layers
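As a taste of one Lesson 1 topic, here is a minimal sketch of one-hot encoding in plain Python. The `one_hot` helper and the `colors` data are hypothetical illustrations, not course code. In practice a library routine such as `pandas.get_dummies` or scikit-learn's `OneHotEncoder` would normally be used instead.

```python
def one_hot(values):
    """One-hot encode a list of categorical values.

    Returns the sorted list of categories and one 0/1 row per input value.
    """
    categories = sorted(set(values))
    index = {cat: i for i, cat in enumerate(categories)}
    rows = []
    for value in values:
        row = [0] * len(categories)
        row[index[value]] = 1  # mark the column for this value's category
        rows.append(row)
    return categories, rows


colors = ["red", "green", "blue", "green"]  # hypothetical categorical feature
categories, encoded = one_hot(colors)
print(categories)  # ['blue', 'green', 'red']
print(encoded)     # [[0, 0, 1], [0, 1, 0], [1, 0, 0], [0, 1, 0]]
```

Each category becomes its own column, so the encoding imposes no artificial ordering on categorical values, which is the reason it is preferred over integer codes for non-ordinal data.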
Lesson 2: Understanding PyTorch (1 hour)
- Tensors and NumPy interfaces
- Using GPUs with torch.cuda
- Parallelizing on clusters with torch.distributed
- Create a neural network with torch.nn
- Differences from TensorFlow, Keras, etc.
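The first few Lesson 2 topics can be previewed with a short sketch like the one below. It is an illustration under assumptions, not an excerpt from the course notebooks: the array contents and layer sizes are arbitrary. It shows a tensor created from a NumPy array, device selection with `torch.cuda`, and a small network built with `torch.nn`.

```python
import numpy as np
import torch
from torch import nn

# Tensors and NumPy: from_numpy shares memory with the source array.
array = np.arange(6, dtype=np.float32).reshape(2, 3)
tensor = torch.from_numpy(array)

# Use a CUDA GPU when one is visible, otherwise fall back to the CPU.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# A tiny fully connected network; the layer sizes are arbitrary.
model = nn.Sequential(
    nn.Linear(3, 4),
    nn.ReLU(),
    nn.Linear(4, 1),
).to(device)

# Forward pass: 2 rows of 3 features in, 2 predictions out.
with torch.no_grad():
    output = model(tensor.to(device))

print(output.shape)  # torch.Size([2, 1])
```

The same code runs unchanged on a CPU-only machine, because the device is chosen at run time rather than hard-coded.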
Lesson 3: Tasks with Networks (1.5 hours)
- An image classifier
- A regression prediction
- Clustering with NNs (note: https://github.com/MarcTLaw/DeepSpectralClusteringToy)
- Generative Adversarial Networks (GAN)
- Reinforcement Learning
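For a sense of what the regression task in Lesson 3 involves, the following is a minimal sketch of a PyTorch training loop. It is not the course's own example: the synthetic data (a noisy line, y = 2x + 1), the single linear layer, the learning rate, and the epoch count are all assumptions chosen to keep the illustration short.

```python
import torch
from torch import nn

torch.manual_seed(0)  # make the run repeatable

# Synthetic data (an assumption for illustration): y = 2x + 1 plus small noise.
x = torch.linspace(-1, 1, 100).unsqueeze(1)
y = 2 * x + 1 + 0.05 * torch.randn_like(x)

model = nn.Linear(1, 1)  # one input feature, one predicted value
loss_fn = nn.MSELoss()
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)

losses = []
for epoch in range(200):
    optimizer.zero_grad()          # clear gradients from the previous step
    loss = loss_fn(model(x), y)    # forward pass and mean squared error
    loss.backward()                # backpropagate
    optimizer.step()               # update weight and bias
    losses.append(loss.item())

print(f"first loss {losses[0]:.4f}, last loss {losses[-1]:.4f}")
print(f"weight {model.weight.item():.2f}, bias {model.bias.item():.2f}")
```

The learned weight and bias should end up close to 2 and 1. The same zero_grad, forward, backward, step pattern carries over to larger networks such as an image classifier.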