Python Machine Learning, Second Edition

ISBN: 9781787125933

by Sebastian Raschka, Jared Huffman, Vahid Mirjalili, Ryan Sun

September 2017

Intermediate to advanced

622 pages

15h 13m

English

Packt Publishing


Table of Contents
Credits
About the Authors
About the Reviewers
www.PacktPub.com
eBooks, discount offers, and more
Why subscribe?
Packt is Searching for Authors Like You
Preface
What this book covers
What you need for this book

Who this book is for
Conventions
Reader feedback
Customer support
Downloading the example code
Downloading the color images of this book
Errata
Piracy
Questions
1. Giving Computers the Ability to Learn from Data
Building intelligent machines to transform data into knowledge
The three different types of machine learning
Making predictions about the future with supervised learning
Classification for predicting class labels
Regression for predicting continuous outcomes
Solving interactive problems with reinforcement learning
Discovering hidden structures with unsupervised learning
Finding subgroups with clustering
Dimensionality reduction for data compression
Introduction to the basic terminology and notations
A roadmap for building machine learning systems
Preprocessing – getting data into shape
Training and selecting a predictive model
Evaluating models and predicting unseen data instances
Using Python for machine learning
Installing Python and packages from the Python Package Index
Using the Anaconda Python distribution and package manager
Packages for scientific computing, data science, and machine learning
Summary
2. Training Simple Machine Learning Algorithms for Classification
Artificial neurons – a brief glimpse into the early history of machine learning
The formal definition of an artificial neuron
The perceptron learning rule
Implementing a perceptron learning algorithm in Python
An object-oriented perceptron API
Training a perceptron model on the Iris dataset
Adaptive linear neurons and the convergence of learning
Minimizing cost functions with gradient descent
Implementing Adaline in Python
Improving gradient descent through feature scaling
Large-scale machine learning and stochastic gradient descent
Summary
3. A Tour of Machine Learning Classifiers Using scikit-learn
Choosing a classification algorithm
First steps with scikit-learn – training a perceptron
Modeling class probabilities via logistic regression
Logistic regression intuition and conditional probabilities
Learning the weights of the logistic cost function
Converting an Adaline implementation into an algorithm for logistic regression
Training a logistic regression model with scikit-learn
Tackling overfitting via regularization
Maximum margin classification with support vector machines
Maximum margin intuition
Dealing with a nonlinearly separable case using slack variables
Alternative implementations in scikit-learn
Solving nonlinear problems using a kernel SVM
Kernel methods for linearly inseparable data
Using the kernel trick to find separating hyperplanes in high-dimensional space
Decision tree learning
Maximizing information gain – getting the most bang for your buck
Building a decision tree
Combining multiple decision trees via random forests
K-nearest neighbors – a lazy learning algorithm
Summary
4. Building Good Training Sets – Data Preprocessing
Dealing with missing data
Identifying missing values in tabular data
Eliminating samples or features with missing values
Imputing missing values
Understanding the scikit-learn estimator API
Handling categorical data
Nominal and ordinal features
Creating an example dataset
Mapping ordinal features
Encoding class labels
Performing one-hot encoding on nominal features
Partitioning a dataset into separate training and test sets
Bringing features onto the same scale
Selecting meaningful features
L1 and L2 regularization as penalties against model complexity
A geometric interpretation of L2 regularization
Sparse solutions with L1 regularization
Sequential feature selection algorithms
Assessing feature importance with random forests
Summary
5. Compressing Data via Dimensionality Reduction
Unsupervised dimensionality reduction via principal component analysis
The main steps behind principal component analysis
Extracting the principal components step by step
Total and explained variance
Feature transformation
Principal component analysis in scikit-learn
Supervised data compression via linear discriminant analysis
Principal component analysis versus linear discriminant analysis
The inner workings of linear discriminant analysis
Computing the scatter matrices
Selecting linear discriminants for the new feature subspace
Projecting samples onto the new feature space
LDA via scikit-learn
Using kernel principal component analysis for nonlinear mappings
Kernel functions and the kernel trick
Implementing a kernel principal component analysis in Python
Example 1 – separating half-moon shapes
Example 2 – separating concentric circles
Projecting new data points
Kernel principal component analysis in scikit-learn
Summary
6. Learning Best Practices for Model Evaluation and Hyperparameter Tuning
Streamlining workflows with pipelines
Loading the Breast Cancer Wisconsin dataset
Combining transformers and estimators in a pipeline
Using k-fold cross-validation to assess model performance
The holdout method
K-fold cross-validation
Debugging algorithms with learning and validation curves
Diagnosing bias and variance problems with learning curves
Addressing over- and underfitting with validation curves
Fine-tuning machine learning models via grid search
Tuning hyperparameters via grid search
Algorithm selection with nested cross-validation
Looking at different performance evaluation metrics
Reading a confusion matrix
Optimizing the precision and recall of a classification model
Plotting a receiver operating characteristic
Scoring metrics for multiclass classification
Dealing with class imbalance
Summary
7. Combining Different Models for Ensemble Learning
Learning with ensembles
Combining classifiers via majority vote
Implementing a simple majority vote classifier
Using the majority voting principle to make predictions
Evaluating and tuning the ensemble classifier
Bagging – building an ensemble of classifiers from bootstrap samples
Bagging in a nutshell
Applying bagging to classify samples in the Wine dataset
Leveraging weak learners via adaptive boosting
How boosting works
Applying AdaBoost using scikit-learn
Summary
8. Applying Machine Learning to Sentiment Analysis
Preparing the IMDb movie review data for text processing
Obtaining the movie review dataset
Preprocessing the movie dataset into more convenient format
Introducing the bag-of-words model
Transforming words into feature vectors
Assessing word relevancy via term frequency-inverse document frequency
Cleaning text data
Processing documents into tokens
Training a logistic regression model for document classification
Working with bigger data – online algorithms and out-of-core learning
Topic modeling with Latent Dirichlet Allocation
Decomposing text documents with LDA
LDA with scikit-learn
Summary
9. Embedding a Machine Learning Model into a Web Application
Serializing fitted scikit-learn estimators
Setting up an SQLite database for data storage
Developing a web application with Flask
Our first Flask web application
Form validation and rendering
Setting up the directory structure
Implementing a macro using the Jinja2 templating engine
Adding style via CSS
Creating the result page
Turning the movie review classifier into a web application
Files and folders – looking at the directory tree
Implementing the main application as app.py
Setting up the review form
Creating a results page template
Deploying the web application to a public server
Creating a PythonAnywhere account
Uploading the movie classifier application
Updating the movie classifier
Summary
10. Predicting Continuous Target Variables with Regression Analysis
Introducing linear regression
Simple linear regression
Multiple linear regression
Exploring the Housing dataset
Loading the Housing dataset into a data frame
Visualizing the important characteristics of a dataset
Looking at relationships using a correlation matrix
Implementing an ordinary least squares linear regression model
Solving regression for regression parameters with gradient descent
Estimating coefficient of a regression model via scikit-learn
Fitting a robust regression model using RANSAC
Evaluating the performance of linear regression models
Using regularized methods for regression
Turning a linear regression model into a curve – polynomial regression
Adding polynomial terms using scikit-learn
Modeling nonlinear relationships in the Housing dataset
Dealing with nonlinear relationships using random forests
Decision tree regression
Random forest regression
Summary
11. Working with Unlabeled Data – Clustering Analysis
Grouping objects by similarity using k-means
K-means clustering using scikit-learn
A smarter way of placing the initial cluster centroids using k-means++
Hard versus soft clustering
Using the elbow method to find the optimal number of clusters
Quantifying the quality of clustering via silhouette plots
Organizing clusters as a hierarchical tree
Grouping clusters in bottom-up fashion
Performing hierarchical clustering on a distance matrix
Attaching dendrograms to a heat map
Applying agglomerative clustering via scikit-learn
Locating regions of high density via DBSCAN
Summary
12. Implementing a Multilayer Artificial Neural Network from Scratch
Modeling complex functions with artificial neural networks
Single-layer neural network recap
Introducing the multilayer neural network architecture
Activating a neural network via forward propagation
Classifying handwritten digits
Obtaining the MNIST dataset
Implementing a multilayer perceptron
Training an artificial neural network
Computing the logistic cost function
Developing your intuition for backpropagation
Training neural networks via backpropagation
About the convergence in neural networks
A few last words about the neural network implementation
Summary
13. Parallelizing Neural Network Training with TensorFlow
TensorFlow and training performance
What is TensorFlow?
How we will learn TensorFlow
First steps with TensorFlow
Working with array structures
Developing a simple model with the low-level TensorFlow API
Training neural networks efficiently with high-level TensorFlow APIs
Building multilayer neural networks using TensorFlow's Layers API
Developing a multilayer neural network with Keras
Choosing activation functions for multilayer networks
Logistic function recap
Estimating class probabilities in multiclass classification via the softmax function
Broadening the output spectrum using a hyperbolic tangent
Rectified linear unit activation
Summary
14. Going Deeper – The Mechanics of TensorFlow
Key features of TensorFlow
TensorFlow ranks and tensors
How to get the rank and shape of a tensor
Understanding TensorFlow's computation graphs
Placeholders in TensorFlow
Defining placeholders
Feeding placeholders with data
Defining placeholders for data arrays with varying batchsizes
Variables in TensorFlow
Defining variables
Initializing variables
Variable scope
Reusing variables
Building a regression model
Executing objects in a TensorFlow graph using their names
Saving and restoring a model in TensorFlow
Transforming Tensors as multidimensional data arrays
Utilizing control flow mechanics in building graphs
Visualizing the graph with TensorBoard
Extending your TensorBoard experience
Summary
15. Classifying Images with Deep Convolutional Neural Networks
Building blocks of convolutional neural networks
Understanding CNNs and learning feature hierarchies
Performing discrete convolutions
Performing a discrete convolution in one dimension
The effect of zero-padding in a convolution
Determining the size of the convolution output
Performing a discrete convolution in 2D
Subsampling
Putting everything together to build a CNN
Working with multiple input or color channels
Regularizing a neural network with dropout
Implementing a deep convolutional neural network using TensorFlow
The multilayer CNN architecture
Loading and preprocessing the data
Implementing a CNN in the TensorFlow low-level API
Implementing a CNN in the TensorFlow Layers API
Summary
16. Modeling Sequential Data Using Recurrent Neural Networks
Introducing sequential data
Modeling sequential data – order matters
Representing sequences
The different categories of sequence modeling
RNNs for modeling sequences
Understanding the structure and flow of an RNN
Computing activations in an RNN
The challenges of learning long-range interactions
LSTM units
Implementing a multilayer RNN for sequence modeling in TensorFlow
Project one – performing sentiment analysis of IMDb movie reviews using multilayer RNNs
Preparing the data
Embedding
Building an RNN model
The SentimentRNN class constructor
The build method
Step 1 – defining multilayer RNN cells
Step 2 – defining the initial states for the RNN cells
Step 3 – creating the RNN using the RNN cells and their states
The train method
The predict method
Instantiating the SentimentRNN class
Training and optimizing the sentiment analysis RNN model
Project two – implementing an RNN for character-level language modeling in TensorFlow
Preparing the data
Building a character-level RNN model
The constructor
The build method
The train method
The sample method
Creating and training the CharRNN Model
The CharRNN model in the sampling mode
Chapter and book summary
Index

Content preview from Python Machine Learning, Second Edition

Summary

In this chapter, you learned about three different clustering algorithms that can help us with the discovery of hidden structures or information in data. We started this chapter with a prototype-based approach, k-means, which clusters samples into spherical shapes based on a specified number of cluster centroids. Since clustering is an unsupervised method, we do not enjoy the luxury of ground truth labels to evaluate the performance of a model. Thus, we used intrinsic performance metrics such as the elbow method or silhouette analysis as an attempt to quantify the quality of clustering.
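To make the two intrinsic metrics mentioned above concrete, here is a minimal pure-Python sketch, not the book's own code. It runs a plain k-means on a small synthetic dataset, records the within-cluster sum of squared errors (SSE) for several values of k (the quantity plotted for the elbow method), and computes a mean silhouette coefficient. The helper names (`sq_dist`, `assign`, `kmeans`, `mean_silhouette`), the toy blobs, and the restart count are assumptions chosen for illustration. In practice, scikit-learn's `KMeans` (its `inertia_` attribute) and `sklearn.metrics.silhouette_score` provide the same quantities.

```python
import math
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def assign(points, centroids):
    """Index of the nearest centroid for every point."""
    return [min(range(len(centroids)), key=lambda j: sq_dist(p, centroids[j]))
            for p in points]


def kmeans(points, k, n_iter=50, seed=0):
    """Plain k-means with random initial centroids (hypothetical helper)."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)
    for _ in range(n_iter):
        labels = assign(points, centroids)
        updated = []
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                # Move the centroid to the mean of its members.
                updated.append(tuple(sum(c) / len(members) for c in zip(*members)))
            else:
                updated.append(centroids[j])  # leave an empty cluster in place
        if updated == centroids:
            break  # converged
        centroids = updated
    labels = assign(points, centroids)
    sse = sum(sq_dist(p, centroids[lab]) for p, lab in zip(points, labels))
    return centroids, labels, sse


def mean_silhouette(points, labels):
    """Average silhouette coefficient; singleton clusters contribute 0."""
    clusters = {}
    for p, lab in zip(points, labels):
        clusters.setdefault(lab, []).append(p)
    if len(clusters) < 2:
        raise ValueError("silhouette needs at least two clusters")
    total = 0.0
    for p, lab in zip(points, labels):
        own = clusters[lab]
        if len(own) == 1:
            continue
        # a: mean distance to the other members of the same cluster
        a = sum(math.dist(p, q) for q in own) / (len(own) - 1)
        # b: mean distance to the members of the nearest other cluster
        b = min(sum(math.dist(p, q) for q in other) / len(other)
                for m, other in clusters.items() if m != lab)
        total += (b - a) / max(a, b)
    return total / len(points)


# Toy data (an assumption for this sketch): three Gaussian blobs in 2D.
rng = random.Random(1)
centers = [(0.0, 0.0), (5.0, 5.0), (0.0, 5.0)]
points = [(rng.gauss(cx, 0.5), rng.gauss(cy, 0.5))
          for cx, cy in centers for _ in range(30)]

# Elbow method: SSE for k = 1..6, keeping the best of a few random restarts.
sse_by_k = {k: min(kmeans(points, k, seed=s)[2] for s in range(5))
            for k in range(1, 7)}
for k, sse in sse_by_k.items():
    print(f"k={k}  SSE={sse:.1f}")

# Silhouette analysis for one candidate clustering.
_, labels, _ = kmeans(points, 3, seed=0)
sil = mean_silhouette(points, labels)
print(f"mean silhouette for k=3: {sil:.2f}")
```

Reading the printed SSE values from k=1 upward, the "elbow" is the k after which the decrease flattens out. A mean silhouette close to 1 suggests well-separated clusters, while values near 0 suggest overlapping ones. Neither metric needs ground truth labels, which is why both suit unsupervised settings.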

We then looked at a different approach to clustering: agglomerative hierarchical clustering. Hierarchical clustering does not require specifying the number ...
