Python Machine Learning by Example - Third Edition

by Yuxi (Hayden) Liu

October 2020

Beginner to intermediate

526 pages

12h 6m

English

Packt Publishing

Who this book is for | What this book covers | To get the most out of this book | Get in touch
An introduction to machine learning | Understanding why we need machine learning | Differentiating between machine learning and automation | Machine learning applications | Knowing the prerequisites | Getting started with three types of machine learning | A brief history of the development of machine learning algorithms | Digging into the core of machine learning | Generalizing with data | Overfitting, underfitting, and the bias-variance trade-off | Overfitting | Underfitting | The bias-variance trade-off | Avoiding overfitting with cross-validation | Avoiding overfitting with regularization | Avoiding overfitting with feature selection and dimensionality reduction | Data preprocessing and feature engineering | Preprocessing and exploration | Dealing with missing values | Label encoding | One-hot encoding | Scaling | Feature engineering | Polynomial transformation | Power transforms | Binning | Combining models | Voting and averaging | Bagging | Boosting | Stacking | Installing software and setting up | Setting up Python and environments | Installing the main Python packages | NumPy | SciPy | Pandas | Scikit-learn | TensorFlow | Introducing TensorFlow 2 | Summary | Exercises
Getting started with classification | Binary classification | Multiclass classification | Multi-label classification | Exploring Naïve Bayes | Learning Bayes' theorem by example | The mechanics of Naïve Bayes | Implementing Naïve Bayes | Implementing Naïve Bayes from scratch | Implementing Naïve Bayes with scikit-learn | Building a movie recommender with Naïve Bayes | Evaluating classification performance | Tuning models with cross-validation | Summary | Exercise | References
Finding the separating boundary with SVM | Scenario 1 – identifying a separating hyperplane | Scenario 2 – determining the optimal hyperplane | Scenario 3 – handling outliers | Implementing SVM | Scenario 4 – dealing with more than two classes | Scenario 5 – solving linearly non-separable problems with kernels | Choosing between linear and RBF kernels | Classifying face images with SVM | Exploring the face image dataset | Building an SVM-based image classifier | Boosting image classification performance with PCA | Fetal state classification on cardiotocography | Summary | Exercises
A brief overview of ad click-through prediction | Getting started with two types of data – numerical and categorical | Exploring a decision tree from the root to the leaves | Constructing a decision tree | The metrics for measuring a split | Gini Impurity | Information Gain | Implementing a decision tree from scratch | Implementing a decision tree with scikit-learn | Predicting ad click-through with a decision tree | Ensembling decision trees – random forest | Ensembling decision trees – gradient boosted trees | Summary | Exercises
Converting categorical features to numerical – one-hot encoding and ordinal encoding | Classifying data with logistic regression | Getting started with the logistic function | Jumping from the logistic function to logistic regression | Training a logistic regression model | Training a logistic regression model using gradient descent | Predicting ad click-through with logistic regression using gradient descent | Training a logistic regression model using stochastic gradient descent | Training a logistic regression model with regularization | Feature selection using L1 regularization | Training on large datasets with online learning | Handling multiclass classification | Implementing logistic regression using TensorFlow | Feature selection using random forest | Summary | Exercises
Learning the essentials of Apache Spark | Breaking down Spark | Installing Spark | Launching and deploying Spark programs | Programming in PySpark | Learning on massive click logs with Spark | Loading click logs | Splitting and caching the data | One-hot encoding categorical features | Training and testing a logistic regression model | Feature engineering on categorical variables with Spark | Hashing categorical features | Combining multiple variables – feature interaction | Summary | Exercises
A brief overview of the stock market and stock prices | What is regression? | Mining stock price data | Getting started with feature engineering | Acquiring data and generating features | Estimating with linear regression | How does linear regression work? | Implementing linear regression from scratch | Implementing linear regression with scikit-learn | Implementing linear regression with TensorFlow | Estimating with decision tree regression | Transitioning from classification trees to regression trees | Implementing decision tree regression | Implementing a regression forest | Estimating with support vector regression | Implementing SVR | Evaluating regression performance | Predicting stock prices with the three regression algorithms | Summary | Exercises
Demystifying neural networks | Starting with a single-layer neural network | Layers in neural networks | Activation functions | Backpropagation | Adding more layers to a neural network: DL | Building neural networks | Implementing neural networks from scratch | Implementing neural networks with scikit-learn | Implementing neural networks with TensorFlow | Picking the right activation functions | Preventing overfitting in neural networks | Dropout | Early stopping | Predicting stock prices with neural networks | Training a simple neural network | Fine-tuning the neural network | Summary | Exercise
How computers understand language – NLP | What is NLP? | The history of NLP | NLP applications | Touring popular NLP libraries and picking up NLP basics | Installing famous NLP libraries | Corpora | Tokenization | PoS tagging | NER | Stemming and lemmatization | Semantics and topic modeling | Getting the newsgroups data | Exploring the newsgroups data | Thinking about features for text data | Counting the occurrence of each word token | Text preprocessing | Dropping stop words | Reducing inflectional and derivational forms of words | Visualizing the newsgroups data with t-SNE | What is dimensionality reduction? | t-SNE for dimensionality reduction | Summary | Exercises

Learning without guidance – unsupervised learning | Clustering newsgroups data using k-means | How does k-means clustering work? | Implementing k-means from scratch | Implementing k-means with scikit-learn | Choosing the value of k | Clustering newsgroups data using k-means | Discovering underlying topics in newsgroups | Topic modeling using NMF | Topic modeling using LDA | Summary | Exercises
Machine learning solution workflow | Best practices in the data preparation stage | Best practice 1 – Completely understanding the project goal | Best practice 2 – Collecting all fields that are relevant | Best practice 3 – Maintaining the consistency of field values | Best practice 4 – Dealing with missing data | Best practice 5 – Storing large-scale data | Best practices in the training sets generation stage | Best practice 6 – Identifying categorical features with numerical values | Best practice 7 – Deciding whether to encode categorical features | Best practice 8 – Deciding whether to select features, and if so, how to do so | Best practice 9 – Deciding whether to reduce dimensionality, and if so, how to do so | Best practice 10 – Deciding whether to rescale features | Best practice 11 – Performing feature engineering with domain expertise | Best practice 12 – Performing feature engineering without domain expertise | Binarization | Discretization | Interaction | Polynomial transformation | Best practice 13 – Documenting how each feature is generated | Best practice 14 – Extracting features from text data | Tf and tf-idf | Word embedding | Word embedding with pre-trained models | Best practices in the model training, evaluation, and selection stage | Best practice 15 – Choosing the right algorithm(s) to start with | Naïve Bayes | Logistic regression | SVM | Random forest (or decision tree) | Neural networks | Best practice 16 – Reducing overfitting | Best practice 17 – Diagnosing overfitting and underfitting | Best practice 18 – Modeling on large-scale datasets | Best practices in the deployment and monitoring stage | Best practice 19 – Saving, loading, and reusing models | Saving and restoring models using pickle | Saving and restoring models in TensorFlow | Best practice 20 – Monitoring model performance | Best practice 21 – Updating models regularly | Summary | Exercises
Getting started with CNN building blocks | The convolutional layer | The nonlinear layer | The pooling layer | Architecting a CNN for classification | Exploring the clothing image dataset | Classifying clothing images with CNNs | Architecting the CNN model | Fitting the CNN model | Visualizing the convolutional filters | Boosting the CNN classifier with data augmentation | Horizontal flipping for data augmentation | Rotation for data augmentation | Shifting for data augmentation | Improving the clothing image classifier with data augmentation | Summary | Exercises
Introducing sequential learning | Learning the RNN architecture by example | Recurrent mechanism | Many-to-one RNNs | One-to-many RNNs | Many-to-many (synced) RNNs | Many-to-many (unsynced) RNNs | Training an RNN model | Overcoming long-term dependencies with Long Short-Term Memory | Analyzing movie review sentiment with RNNs | Analyzing and preprocessing the data | Building a simple LSTM network | Stacking multiple LSTM layers | Writing your own War and Peace with RNNs | Acquiring and analyzing the training data | Constructing the training set for the RNN text generator | Building an RNN text generator | Training the RNN text generator | Advancing language understanding with the Transformer model | Exploring the Transformer's architecture | Understanding self-attention | Summary | Exercises
Setting up the working environment | Installing PyTorch | Installing OpenAI Gym | Introducing reinforcement learning with examples | Elements of reinforcement learning | Cumulative rewards | Approaches to reinforcement learning | Solving the FrozenLake environment with dynamic programming | Simulating the FrozenLake environment | Solving FrozenLake with the value iteration algorithm | Solving FrozenLake with the policy iteration algorithm | Performing Monte Carlo learning | Simulating the Blackjack environment | Performing Monte Carlo policy evaluation | Performing on-policy Monte Carlo control | Solving the Taxi problem with the Q-learning algorithm | Simulating the Taxi environment | Developing the Q-learning algorithm | Summary | Exercises

Content preview from Python Machine Learning by Example - Third Edition

Other Books You May Enjoy

If you enjoyed this book, you may be interested in these other books by Packt:

Python Machine Learning - Third Edition

Sebastian Raschka, Vahid Mirjalili

ISBN: 978-1-78995-575-0

Master the frameworks, models, and techniques that enable machines to 'learn' from data
Use scikit-learn for machine learning and TensorFlow for deep learning
Apply machine learning to image classification, sentiment analysis, intelligent web applications, and more
Build and train neural networks, GANs, and other models
Discover best practices for evaluating and tuning models
Predict continuous target outcomes using regression analysis
Dig deeper ...