Chapter 11. Training Deep Neural Nets

In Chapter 10 we introduced artificial neural networks and trained our first deep neural network. But it was a very shallow DNN, with only two hidden layers. What if you need to tackle a very complex problem, such as detecting hundreds of types of objects in high-resolution images? You may need to train a much deeper DNN, perhaps with 10 layers or more, each containing hundreds of neurons, connected by hundreds of thousands of connections (a construction sketch follows the list below). This would not be a walk in the park:

  • First, you would be faced with the tricky vanishing gradients problem (or the related exploding gradients problem) that affects deep neural networks and makes lower layers very hard to train.

  • Second, with such a large network, training would be extremely slow.

  • Third, a model with millions of parameters would severely risk overfitting the training set.
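For concreteness, here is roughly what building such a network looks like. This is only a minimal sketch, assuming TensorFlow 1.x (the API used elsewhere in this book); the layer sizes are illustrative choices, not prescriptions from the text:

    import tensorflow as tf

    n_inputs = 28 * 28   # e.g., MNIST-sized inputs (illustrative)
    n_hidden = 100       # "hundreds of neurons" per layer (here, 100)
    n_outputs = 10
    n_hidden_layers = 10

    X = tf.placeholder(tf.float32, shape=(None, n_inputs), name="X")

    # Stack 10 fully connected hidden layers. Even this modest configuration
    # already has well over 150,000 weighted connections to train.
    hidden = X
    for layer in range(n_hidden_layers):
        hidden = tf.layers.dense(hidden, n_hidden, activation=tf.nn.relu,
                                 name="hidden%d" % (layer + 1))
    logits = tf.layers.dense(hidden, n_outputs, name="outputs")

Defining the graph is the easy part; the difficulties listed above show up once you actually try to train it.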

In this chapter, we will go through each of these problems in turn and present techniques to solve them. We will start by explaining the vanishing gradients problem and exploring some of the most popular solutions to it. Next we will look at various optimizers that can speed up training large models tremendously compared to plain Gradient Descent. Finally, we will go through a few popular regularization techniques for large neural networks.

With these tools, you will be able to train very deep nets: welcome to Deep Learning!

Vanishing/Exploding Gradients Problems

As we discussed in Chapter 10, the backpropagation ...
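The effect is easy to demonstrate numerically. The following is a minimal NumPy sketch (not from the book's code): it pushes a gradient backward through a stack of saturating sigmoid layers and prints its norm at each layer. The layer size, weight scale, and random seed are illustrative assumptions.

    import numpy as np

    rng = np.random.RandomState(42)
    n_layers, n_neurons = 10, 100

    def sigmoid(z):
        return 1.0 / (1.0 + np.exp(-z))

    # Forward pass: keep each layer's activations for the backward pass.
    a = rng.randn(n_neurons)
    weights, activations = [], []
    for _ in range(n_layers):
        W = rng.randn(n_neurons, n_neurons)   # naive (unscaled) initialization
        a = sigmoid(W @ a)
        weights.append(W)
        activations.append(a)

    # Backward pass: start from a unit gradient at the top and propagate it
    # down, multiplying by the sigmoid derivative a * (1 - a) and by W^T.
    grad = np.ones(n_neurons)
    for layer in range(n_layers - 1, -1, -1):
        a = activations[layer]
        grad = weights[layer].T @ (grad * a * (1.0 - a))
        print("layer %2d: gradient norm = %g" % (layer + 1, np.linalg.norm(grad)))

With these settings the sigmoid units saturate, their derivatives are tiny, and the gradient norm collapses by several orders of magnitude within a few layers, leaving the lower layers with almost nothing to learn from.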
