Chapter 16. Natural Language Processing with RNNs and Attention

When Alan Turing imagined his famous Turing test in 1950, his objective was to evaluate a machine’s ability to match human intelligence. He could have tested for many things, such as the ability to recognize cats in pictures, play chess, compose music, or escape a maze, but, interestingly, he chose a linguistic task. More specifically, he devised a chatbot capable of fooling its interlocutor into thinking it was human. This test does have its weaknesses: a set of hardcoded rules can fool unsuspecting or naive humans (e.g., the machine could give vague predefined answers in response to some keywords; it could pretend that it is joking or drunk, to get a pass on its weirdest answers; or it could escape difficult questions by answering them with its own questions), and many aspects of human intelligence are utterly ignored (e.g., the ability to interpret nonverbal communication such as facial expressions, or to learn a manual task). But the test does highlight the fact that mastering language is arguably Homo sapiens’s greatest cognitive ability. Can we build a machine that can read and write natural language?

A common approach for natural language tasks is to use recurrent neural networks. We will therefore continue to explore RNNs (introduced in Chapter 15), starting with a character RNN, trained to predict the next character in a sentence. This will allow us to generate some original text, and in the process we ...
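To make the idea concrete, here is a minimal sketch of a character RNN in TensorFlow/Keras, the stack used throughout this book. The toy corpus, window size, and layer sizes below are placeholders chosen for brevity, not the chapter's actual setup:

    import numpy as np
    import tensorflow as tf

    # Toy corpus for illustration only; the chapter trains on a much
    # larger body of text.
    text = "to be or not to be that is the question"

    # Map each character to an integer ID. A plain dict keeps the
    # sketch self-contained (Keras also offers text preprocessing layers).
    chars = sorted(set(text))
    char_to_id = {c: i for i, c in enumerate(chars)}
    encoded = np.array([char_to_id[c] for c in text])

    # Build (input window, next character) training pairs.
    window = 10
    X = np.array([encoded[i:i + window]
                  for i in range(len(encoded) - window)])
    y = encoded[window:]

    # A small sequence model: embed the character IDs, run a GRU over
    # the window, and predict a distribution over the next character.
    model = tf.keras.Sequential([
        tf.keras.layers.Embedding(input_dim=len(chars), output_dim=16),
        tf.keras.layers.GRU(64),
        tf.keras.layers.Dense(len(chars), activation="softmax"),
    ])
    model.compile(loss="sparse_categorical_crossentropy", optimizer="adam")
    model.fit(X, y, epochs=20, verbose=0)

    # Predict the most likely character following a seed string.
    seed = "not to be "
    ids = np.array([[char_to_id[c] for c in seed[-window:]]])
    probs = model.predict(ids, verbose=0)[0]
    print(chars[int(np.argmax(probs))])

In the chapter, the same approach is scaled up: a much larger corpus, a deeper model, and sampling from the predicted distribution (rather than taking the argmax) to generate varied text.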
