Learning Deep Learning: Theory and Practice of Neural Networks, Computer Vision, NLP, and Transformers using TensorFlow

Author: Magnus Ekman
ISBN: 9780137470198

August 2021

Intermediate to advanced

752 pages

21h 59m

English

Addison-Wesley Professional

Cover Page
About This eBook
Halftitle Page
Title Page
Copyright Page
Dedication Page
Contents
Foreword
Foreword
Preface
What Is Deep Learning?; Brief History of Deep Neural Networks; Is This Book for You?; Is DL Dangerous?; Choosing a DL Framework; Prerequisites for Learning DL; About the Code Examples; How to Read This Book; Overview of Each Chapter and Appendix

Acknowledgments
About the Author
Chapter 1. The Rosenblatt Perceptron
Example of a Two-Input Perceptron; The Perceptron Learning Algorithm; Limitations of the Perceptron; Combining Multiple Perceptrons; Implementing Perceptrons with Linear Algebra; Geometric Interpretation of the Perceptron; Understanding the Bias Term; Concluding Remarks on the Perceptron
Chapter 2. Gradient-Based Learning
Intuitive Explanation of the Perceptron Learning Algorithm; Derivatives and Optimization Problems; Solving a Learning Problem with Gradient Descent; Constants and Variables in a Network; Analytic Explanation of the Perceptron Learning Algorithm; Geometric Description of the Perceptron Learning Algorithm; Revisiting Different Types of Perceptron Plots; Using a Perceptron to Identify Patterns; Concluding Remarks on Gradient-Based Learning
Chapter 3. Sigmoid Neurons and Backpropagation
Modified Neurons to Enable Gradient Descent for Multilevel Networks; Which Activation Function Should We Use?; Function Composition and the Chain Rule; Using Backpropagation to Compute the Gradient; Backpropagation with Multiple Neurons per Layer; Programming Example: Learning the XOR Function; Network Architectures; Concluding Remarks on Backpropagation
Chapter 4. Fully Connected Networks Applied to Multiclass Classification
Introduction to Datasets Used When Training Networks; Training and Inference; Extending the Network and Learning Algorithm to Do Multiclass Classification; Network for Digit Classification; Loss Function for Multiclass Classification; Programming Example: Classifying Handwritten Digits; Mini-Batch Gradient Descent; Concluding Remarks on Multiclass Classification
Chapter 5. Toward DL: Frameworks and Network Tweaks
Programming Example: Moving to a DL Framework; The Problem of Saturated Neurons and Vanishing Gradients; Initialization and Normalization Techniques to Avoid Saturated Neurons; Cross-Entropy Loss Function to Mitigate Effect of Saturated Output Neurons; Different Activation Functions to Avoid Vanishing Gradient in Hidden Layers; Variations on Gradient Descent to Improve Learning; Experiment: Tweaking Network and Learning Parameters; Hyperparameter Tuning and Cross-Validation; Concluding Remarks on the Path Toward Deep Learning
Chapter 6. Fully Connected Networks Applied to Regression
Output Units; The Boston Housing Dataset; Programming Example: Predicting House Prices with a DNN; Improving Generalization with Regularization; Experiment: Deeper and Regularized Models for House Price Prediction; Concluding Remarks on Output Units and Regression Problems
Chapter 7. Convolutional Neural Networks Applied to Image Classification
The CIFAR-10 Dataset; Characteristics and Building Blocks for Convolutional Layers; Combining Feature Maps into a Convolutional Layer; Combining Convolutional and Fully Connected Layers into a Network; Effects of Sparse Connections and Weight Sharing; Programming Example: Image Classification with a Convolutional Network; Concluding Remarks on Convolutional Networks
Chapter 8. Deeper CNNs and Pretrained Models
VGGNet; GoogLeNet; ResNet; Programming Example: Use a Pretrained ResNet Implementation; Transfer Learning; Backpropagation for CNN and Pooling; Data Augmentation as a Regularization Technique; Mistakes Made by CNNs; Reducing Parameters with Depthwise Separable Convolutions; Striking the Right Network Design Balance with EfficientNet; Concluding Remarks on Deeper CNNs
Chapter 9. Predicting Time Sequences with Recurrent Neural Networks
Limitations of Feedforward Networks; Recurrent Neural Networks; Mathematical Representation of a Recurrent Layer; Combining Layers into an RNN; Alternative View of RNN and Unrolling in Time; Backpropagation Through Time; Programming Example: Forecasting Book Sales; Dataset Considerations for RNNs; Concluding Remarks on RNNs
Chapter 10. Long Short-Term Memory
Keeping Gradients Healthy; Introduction to LSTM; Alternative View of LSTM; Related Topics: Highway Networks and Skip Connections; Concluding Remarks on LSTM
Chapter 11. Text Autocompletion with LSTM and Beam Search
Encoding Text; Longer-Term Prediction and Autoregressive Models; Beam Search; Programming Example: Using LSTM for Text Autocompletion; Bidirectional RNNs; Different Combinations of Input and Output Sequences; Concluding Remarks on Text Autocompletion with LSTM
Chapter 12. Neural Language Models and Word Embeddings
Introduction to Language Models and Their Use Cases; Examples of Different Language Models; Benefit of Word Embeddings and Insight into How They Work; Word Embeddings Created by Neural Language Models; Programming Example: Neural Language Model and Resulting Embeddings; King – Man + Woman = Queen; King – Man + Woman != Queen; Language Models, Word Embeddings, and Human Biases; Related Topic: Sentiment Analysis of Text; Concluding Remarks on Language Models and Word Embeddings
Chapter 13. Word Embeddings from word2vec and GloVe
Using word2vec to Create Word Embeddings Without a Language Model; Additional Thoughts on word2vec; word2vec in Matrix Form; Wrapping Up word2vec; Programming Example: Exploring Properties of GloVe Embeddings; Concluding Remarks on word2vec and GloVe
Chapter 14. Sequence-to-Sequence Networks and Natural Language Translation
Encoder-Decoder Model for Sequence-to-Sequence Learning; Introduction to the Keras Functional API; Programming Example: Neural Machine Translation; Experimental Results; Properties of the Intermediate Representation; Concluding Remarks on Language Translation
Chapter 15. Attention and the Transformer
Rationale Behind Attention; Attention in Sequence-to-Sequence Networks; Alternatives to Recurrent Networks; Self-Attention; Multi-head Attention; The Transformer; Concluding Remarks on the Transformer
Chapter 16. One-to-Many Network for Image Captioning
Extending the Image Captioning Network with Attention; Programming Example: Attention-Based Image Captioning; Concluding Remarks on Image Captioning
Chapter 17. Medley of Additional Topics
Autoencoders; Multimodal Learning; Multitask Learning; Process for Tuning a Network; Neural Architecture Search; Concluding Remarks
Chapter 18. Summary and Next Steps
Things You Should Know by Now; Ethical AI and Data Ethics; Things You Do Not Yet Know; Next Steps
Appendix A. Linear Regression and Linear Classifiers
Linear Regression as a Machine Learning Algorithm; Computing Linear Regression Coefficients; Classification with Logistic Regression; Classifying XOR with a Linear Classifier; Classification with Support Vector Machines; Evaluation Metrics for a Binary Classifier
Appendix B. Object Detection and Segmentation
Object Detection; Semantic Segmentation; Instance Segmentation with Mask R-CNN
Appendix C. Word Embeddings Beyond word2vec and GloVe
Wordpieces; FastText; Character-Based Method; ELMo; Related Work
Appendix D. GPT, BERT, and RoBERTa
GPT; BERT; RoBERTa; Historical Work Leading Up to GPT and BERT; Other Models Based on the Transformer
Appendix E. Newton-Raphson versus Gradient Descent
Newton-Raphson Root-Finding Method; Relationship Between Newton-Raphson and Gradient Descent
Appendix F. Matrix Implementation of Digit Classification Network
Single Matrix; Mini-Batch Implementation
Appendix G. Relating Convolutional Layers to Mathematical Convolution
Appendix H. Gated Recurrent Units
Alternative GRU Implementation; Network Based on the GRU
Appendix I. Setting Up a Development Environment
Python; Programming Environment; Programming Examples; Datasets; Installing a DL Framework; TensorFlow Specific Considerations; Key Differences Between PyTorch and TensorFlow
Appendix J. Cheat Sheets
Works Cited
Index
Code Snippets

Content preview from Learning Deep Learning: Theory and Practice of Neural Networks, Computer Vision, NLP, and Transformers using TensorFlow

Chapter 16. One-to-Many Network for Image Captioning

We have now spent a number of chapters on working with textual data. Before that, we looked at how convolutional networks can be applied to image data. In this chapter, we describe how to combine a convolutional network and a recurrent network to build a network that performs image captioning. That is, given an image as input, the network generates a textual description of the image. We then describe how to extend the network with attention. We conclude the chapter with a programming example that implements such an attention-based image-captioning network.

Given that this programming example is the most extensive example in the book and we describe it after we described the Transformer, it ...
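The one-to-many idea in the chapter introduction (one image in, a sequence of words out) can be illustrated with a toy sketch. This is not the book's implementation: the vocabulary, the layer sizes, the row-averaging "encoder" that stands in for a convolutional network, the choice to feed the image features in as the initial recurrent state, and every function name below are assumptions made for illustration. The weights are random and untrained, so the generated words are meaningless; only the flow of data is of interest.

```python
import math
import random

# Hypothetical toy vocabulary; index 0 starts a caption and index 1 ends it.
VOCAB = ["<start>", "<end>", "a", "dog", "on", "grass"]
START, END = 0, 1
FEATURE_DIM = 8   # size of the image feature vector (assumed)
STATE_DIM = 8     # size of the recurrent state (assumed)

rng = random.Random(0)  # fixed seed so the sketch is repeatable


def random_matrix(rows, cols):
    """Small random weights standing in for trained parameters."""
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


W_img = random_matrix(STATE_DIM, FEATURE_DIM)   # image features -> initial state
W_state = random_matrix(STATE_DIM, STATE_DIM)   # previous state -> new state
W_word = random_matrix(STATE_DIM, len(VOCAB))   # previous word (one-hot) -> new state
W_out = random_matrix(len(VOCAB), STATE_DIM)    # state -> one score per word


def encode_image(image):
    """Stand-in for the convolutional network: average each row of pixels."""
    return [sum(row) / len(row) for row in image]


def generate_caption(image, max_words=10):
    """Greedy decoding: the image sets the initial state, then words follow one by one."""
    features = encode_image(image)
    state = [math.tanh(v) for v in matvec(W_img, features)]
    word, caption = START, []
    for _ in range(max_words):
        one_hot = [1.0 if i == word else 0.0 for i in range(len(VOCAB))]
        pre = [a + b for a, b in zip(matvec(W_state, state), matvec(W_word, one_hot))]
        state = [math.tanh(v) for v in pre]
        scores = matvec(W_out, state)
        # Pick the highest-scoring word; index 0 (<start>) is never emitted.
        word = max(range(1, len(VOCAB)), key=scores.__getitem__)
        if word == END:
            break
        caption.append(VOCAB[word])
    return caption


# A fake 8x8 grayscale "image"; real code would load pixels from a file.
image = [[rng.random() for _ in range(8)] for _ in range(FEATURE_DIM)]
caption = generate_caption(image)
print(caption)
```

In a real system the encoder would be a trained convolutional network, the recurrent step would typically be an LSTM or similar unit, and the weights would be learned from image and caption pairs. Greedy selection is used here only to keep the loop short.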
