Deep Learning for Coders with fastai and PyTorch

by Jeremy Howard, Sylvain Gugger

July 2020

Intermediate to advanced

621 pages

16h 47m

English

O'Reilly Media, Inc.


Chapter 6. Other Computer Vision Problems

In the previous chapter, you learned some important practical techniques for training models. Choices such as the learning rate and the number of epochs matter a great deal for getting good results.

In this chapter, we are going to look at two other types of computer vision problems: multi-label classification and regression. The first one occurs when you want to predict more than one label per image (or sometimes none at all), and the second occurs when your labels are one or several numbers—a quantity instead of a category.

In the process, we will study more deeply the output activations, targets, and loss functions in deep learning models.

Multi-Label Classification

Multi-label classification refers to the problem of identifying the categories of objects in images that may not contain exactly one type of object. There may be more than one kind of object, or there may be no objects at all in the classes you are looking for.
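As a rough illustration of the difference, here is a minimal sketch in plain Python. It is not the fastai code the chapter goes on to use. The class list, the helper names, and the 0.5 threshold are all assumptions made for this example. It shows a multi-hot target vector (any number of entries can be 1, including none) and contrasts a softmax-style prediction, which always picks exactly one class, with an independent per-class sigmoid threshold, which can return several labels or an empty list.

```python
import math

# Hypothetical label vocabulary, chosen only for this illustration.
CLASSES = ["grizzly", "black", "teddy"]

def encode_labels(labels, classes=CLASSES):
    """Turn a list of label names into a multi-hot target vector."""
    present = set(labels)
    return [1.0 if c in present else 0.0 for c in classes]

def sigmoid(x):
    """Squash one raw score into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))

def softmax(scores):
    """Turn raw scores into probabilities that sum to 1."""
    top = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def predict_single_label(scores, classes=CLASSES):
    """Single-label prediction: always returns exactly one class."""
    probs = softmax(scores)
    return classes[probs.index(max(probs))]

def predict_multi_label(scores, classes=CLASSES, threshold=0.5):
    """Multi-label prediction: each class is judged on its own.

    The threshold of 0.5 is an assumed default, not a recommended value.
    """
    return [c for c, s in zip(classes, scores) if sigmoid(s) > threshold]

# Raw scores for an image in which no class looks likely (all negative).
no_object_scores = [-3.0, -2.0, -4.0]

print(encode_labels(["grizzly", "teddy"]))     # two labels present
print(encode_labels([]))                       # no labels present
print(predict_single_label(no_object_scores))  # forced to name one class
print(predict_multi_label(no_object_scores))   # free to name none
```

With the same low scores, the single-label prediction still names a class, while the thresholded version returns an empty list. This is the behavior that lets a multi-label model report that none of the classes it knows about is present.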

For instance, this would have been a great approach for our bear classifier. One problem with the bear classifier that we rolled out in Chapter 2 was that if a user uploaded something that wasn’t any kind of bear, the model would still say it was either a grizzly, black, or teddy bear—it had no ability to predict “not a bear at all.” In fact, after we have completed this chapter, it would be a great exercise for you to go back to your image classifier application and try to retrain it using the ...