Chapter 7. PyTorch

In Chapters 5 and 6, you learned how convolutional and recurrent neural networks work by implementing them from scratch. Nevertheless, while understanding how they work is necessary, that knowledge alone isn’t enough to get them working on a real-world problem; for that, you need to be able to implement them in a high-performance library. We could devote an entire book to building a high-performance neural network library, but that would be a much different (or simply much longer) book, for a much different audience. Instead, we’ll devote this last chapter to introducing PyTorch, an increasingly popular neural network framework built on automatic differentiation, which we introduced at the beginning of Chapter 6.

As in the rest of the book, we’ll write our code in a way that maps to the mental models of how neural networks work, writing classes for Layers, Trainers, and so on. This means our code won’t follow common PyTorch practice, but we’ll include links in the book’s GitHub repo where you can learn more about expressing neural networks the way PyTorch was designed to express them. Before we get there, let’s start with the data type at the core of PyTorch, the one that enables its automatic differentiation and thus its ability to express neural network training cleanly: the Tensor.

PyTorch Tensors

In the last chapter, we showed how a simple NumberWithGrad class could accumulate gradients by keeping track of the operations performed on it. This meant that if we performed a series of operations on a NumberWithGrad, we could automatically compute the derivative of the final result with respect to the original number, without deriving it by hand.
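To make this concrete, here is a minimal sketch (our illustration, not code from the book) of the analogous behavior in PyTorch. A Tensor created with requires_grad=True records the operations performed on it, and calling backward on a result fills in the grad attribute of the Tensors it was computed from:

    import torch

    # A Tensor with requires_grad=True records the operations performed
    # on it, much as our NumberWithGrad did.
    a = torch.tensor(3.0, requires_grad=True)

    b = a * 4
    c = b + 5
    d = c ** 2  # d = (4a + 5) ** 2

    # backward walks the recorded operations in reverse, accumulating
    # the gradient of d with respect to a into a.grad.
    d.backward()

    print(a.grad)  # tensor(136.): dd/da = 2 * (4a + 5) * 4 = 136 at a = 3

The specific numbers here are arbitrary; the point is that the gradient falls out of the recorded operations automatically, which is exactly the behavior we built by hand in the last chapter.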
