Deep Learning for Coders with fastai and PyTorch

by Jeremy Howard, Sylvain Gugger
July 2020
Intermediate to advanced
621 pages
16h 47m
English
O'Reilly Media, Inc.

Chapter 12. A Language Model from Scratch

We’re now ready to go deep…deep into deep learning! You already learned how to train a basic neural network, but how do you go from there to creating state-of-the-art models? In this part of the book, we’re going to uncover all of the mysteries, starting with language models.

You saw in Chapter 10 how to fine-tune a pretrained language model to build a text classifier. In this chapter, we will explain exactly what is inside that model and what an RNN is. First, let’s gather some data that will allow us to quickly prototype our various models.

The Data

Whenever we start working on a new problem, we always first try to think of the simplest dataset we can that will allow us to try out methods quickly and easily, and interpret the results. When we started working on language modeling a few years ago, we didn’t find any datasets that would allow for quick prototyping, so we made one. We call it Human Numbers, and it simply contains the first 10,000 numbers written out in English.
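
To get a feel for what such a dataset might look like, here is a minimal sketch that spells out integers in English. The helper names (`spell_below_thousand`, `spell_number`), the choice to start counting at one, and the exact wording (for example, no "and" and no hyphens) are our own assumptions for illustration; the real Human Numbers files may be formatted differently.

```python
# Illustrative sketch only: spell out integers in English, roughly the kind of
# text a "numbers written out in English" dataset contains. The wording and
# layout of the real Human Numbers files may differ.
ONES = ["", "one", "two", "three", "four", "five", "six", "seven", "eight",
        "nine", "ten", "eleven", "twelve", "thirteen", "fourteen", "fifteen",
        "sixteen", "seventeen", "eighteen", "nineteen"]
TENS = ["", "", "twenty", "thirty", "forty", "fifty", "sixty", "seventy",
        "eighty", "ninety"]

def spell_below_thousand(n):
    """Return the words for 0 <= n <= 999 as a list (empty list for 0)."""
    words = []
    if n >= 100:
        words += [ONES[n // 100], "hundred"]
        n %= 100
    if n >= 20:
        words.append(TENS[n // 10])
        n %= 10
    if n > 0:
        words.append(ONES[n])
    return words

def spell_number(n):
    """Spell out 0 <= n <= 999_999 in English (hypothetical helper)."""
    if n == 0:
        return "zero"
    words = []
    if n >= 1000:
        words += spell_below_thousand(n // 1000) + ["thousand"]
        n %= 1000
    words += spell_below_thousand(n)
    return " ".join(words)

# Assumption: "the first 10,000 numbers" is taken here to mean 1 through 10,000.
sample = [spell_number(i) for i in range(1, 10_001)]
print(sample[:3])
print(sample[9998])
```

Even this toy version shows why the data is convenient for prototyping: the vocabulary is tiny and the sequence is highly predictable, so a model's mistakes are easy to interpret.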

Jeremy Says

One of the most common practical mistakes I see even among highly experienced practitioners is failing to use appropriate datasets at appropriate times during the analysis process. In particular, most people tend to start with datasets that are too big and too complicated.

We can download, extract, and take a look at our dataset in the usual way:

from fastai.text.all import *
path = untar_data(URLs.HUMAN_NUMBERS)
path.ls()
(#2) [Path('train.txt'),Path('valid.txt')] ...
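
As a possible next step, the two files can be read with plain Python. The sketch below is one way to do it, not necessarily the approach the book takes; `read_split_lines` is a hypothetical helper, and the only things taken from the listing above are the two filenames.

```python
from pathlib import Path

def read_split_lines(folder, names=("train.txt", "valid.txt")):
    """Read each named text file in `folder` and return all lines, in order.

    Hypothetical helper for illustration; the default filenames match the
    directory listing shown above.
    """
    lines = []
    for name in names:
        with open(Path(folder) / name, encoding="utf-8") as f:
            # Drop only the trailing newline so any other spacing is preserved.
            lines += [line.rstrip("\n") for line in f]
    return lines

# Usage, assuming `path` is the folder returned by untar_data above:
#   lines = read_split_lines(path)
#   lines[:5]
```

Keeping the lines in a single list makes it easy to inspect a few entries by hand before deciding how to tokenize them.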

ISBN: 9781492045519