Skip to Content
Deep Learning for Coders with fastai and PyTorch
book

Deep Learning for Coders with fastai and PyTorch

by Jeremy Howard, Sylvain Gugger
July 2020
Intermediate to advanced
621 pages
16h 47m
English
O'Reilly Media, Inc.
Book available
Content preview from Deep Learning for Coders with fastai and PyTorch

Chapter 9. Tabular Modeling Deep Dive

Tabular modeling takes data in the form of a table (like a spreadsheet or CSV). The objective is to predict the value in one column based on the values in the other columns. In this chapter, we will look at not only deep learning, but also more general machine learning techniques like random forests, as they can give better results depending on your problem.

We will look at how we should preprocess and clean the data as well as how to interpret the result of our models after training, but first we will see how we can feed columns that contain categories into a model that expects numbers by using embeddings.

Categorical Embeddings

In tabular data, some columns may contain numerical data, like “age,” while others contain string values, like “sex.” The numerical data can be directly fed to the model (with some optional preprocessing), but the other columns need to be converted to numbers. Since the values in those correspond to different categories, we often call this type of variables categorical variables. The first type are called continuous variables.

Jargon: Continuous and Categorical Variables

Continuous variables are numerical data, such as “age,” that can be directly fed to the model, since you can add and multiply them directly. Categorical variables contain a number of discrete levels, such as “movie ID,” for which addition and multiplication don’t have meaning (even if they’re stored as numbers).

At the end of 2015, the Rossmann ...

Become an O’Reilly member and get unlimited access to this title plus top books and audiobooks from O’Reilly and nearly 200 top publishers, thousands of courses curated by job role, 150+ live events each month,
and much more.
Start your free trial

You might also like

Build a Large Language Model (From Scratch)

Build a Large Language Model (From Scratch)

Sebastian Raschka
Hands-On Large Language Models

Hands-On Large Language Models

Jay Alammar, Maarten Grootendorst

Publisher Resources

ISBN: 9781492045519Errata PageSupplemental Content