Dummy variables

Dummy variables are used when we want to convert a categorical feature into a quantitative one. Remember that there are two types of categorical features: nominal and ordinal. Ordinal features have a natural order among their categories, while nominal features do not.

Encoding qualitative (nominal) data using separate columns is called making dummy variables. It works by turning each unique category of a nominal column into its own column that is either true or false.
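To make the idea concrete, here is a minimal sketch in plain Python. The `make_dummies` helper, the `city` column, and its values are all hypothetical and made up for illustration; they do not come from any dataset in this text.

```python
def make_dummies(values, prefix):
    """Return one True/False column per unique category in `values`."""
    # Sort the unique categories so the column order is predictable.
    categories = sorted(set(values))
    return {
        f"{prefix}_{category}": [value == category for value in values]
        for category in categories
    }


# Hypothetical nominal column (made-up data for illustration only).
city = ["Boston", "Austin", "Boston", "Denver"]

dummies = make_dummies(city, prefix="city")
for name, column in dummies.items():
    print(name, column)
```

Each row ends up with exactly one true value across the new columns, marking which category that row belonged to. In practice, a library routine such as `pandas.get_dummies` performs this kind of transformation, so a hand-written helper like this one is mainly useful for seeing what happens underneath.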

For example, if we had a column for someone's college major and we wished to plug that information into a linear or logistic regression, we couldn't, because those models only take in numbers! So, for each row, we create new columns that represent the single nominal column. In this case, ...
