July 2019
Beginner to intermediate
740 pages
16h 52m
English
The scalers in the previous section handle preprocessing for our numeric data, but how do we deal with categorical data? We need to encode the categories as integer values. There are a few options here, depending on what the categories represent. If a category is binary (such as 0/1, True/False, or yes/no), we can encode it as a single column, where 0 is one option and 1 is the other. We can easily do this with the np.where() function. Let's encode the wine data's kind field as 1 for red and 0 for white:
>>> np.where(wine.kind == 'red', 1, 0)
array([0, 0, 0, ..., 1, 1, 1])
This is effectively a column that tells us whether or not the wine is red. Remember, we concatenated the red wines to the ...
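To try the binary encoding without the full dataset, here is a minimal sketch. The five-row wine DataFrame is a hypothetical stand-in for the real wine data, and its row order is an assumption chosen only to mirror the output shown above. The names is_red and is_red_alt are illustrative and do not come from the text.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the wine data (not the real dataset)
wine = pd.DataFrame({'kind': ['white', 'white', 'white', 'red', 'red']})

# np.where(condition, value_if_true, value_if_false):
# 1 where the wine is red, 0 otherwise
is_red = np.where(wine.kind == 'red', 1, 0)

# An equivalent encoding: cast the Boolean mask to integers
is_red_alt = (wine.kind == 'red').astype(int).to_numpy()

print(is_red)
print(is_red_alt)
```

Both approaches produce the same 0/1 values; np.where() is the more general of the two, since it also lets us pick output values other than 0 and 1.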