December 2017
Beginner to intermediate
660 pages
15h 31m
English
Creating dummy variables is a method of creating a separate variable for each category of a categorical variable. Although a categorical variable contains plenty of information and might show a causal relationship with the output variable, it can't be used in predictive models such as linear and logistic regression without some preprocessing, because these models expect numeric inputs.
In our dataset, sex is a categorical variable with two categories, male and female. We can create two dummy variables from it, as follows:
dummy_sex = pd.get_dummies(data['sex'], prefix='sex')
The result of this statement is as follows:

Fig. 2.17: Dummy variable for the sex variable ...
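To make the behavior of the statement concrete, here is a minimal, self-contained sketch. It assumes a tiny hypothetical DataFrame in place of the book's dataset (the four rows are invented for illustration only), and shows one possible way to attach the resulting dummy columns to the original data.

```python
import pandas as pd

# Hypothetical stand-in for the book's dataset; only the 'sex' column matters here.
data = pd.DataFrame({"sex": ["male", "female", "female", "male"]})

# One indicator column per category, named <prefix>_<category>.
dummy_sex = pd.get_dummies(data["sex"], prefix="sex")

# Columns follow the sorted category labels: sex_female, sex_male.
print(dummy_sex.columns.tolist())

# Attach the dummies to the original frame and drop the raw categorical column.
data_with_dummies = data.join(dummy_sex).drop(columns="sex")
print(data_with_dummies)
```

Each row has exactly one of the two indicators set, since every observation belongs to exactly one category. Depending on the pandas version, the indicator values are displayed either as 0/1 integers or as True/False booleans.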