June 2017
Beginner to intermediate
576 pages
15h 22m
English
A dummy variable is a binary flag (0 or 1) that designates the presence or absence of a feature. If you are using dummy variables, you need one fewer dummy variable than the number of levels in the categorical variable, because the omitted level is implied when all of the flags are 0. For example, if you have a category designating humidity with only two levels, such as High and Low, you only need to create one dummy variable. Let's say it is called is.humid. If the value of humidity is High, is.humid=1. If humidity is Low, is.humid=0. However, many predictive analytics functions create dummy variables internally, so there is less need to code them manually than there used to be. But you may still want to create flags that designate the levels ...
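The levels-minus-1 rule can be illustrated with a minimal sketch. It is written in plain Python rather than with any particular analytics library, and the helper name `dummy_encode`, its `baseline` parameter, and the sample data are all hypothetical, introduced only for this illustration. The key "is.humid" mirrors the variable name used in the text.

```python
def dummy_encode(values, baseline):
    """Build one 0/1 flag list per non-baseline level.

    For a variable with k distinct levels this returns k - 1 flag lists;
    the baseline level is implied when every flag is 0.
    """
    levels = sorted(set(values))
    if baseline not in levels:
        raise ValueError(f"baseline {baseline!r} is not among the levels {levels}")
    return {
        level: [1 if value == level else 0 for value in values]
        for level in levels
        if level != baseline
    }


# Hypothetical sample data: humidity has two levels, so one flag is enough.
humidity = ["High", "Low", "Low", "High"]
flags = dummy_encode(humidity, baseline="Low")

# Rename the single flag to match the text's example variable.
row_flags = {"is.humid": flags["High"]}

# A three-level variable (also hypothetical) needs two flags.
outlook = ["Sunny", "Rainy", "Overcast", "Sunny"]
outlook_flags = dummy_encode(outlook, baseline="Overcast")
```

Here `row_flags["is.humid"]` is 1 wherever humidity is High and 0 wherever it is Low, and `outlook_flags` holds two flag lists for the three-level variable. Which level serves as the baseline is a modeling choice; this sketch simply takes it as an argument.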