O'Reilly logo

R Data Analysis Cookbook - Second Edition by Kuntal Ganguly

Stay ahead with the world's most comprehensive technology and business learning platform.

With Safari, you learn the way you learn best. Get unlimited access to videos, live online training, learning paths, books, tutorials, and more.

Start Free Trial

No credit card required

How to do it...

Income is a numeric variable, and you may want to create a categorical variable from it by creating bins. Suppose you want to label incomes of $10,000 or below as Low, incomes between $10,000 and $31,000 as Medium, and the rest as High. We can do the following:

  1. Create a vector of break points:
> b <- c(-Inf, 10000, 31000, Inf) 
  1. Create a vector of names for break points:
> names <- c("Low", "Medium", "High") 
  1. Cut the vector using the break points:
> students$Income.cat <- cut(students$Income, breaks = b, labels = names) > students     Age State Gender Height Income Income.cat 1   23    NJ      F     61   5000        Low 2   13    NY      M     55   1000        Low 3   36    NJ      M     66   3000        Low 4   31    VA      F     64   4000        Low 5   58    NY      F     70  30000     Medium 6   29    TX      F     63  10000        Low 7 39 NJ M ...

With Safari, you learn the way you learn best. Get unlimited access to videos, live online training, learning paths, books, interactive tutorials, and more.

Start Free Trial

No credit card required