
Practical Predictive Analytics by Ralph Winters


Binning character data

Character data is usually grouped according to some sort of hierarchy, but occasionally you will want to group it based on a text pattern contained in the string itself. Here is an example of binning character data by year (the first four characters of each value in cats):

cats <- as.factor(c('2016-1','2016-2','2016-3'))
sales <- c(10,20,30)
x <- cbind.data.frame(cats,sales)
x
str(x)
binned <- x
binned
levels(binned$cats) <- substring(levels(binned$cats), 1, 4)
binned

> cats <- as.factor(c('2016-1','2016-2','2016-3'))
> sales <- c(10,20,30)
> x <- cbind.data.frame(cats,sales)
> x
    cats sales
1 2016-1    10
2 2016-2    20
3 2016-3    30
> str(x)
'data.frame':  3 obs. of  2 variables:
 $ cats : Factor w/ 3 levels "2016-1","2016-2",..: ...
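Once the levels have been shortened to the year, R merges any levels that end up with the same label, so the binned factor can be used directly for grouping. The following is a minimal sketch of that idea, not part of the original example: it adds a hypothetical '2017-1' row (with an assumed sales value of 40) so that more than one bin appears, and it uses base R's aggregate() to total sales per bin.

```r
# Toy data from the example above, plus a hypothetical 2017 row
# (the '2017-1' category and its sales value are assumptions for illustration)
cats <- as.factor(c('2016-1', '2016-2', '2016-3', '2017-1'))
sales <- c(10, 20, 30, 40)
x <- data.frame(cats, sales)

binned <- x

# Keep only the first four characters (the year) of each level.
# Assigning duplicate labels to levels() merges the matching levels.
levels(binned$cats) <- substring(levels(binned$cats), 1, 4)

# Inspect the resulting bins
levels(binned$cats)
nlevels(binned$cats)

# Total sales within each year bin
totals <- aggregate(sales ~ cats, data = binned, FUN = sum)
totals
```

With this toy data, the four original categories collapse into two bins, 2016 and 2017, and the 2016 bin sums the three 2016 rows. Working on a copy (binned) rather than on x keeps the original categories available if a different pattern is needed later.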
