June 2017
Beginner to intermediate
576 pages
15h 22m
English
Based upon these frequencies, we will filter the data to include only a subset of the top categories. We will exclude terms that do not describe the physical product, such as design, set, and any associated colors:
# Testing OnlineRetail2 <- OnlineRetail
OnlineRetail2 <- subset(OnlineRetail, lastword %in% c("BAG", "CASES", "HOLDER", "BOX", "SIGN", "CHRISTMAS", "BOTTLE", "BUNTING", "MUG", "BOWL", "CANDLES", "COVER", "HEART"))
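To see how the `%in%` filter behaves in isolation, here is a minimal sketch on a small made-up data frame. The `toy` data and the `keep` vector are illustrative assumptions and are not part of the OnlineRetail data:

```r
# Hypothetical toy data standing in for OnlineRetail (values are made up)
toy <- data.frame(
  Description = c("JUMBO BAG", "RED DESIGN", "GIFT BOX", "TEA SET", "LUNCH BAG"),
  lastword    = c("BAG", "DESIGN", "BOX", "SET", "BAG"),
  stringsAsFactors = FALSE
)

# Categories to keep; terms such as DESIGN and SET are deliberately left out
keep <- c("BAG", "BOX")

# %in% yields TRUE for each row whose lastword appears in keep,
# and subset() retains only those rows
toy2 <- subset(toy, lastword %in% keep)

print(toy2)
```

Because `%in%` only tests membership, listing a category more than once in the vector has no effect on the result.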
Run the table() function again on the results to see the new frequencies:
head(as.data.frame(sort(table(OnlineRetail2$lastword[]), decreasing = TRUE)), 10)
> sort(table(OnlineRetail2$lastword[]), decreasing = TRUE)
> HOLDER 6792
> BOX 6528
> SIGN 6184
> BAG 5761
...
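The counting step can also be sketched on a toy vector. One point worth checking in your own session: if `lastword` happens to be stored as a factor (an assumption; the text does not say), filtered-out categories remain as levels and `table()` reports them with a count of zero. The names `words`, `kept`, and `kept2` below are hypothetical:

```r
# Hypothetical factor of last words (values are made up)
words <- factor(c("BAG", "BOX", "BAG", "SET", "BAG", "BOX"))

# Keep only the categories of interest
kept <- words[words %in% c("BAG", "BOX")]

# The unused level SET is still present, so it shows up with a count of 0
print(table(kept))

# Drop unused levels, then count and sort in decreasing order
kept2 <- droplevels(kept)
freq  <- sort(table(kept2), decreasing = TRUE)

# Convert to a data frame and keep at most the top 10 rows
top <- head(as.data.frame(freq), 10)
print(top)
```

If `lastword` is a character vector rather than a factor, the `droplevels()` step is unnecessary, since `table()` then counts only the values that are actually present.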