Practical Predictive Analytics by Ralph Winters

Checking the time intervals

Earlier, we mentioned the need for equally sized time intervals. Before performing any time series analysis, we also need to check the number of non-missing time intervals, so let's count the enrollment years for each category.

Using the dplyr package, we can group the data by category and use summarise(n = n()) to count the number of entries in each group:

    # -- summarize and sort by the number of years
    yr.count <- x2 %>%
      group_by(cat) %>%
      summarise(n = n()) %>%
      arrange(n)

    # - we can see that there are 14 years for all of the groups.  That is good!
    print(yr.count, 10)

    > Source: local data frame [24 x 2]
    >
    >                      cat     n
    >                   (fctr) (int)
    > 1         18 to 24 YEARS    14
    > 2         25 to 34 YEARS    14
    > 3         35 to 44 YEARS    14
    > 4         45 to 54 YEARS    14
    ...
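A matching count per category does not by itself show that the intervals are equally sized, since two groups can have the same number of rows while one of them skips a year. The following is a minimal sketch of one way to check for gaps. It is not taken from the book: the toy data frame stands in for x2, and the column name `year` is an assumption about how the enrollment year is stored.

```r
library(dplyr)

# Toy data standing in for x2. The column names `cat` and `year`
# are assumptions made for this illustration.
toy <- data.frame(
  cat  = rep(c("18 to 24 YEARS", "25 to 34 YEARS"), each = 4),
  year = c(1999, 2000, 2001, 2002,   # consecutive years
           1999, 2000, 2002, 2003)   # 2001 is missing
)

gap.check <- toy %>%
  group_by(cat) %>%
  summarise(
    n          = n(),
    first.year = min(year),
    last.year  = max(year),
    # TRUE when every consecutive pair of sorted years is exactly one apart
    equal.step = all(diff(sort(year)) == 1)
  )

print(gap.check)
```

Both toy categories have four rows, so the count alone would not distinguish them; only the `equal.step` column flags the second category, where a year is skipped.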
