As before, we'll look at the top 10 words Lincoln used. The filter to apply for Abe's addresses is 1861 through 1864:
> sotu_tidy %>% dplyr::filter(year > 1860 & year < 1865) %>% dplyr::count(word, sort = TRUE)
# A tibble: 3,562 x 2
   word           n
   <chr>      <int>
 1 congress      81
 2 united        81
 3 government    75
 4 people        70
 5 war           65
 6 country       62
 7 time          51
 8 union         50
 9 national      49
10 public        48
# ... with 3,552 more rows
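To make the filter-and-count step easier to try in isolation, here is a minimal sketch on a small made-up tibble. The object `toy_tidy` and its contents are hypothetical stand-ins for `sotu_tidy` (assumed to hold one row per word occurrence, with `year` and `word` columns), and the call to `slice_head()` is an optional addition that keeps only the top rows explicitly instead of relying on how a tibble prints.

```r
library(dplyr)

# Hypothetical stand-in for sotu_tidy: one row per word occurrence
toy_tidy <- tibble(
  year = c(1860, 1861, 1861, 1861, 1862, 1862, 1864, 1865),
  word = c("peace", "war", "union", "war", "war", "congress", "union", "peace")
)

# Same pipeline as in the text: keep 1861 through 1864, then count words,
# sorted from most to least frequent
lincoln_counts <- toy_tidy %>%
  dplyr::filter(year > 1860 & year < 1865) %>%
  dplyr::count(word, sort = TRUE)

# Optional: keep only the first n rows of the sorted counts
top_words <- lincoln_counts %>%
  dplyr::slice_head(n = 2)

print(top_words)
```

On the real data, `slice_head(n = 10)` would return a 10-row tibble holding the words shown above.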
No surprise that war is high on the list, given that the Civil War was fought during this period. One way to visualize how the addresses changed and stayed the same is to produce a word cloud for each address. A convenient way to do that is with the qdap package. We first need to filter out Lincoln's speeches from ...