Lincoln's word frequency

As before, we'll look at the top 10 words Lincoln used. Abe's addresses run from 1861 through 1864, so the filter keeps only those years:

> sotu_tidy %>%
    dplyr::filter(year > 1860 & year < 1865) %>%
    dplyr::count(word, sort = TRUE)
# A tibble: 3,562 x 2
   word           n
   <chr>      <int>
 1 congress      81
 2 united        81
 3 government    75
 4 people        70
 5 war           65
 6 country       62
 7 time          51
 8 union         50
 9 national      49
10 public        48
# ... with 3,552 more rows
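To make the filter-and-count pattern easy to try without the full dataset, here is a minimal sketch on a small made-up table. The `toy_tidy` tibble and its contents are hypothetical stand-ins for `sotu_tidy`; the sketch only assumes the same two columns, `year` and `word`, with one row per token. It uses `dplyr::between()`, which is inclusive at both ends and so matches `year > 1860 & year < 1865` for whole-number years.

```r
library(dplyr)

# Hypothetical stand-in for sotu_tidy: one row per (year, word) token.
# These rows are invented for illustration and are not real address data.
toy_tidy <- dplyr::tibble(
  year = c(1860, 1861, 1861, 1862, 1863, 1864, 1864, 1865),
  word = c("union", "war", "union", "war", "war", "congress", "union", "peace")
)

# Keep only 1861 through 1864 (inclusive), then count each word,
# sorting so the most frequent word comes first.
lincoln_counts <- toy_tidy %>%
  dplyr::filter(dplyr::between(year, 1861, 1864)) %>%
  dplyr::count(word, sort = TRUE)

print(lincoln_counts)
```

The rows for 1860 and 1865 fall outside the window and are dropped before counting, which is the same behavior the year filter gives on the full dataset.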

It's no surprise that war ranks high on the list, given that the Civil War was fought during this period. One way to visualize how the addresses changed and how they stayed the same is to produce a word cloud for each address. A convenient way to do that is with the qdap package. We first need to filter out Lincoln's speeches from ...
