January 2019
In the last section, we created a function to examine the common n-grams that are found in the headlines of our stories. Now, let's apply that to explore the full content of our stories.
We'll start by exploring bi-grams with the stop words removed. Because headlines are so short compared to the body of the stories, it makes sense to look at them with the stop words intact; within the story text, however, it typically makes sense to eliminate them:
hw, hl = get_word_stats(dfc['text'], 2, 1)
hw
This generates the following output:

Interestingly, we can see that the frivolity we saw in the headlines has completely disappeared. ...
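To make the idea of counting bi-grams with stop words removed concrete, here is a minimal, standard-library sketch. It is not the get_word_stats function from the previous section: the name count_ngrams, the tiny STOP_WORDS set, and the two sample sentences are all hypothetical and chosen only for illustration.

```python
from collections import Counter
import re

# Illustrative stop-word list only (an assumption for this sketch);
# a real analysis would use a much fuller list.
STOP_WORDS = {"the", "a", "an", "of", "to", "in", "and", "is", "on", "for"}


def count_ngrams(texts, n=2, remove_stop_words=True):
    """Count word n-grams across an iterable of strings (hypothetical helper)."""
    counts = Counter()
    for text in texts:
        # Lowercase and split into simple word tokens.
        tokens = re.findall(r"[a-z0-9']+", text.lower())
        if remove_stop_words:
            tokens = [t for t in tokens if t not in STOP_WORDS]
        # Slide a window of width n over the remaining tokens.
        counts.update(
            " ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)
        )
    return counts


# Made-up sample text standing in for the story bodies.
stories = [
    "The president of the United States spoke to the press.",
    "Reporters in the United States covered the press briefing.",
]
top = count_ngrams(stories, n=2).most_common(1)
print(top)  # [('united states', 2)]
```

Note that in this sketch the stop words are dropped before the n-grams are formed, so words that were separated by a stop word in the original text can end up paired together. Passing remove_stop_words=False keeps the original adjacency, which suits short text such as headlines.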