Chapter 8. Text Mining and Social Network Analysis
In this chapter, we will cover the following recipes:
Creating a categorized corpus
Tokenizing news articles in sentences and words
Stemming, lemmatizing, filtering, and TF-IDF scores
Recognizing named entities
Extracting topics with non-negative matrix factorization
Implementing a basic terms database
Computing social network density
Calculating social network closeness centrality
Determining the betweenness centrality
Estimating the average clustering coefficient
Calculating the assortativity coefficient of a graph
Getting the clique number of a graph
Creating a document graph with cosine similarity
Introduction
Humans have communicated through language for thousands of years. Handwritten texts have been around ...