Another way to gain a better understanding of the descriptions is to use topic modeling. We learned about this text mining and machine learning algorithm in Chapter 3, Topic Modeling – Changing Concerns in the State of the Union Addresses. Here, we'll see whether it can derive topics from these descriptions and surface the differences, trends, and patterns in this set of texts.
First, we'll create a new namespace to handle our topic modeling. We'll use the src/ufo_data/tm.clj file. The following is the namespace declaration for it:
(ns ufo-data.tm
  (:require [clojure.java.io :as io]
            [clojure.string :as str]
            [clojure.pprint :as pp])
  (:import [cc.mallet.util.*]
           [cc.mallet.types InstanceList]
           [cc.mallet.pipe ...