Classifying text documents using Weka

We used Weka in Chapter 4, Learn from Data - Part 1, to classify data points that are not in text format. Weka is also a very useful tool for classifying text documents with machine-learning models. In this recipe, we will demonstrate how you can develop a document classification model using Weka 3.
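Weka reads datasets in its ARFF format, and a text classification dataset is usually laid out as one string attribute holding the document text plus one nominal attribute holding the class label. The following is a minimal sketch of producing such a file in plain Java, and it is not taken from the recipe itself. The class name `ArffSketch`, the relation name, the labels, and the sample documents are all hypothetical, and the quoting routine is a simplified assumption that covers only backslashes, single quotes, and line breaks.

```java
import java.util.Arrays;
import java.util.List;

public class ArffSketch {

    // Quote a value for ARFF: escape backslashes first, then single quotes
    // and line breaks, and wrap the result in single quotes.
    // Simplified assumption: other special characters are not handled here.
    public static String quote(String s) {
        String escaped = s.replace("\\", "\\\\")
                          .replace("'", "\\'")
                          .replace("\n", "\\n")
                          .replace("\r", "\\r");
        return "'" + escaped + "'";
    }

    // Build ARFF text with a string attribute (the document) and a nominal class.
    // Each element of docs is a pair: docs[i][0] = text, docs[i][1] = label.
    public static String toArff(String relation, List<String> labels, List<String[]> docs) {
        StringBuilder sb = new StringBuilder();
        sb.append("@relation ").append(relation).append("\n\n");
        sb.append("@attribute text string\n");
        sb.append("@attribute class {").append(String.join(",", labels)).append("}\n\n");
        sb.append("@data\n");
        for (String[] d : docs) {
            sb.append(quote(d[0])).append(",").append(d[1]).append("\n");
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Hypothetical labels and documents, for illustration only.
        List<String> labels = Arrays.asList("spam", "ham");
        List<String[]> docs = Arrays.asList(
            new String[] {"Win a free prize now", "spam"},
            new String[] {"Let's meet at noon", "ham"});
        System.out.print(toArff("documents", labels, docs));
    }
}
```

A file built this way could then be loaded into Weka, where a filter such as `StringToWordVector` is commonly applied to turn the string attribute into word features before a classifier is trained. Whether the recipe follows exactly this route is an assumption.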

Getting ready

  1. To download Weka, go to http://www.cs.waikato.ac.nz/ml/weka/downloading.html, where you will find download options for Windows, Mac, and other operating systems such as Linux. Read through the options carefully and download the appropriate version. At the time of writing, 3.9.0 was the latest developer version, and as the author already had version 1.8 JVM installed in his 64-bit ...
