Detecting Text Message Spam
University of British Columbia, Sauder School of Business, Canada
CSV - Comma-separated values
SMS - Short Message Service
UTF - Universal Character Set Transformation Format
This chapter is about text classification. Text classification is an important topic in data mining because most communications are stored as text. We will build a RapidMiner process that learns the difference between spam messages and messages that you actually want to read. We will then apply the learned model to new messages to decide whether or not they are spam. Spam is familiar to most readers, which makes it a natural example to work with. The same techniques used to classify spam messages can be ...