December 2018
Beginner to intermediate
682 pages
18h 1m
English
A Naive Bayes classifier is developed here using the SMS Spam Collection data available at http://www.dt.fee.unicamp.br/~tiago/smsspamcollection/. Various NLP techniques are used in this chapter to preprocess the text before building the Naive Bayes model. The data is first read as a tab-delimited file:
>>> import csv
>>> smsdata = open('SMSSpamCollection.txt','r')
>>> csv_reader = csv.reader(smsdata,delimiter='\t')
The following lines, which use the sys package, can be run if any UTF-8 errors are encountered with older versions of Python (Python 2). They are not necessary with Python 3.6 or later:
>>> # Python 2 only: reload() is not a builtin in Python 3
>>> import sys
>>> reload(sys)
>>> sys.setdefaultencoding('utf-8')
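On Python 3, a common alternative is to pass the encoding directly to open() rather than changing the interpreter default. The following is a minimal sketch under that assumption. The helper name load_sms_rows is hypothetical and not part of the original code, and the file is assumed to be UTF-8 encoded with one tab-separated label and message per line:

```python
import csv


def load_sms_rows(path, encoding="utf-8"):
    """Read a tab-separated SMS file and return its rows as lists of strings.

    Hypothetical helper for illustration; assumes the file at `path`
    is encoded as `encoding` (UTF-8 by default).
    """
    # newline="" is the setting the csv module documentation recommends
    # for file objects handed to csv.reader.
    with open(path, "r", encoding=encoding, newline="") as smsdata:
        csv_reader = csv.reader(smsdata, delimiter="\t")
        # Skip completely empty lines, keep everything else unchanged.
        return [row for row in csv_reader if row]
```

Calling load_sms_rows('SMSSpamCollection.txt') would then return one list per line of the file, with the with block closing the file once reading is finished.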
Normal coding starts from here as usual:
>>> ...