December 2018
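One thing we haven't mentioned is what happens when a word in the email you're classifying wasn't in your training set. Without any correction, that word's count is zero, so its conditional probability is zero, and because the classifier multiplies the per-word probabilities together, a single unseen word would force the whole product to zero. To handle this case, we need to add a smoothing factor. The following modified code shows where the smoothing factor, alpha, is added: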
    # Gives the conditional probability p(B_i | A_x) with smoothing
    def conditionalWord(word, spam):
        if spam:
            return (trainPositive.get(word, 0) + alpha) / float(positiveTotal + alpha * numWords)
        return (trainNegative.get(word, 0) + alpha) / float(negativeTotal + alpha * numWords)
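To see the effect of the smoothing factor on concrete numbers, here is a minimal, self-contained sketch. The tiny training sets, the `train_counts` helper, and the snake_case names are assumptions made for illustration and do not come from the original code. It also assumes that `numWords` means the number of distinct words seen across both classes, and it uses `alpha = 1.0`, which is commonly called Laplace (add-one) smoothing.

```python
from collections import Counter

def train_counts(docs):
    """Count word occurrences across a list of documents (hypothetical helper)."""
    counts = Counter()
    for doc in docs:
        counts.update(doc.lower().split())
    return counts

# Toy training data, invented for this example
spam_docs = ["win cash now", "cash prize now"]
ham_docs = ["meeting at noon", "lunch at noon"]

train_positive = train_counts(spam_docs)   # word counts in spam
train_negative = train_counts(ham_docs)    # word counts in non-spam
positive_total = sum(train_positive.values())
negative_total = sum(train_negative.values())

# Assumption: numWords is the size of the combined vocabulary
num_words = len(set(train_positive) | set(train_negative))
alpha = 1.0  # smoothing factor

def conditional_word(word, spam):
    """Smoothed estimate of p(word | class)."""
    if spam:
        counts, total = train_positive, positive_total
    else:
        counts, total = train_negative, negative_total
    return (counts.get(word, 0) + alpha) / (total + alpha * num_words)

seen = conditional_word("cash", True)      # word that appears in spam
unseen = conditional_word("refund", True)  # word never seen in training
print(seen, unseen)
```

With this toy data, the unseen word "refund" gets a small but non-zero probability instead of zero, so it no longer wipes out the product of the other word probabilities. Larger values of `alpha` pull all estimates closer to a uniform distribution over the vocabulary.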