Python 机器学习实践:测试驱动的开发方法 (Machine Learning Practice with Python: A Test-Driven Development Approach)

by Matthew Kirk
January 2018
Content level: Intermediate to advanced
211 pages
8h 31m
Chinese
China Machine Press
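    cat_totals = self.totals
    # Start each category at its prior: its total divided by the overall total.
    aggregates = {cat: cat_totals[cat] / cat_totals['_all'] for cat in self.categories}
    for token in Tokenizer.unique_tokenizer(email.body()):
        for cat in self.categories:
            value = self.training[cat][token]
            # Add one to both counts so an unseen token does not zero the product.
            r = (value + 1) / (cat_totals[cat] + 1)
            aggregates[cat] *= r
    return aggregates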
This code does the following:
First, it trains the model if it’s not already trained (the train method handles this).
For each token in the body of an email, it iterates through all categories and calculates the probability of that token being within that category. This yields the Naive Bayesian score of each category without dividing by the normalizing constant Z.
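The steps above can be sketched as a small, self-contained example. This is only an illustration under stated assumptions: the `TinySpamScorer` class, the `tokenize` helper (a stand-in for `Tokenizer.unique_tokenizer`), and the toy training strings are hypothetical and not taken from the book. The counting scheme for `totals` is likewise assumed. Only the scoring loop mirrors the excerpt.

```python
from collections import defaultdict


def tokenize(text):
    # Hypothetical stand-in for Tokenizer.unique_tokenizer:
    # the set of unique lowercase words in the text.
    return set(text.lower().split())


class TinySpamScorer:
    """Illustrative sketch; not the book's classifier."""

    def __init__(self, categories):
        self.categories = list(categories)
        # training[cat][token] -> how often token was seen in cat
        self.training = {cat: defaultdict(int) for cat in self.categories}
        # totals[cat] -> tokens counted for cat; totals['_all'] -> all tokens
        # (this counting scheme is an assumption)
        self.totals = defaultdict(int)

    def train(self, category, text):
        for token in tokenize(text):
            self.training[category][token] += 1
            self.totals[category] += 1
            self.totals['_all'] += 1

    def score(self, text):
        # Prior for each category: its total over the overall total.
        aggregates = {cat: self.totals[cat] / self.totals['_all']
                      for cat in self.categories}
        for token in tokenize(text):
            for cat in self.categories:
                value = self.training[cat][token]
                # Add-one smoothing keeps unseen tokens from zeroing the product.
                aggregates[cat] *= (value + 1) / (self.totals[cat] + 1)
        return aggregates  # unnormalized: not yet divided by Z


scorer = TinySpamScorer(['spam', 'ham'])
scorer.train('spam', 'buy cheap pills')
scorer.train('ham', 'lunch at noon')

scores = scorer.score('cheap pills')

# Dividing by Z (the sum of the unnormalized scores) turns them into
# values that add up to 1.
z = sum(scores.values())
normalized = {cat: s / z for cat, s in scores.items()}
```

With this toy data the unnormalized score for 'spam' is higher than for 'ham', and dividing by Z does not change which category ranks first. That is why the scoring step can skip the division when only the ranking matters.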
Now that we ...

Publisher Resources

ISBN: 9787111581666