Python 机器学习实践:测试驱动的开发方法 (Thoughtful Machine Learning with Python: A Test-Driven Approach)

by Matthew Kirk
January 2018
Content level: Intermediate to advanced
211 pages
8h 31m
Chinese
China Machine Press
Clustering
Figure 9-3: In practice, clustering can be rather "soft"
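Algorithm
The EM clustering algorithm is an iterative process that converges on a cluster
mapping. Each iteration completes two steps: expectation and maximization.
But what does that mean? Expectation and maximization could mean many things.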
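Expectation
Expectation is about updating the truth of the model and seeing how well we are
mapping. In many ways it is a test-driven approach to building clusters: we are
checking how well our model tracks the data. Mathematically, for each row of data
we estimate a probability vector given its previous values.
On the first iteration we assume that all probabilities are equal (unless some
domain knowledge is fed into the model). Given that information, we calculate the
log likelihood of θ in the conditional distribution between our model and the true
value of the data. It can be written as: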
$$Q(\theta \mid \theta_t) = E_{Z \mid X, \theta_t}\left[\log L(\theta; X, Z)\right]$$
θ is the probability model we have assigned to rows. Z and X are the distributions for
our cluster mappings and the original data points.
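The expectation step can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the book's implementation: it assumes a one-dimensional mixture of Gaussians, so θ is the triple (weights, means, variances), and the names gaussian_pdf, e_step, and expected_log_likelihood are hypothetical. The toy data and starting parameters are made up for the example.

```python
import numpy as np

def gaussian_pdf(x, mean, var):
    # Density of a 1-D normal distribution (assumed model, not from the text).
    return np.exp(-0.5 * (x - mean) ** 2 / var) / np.sqrt(2.0 * np.pi * var)

def e_step(x, weights, means, variances):
    # joint[i, k] = weights[k] * N(x[i] | means[k], variances[k])
    joint = weights * gaussian_pdf(x[:, None], means, variances)
    # Normalize each row so the membership probabilities of a point sum to 1.
    return joint / joint.sum(axis=1, keepdims=True)

def expected_log_likelihood(x, resp, weights, means, variances):
    # Q(theta | theta_t): log L(theta; X, Z) averaged over Z, where resp
    # holds the probabilities of Z computed under theta_t.
    joint = weights * gaussian_pdf(x[:, None], means, variances)
    return float(np.sum(resp * np.log(joint)))

# Toy data: two points near 0 and two points near 4.
x = np.array([0.0, 0.2, 3.9, 4.1])

# First iteration: equal cluster probabilities, as the text assumes.
weights = np.array([0.5, 0.5])
means = np.array([0.0, 4.0])
variances = np.array([1.0, 1.0])

resp = e_step(x, weights, means, variances)
# Here Q is evaluated at theta equal to theta_t.
q = expected_log_likelihood(x, resp, weights, means, variances)
```

Each row of resp is the probability vector for one row of data, which is what makes the clustering "soft": a point belongs to every cluster with some probability rather than to exactly one.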
Maximization
Just estimating the log likelihood of something doesn’t solve our problem of assigning
new probabilities to the Z distribution. For that we simply take the argument max of
the expectation function.
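For a concrete sense of what taking the argument max looks like, here is a minimal sketch, again assuming a one-dimensional Gaussian mixture (an assumption made for illustration, not something the text specifies). Under that assumption the θ that maximizes the expectation function has a closed form, so no numerical optimizer is needed. The function name m_step and the membership probabilities in resp are hypothetical.

```python
import numpy as np

def m_step(x, resp):
    # Effective number of points assigned to each cluster.
    nk = resp.sum(axis=0)
    # New mixing weights: the share of the data each cluster accounts for.
    weights = nk / x.shape[0]
    # New means: average of the data weighted by membership probability.
    means = (resp * x[:, None]).sum(axis=0) / nk
    # New variances: weighted average squared distance from the new means.
    variances = (resp * (x[:, None] - means) ** 2).sum(axis=0) / nk
    return weights, means, variances

# Toy data and made-up soft assignments (each row sums to 1).
x = np.array([0.0, 0.2, 3.9, 4.1])
resp = np.array([
    [0.99, 0.01],
    [0.98, 0.02],
    [0.02, 0.98],
    [0.01, 0.99],
])

weights, means, variances = m_step(x, resp)
```

Alternating this update with the expectation step, and stopping once the parameters change very little, gives the iterative process described under Algorithm.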
Publisher Resources

ISBN: 9787111581666