精通特征工程

by Alice Zheng, Amanda Casari
April 2019
Intermediate to advanced
172 pages
4h 39m
Chinese
Posts & Telecom Press
Linear Modeling and Linear Algebra Basics
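The name "null space" sounds like the sad ending of an existential crisis. If the null space contains any vector other than the all-zero vector, then the equation $Aw = y$, whenever it has a solution at all, has infinitely many solutions. Having many solutions to choose from is not in itself a bad thing; sometimes any one of them will do. But if there are many possible answers, then there are many sets of features that can be used for the classification task, and it becomes hard to tell which features are truly important.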
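To make the "infinitely many solutions" point concrete, here is a minimal NumPy sketch. The matrix `A`, the label vector `y`, and the variable names are hypothetical and chosen only for illustration; the third column of `A` is deliberately the sum of the first two so that the null space is nontrivial.

```python
import numpy as np

# Hypothetical data matrix: column 3 = column 1 + column 2,
# so the columns are linearly dependent.
A = np.array([
    [1.0, 0.0, 1.0],
    [0.0, 1.0, 1.0],
    [1.0, 1.0, 2.0],
    [2.0, 1.0, 3.0],
])
# Build y from a known weight vector so that at least one solution exists.
y = A @ np.array([1.0, 2.0, 0.0])

# Right singular vectors whose singular values are numerically zero
# form a basis of the null space of A.
_, s, vt = np.linalg.svd(A)
tol = max(A.shape) * np.finfo(float).eps * s.max()
rank = int((s > tol).sum())
null_basis = vt[rank:].T  # one column per null-space direction

# One particular solution (the minimum-norm one) ...
w0 = np.linalg.pinv(A) @ y
# ... and a different one, shifted along the null space.
w1 = w0 + 5.0 * null_basis[:, 0]

print("rank:", rank, "null-space dimension:", null_basis.shape[1])
print("w0 solves Aw = y:", np.allclose(A @ w0, y))
print("w1 solves Aw = y:", np.allclose(A @ w1, y))
```

Both weight vectors fit the labels equally well, yet they assign different weights to the same features, which is exactly why feature importance becomes ambiguous.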
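One way to fix the problem of a large null space is to regulate the model by adding an extra constraint:

$$Aw = y, \quad \text{where } w \text{ satisfies } w^T w = c$$

This form of regularization constrains the weight vector to have a specific squared norm $c$. The strength of the regularization is controlled by a regularization parameter, which must be tuned, as in the earlier experiments.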
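The sketch below illustrates the role of the regularization parameter. Note the substitution: instead of the hard equality constraint on $w^T w$ stated above, it uses the closely related penalized (ridge) form, minimizing $\|Aw - y\|^2 + \lambda \|w\|^2$. The function name `ridge_solve`, the matrix `A`, and the values of `lam` are assumptions made for illustration.

```python
import numpy as np

# Hypothetical rank-deficient data matrix (column 3 = column 1 + column 2).
A = np.array([
    [1.0, 0.0, 1.0],
    [0.0, 1.0, 1.0],
    [1.0, 1.0, 2.0],
    [2.0, 1.0, 3.0],
])
y = A @ np.array([1.0, 2.0, 0.0])

def ridge_solve(A, y, lam):
    """Minimize ||A w - y||^2 + lam * ||w||^2 via the normal equations.

    For lam > 0 the matrix A^T A + lam * I is invertible, so the
    solution is unique even though A itself has a nontrivial null space.
    """
    n_features = A.shape[1]
    return np.linalg.solve(A.T @ A + lam * np.eye(n_features), A.T @ y)

squared_norms = []
for lam in (1e-3, 1e-1, 10.0):
    w = ridge_solve(A, y, lam)
    squared_norms.append(w @ w)
    residual = np.linalg.norm(A @ w - y)
    print(f"lam={lam:g}  w^T w={w @ w:.4f}  residual={residual:.4f}")
```

A larger `lam` pulls the squared norm of the weights down at the cost of a larger residual, so in this penalized form choosing `lam` plays the role of choosing the norm budget, and it has to be tuned.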
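In general, feature selection methods pick out the most useful features in order to reduce the computational burden, reduce ambiguity in the model, and bring the learned model closer to being unique. This is the focus of Section 2.6.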
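Another problem is the "unevenness" of the spectrum of the data matrix. When training a linear classifier, we care not only whether the linear system has a general solution, but also whether that solution is easy to find. Typically, the training process uses a solver that computes the gradient of the loss function and then takes small steps downhill, in the direction of the negative gradient. When some singular values are very large and the rest are very close to 0, the solver has to step carefully around the longer singular vectors (those corresponding to the large singular values) and spend a lot of time exploring around the shorter singular vectors to find the true answer. This "unevenness" of the spectrum is measured by the condition number of the matrix, which is the ratio of the largest singular value to the smallest.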
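As a rough illustration of why an uneven spectrum slows a gradient-based solver, the sketch below computes the condition number from the singular values and counts plain gradient descent steps on a least-squares loss. The helper names `condition_number` and `gd_steps`, the two diagonal matrices, the step size rule, and the stopping tolerance are all assumptions made for this example, not something prescribed by the text.

```python
import numpy as np

def condition_number(A):
    """Ratio of the largest to the smallest singular value."""
    s = np.linalg.svd(A, compute_uv=False)  # sorted in descending order
    return s[0] / s[-1]

def gd_steps(A, y, tol=1e-6, max_steps=100_000):
    """Count gradient descent steps on 0.5 * ||A w - y||^2.

    Stops once the gradient norm drops below tol. The step size is
    1 / (largest singular value)^2, a standard safe choice.
    """
    s = np.linalg.svd(A, compute_uv=False)
    step = 1.0 / s[0] ** 2
    w = np.zeros(A.shape[1])
    for k in range(max_steps):
        grad = A.T @ (A @ w - y)
        if np.linalg.norm(grad) < tol:
            return k
        w -= step * grad
    return max_steps

y = np.array([1.0, 1.0])
even = np.diag([1.0, 0.5])     # singular values 1 and 0.5
uneven = np.diag([1.0, 0.05])  # singular values 1 and 0.05

for name, A in (("even", even), ("uneven", uneven)):
    print(f"{name}: condition number = {condition_number(A):.1f}, "
          f"gradient descent steps = {gd_steps(A, y)}")
```

The matrix with the more uneven spectrum needs far more steps, because progress along the direction with the small singular value is very slow at a step size that is safe for the large one.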
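To summarize, for a good linear model to be relatively unique, and for it to be easy to find, we need the following conditions.
(1) By means of the features, the label vector can be reasonably well ...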

Publisher Resources

ISBN: 9787115509680