Skip to Content
Python机器学习基础教程
book

Python机器学习基础教程

by Andreas C. Müller, Sarah Guido
January 2018
Intermediate to advanced
301 pages
8h 54m
Chinese
Posts & Telecom Press
Content preview from Python机器学习基础教程
80
2
Out[88]:
Accuracy on training set: 0.988
Accuracy on test set: 0.972
在这个例子中,增大 C 可以显著改进模型,得到
97.2%
的精度。
6.
优点
缺点和参数
核支持向量机是非常强大的模型,在各种数据集上的表现都很好。
SVM
允许决策边界很
复杂,即使数据只有几个特征。它在低维数据和高维数据(即很少特征和很多特征)上的
表现都很好,但对样本个数的缩放表现不好。在有多达
10 000
个样本的数据上运行
SVM
可能表现良好,但如果数据量达到
100 000
甚至更大,在运行时间和内存使用方面可能会
面临挑战。
SVM
的另一个缺点是,预处理数据和调参都需要非常小心。这也是为什么如今很多应用
中用的都是基于树的模型,比如随机森林或梯度提升(需要很少的预处理,甚至不需要预
处理)。此外,
SVM
模型很难检查,可能很难理解为什么会这么预测,而且也难以将模型
向非专家进行解释。
不过
SVM
仍然是值得尝试的,特别是所有特征的测量单位相似(比如都是像素密度)而
且范围也差不多时。
SVM
的重要参数是正则化参数 C、核的选择以及与核相关的参数。虽然我们主要讲的是
RBF
核,但 scikit-learn 中还有其他选择。
RBF
核只有一个参数 gamma,它是高斯核宽度
的倒数。gamma C 控制的都是模型复杂度,较大的值都对应更为复杂的模型。因此,这
两个参数的设定通常是强烈相关的,应该同时调节。
2.3.8
 神经网络
深度学习
一类被称为神经网络的算法最近以“深度学习”的名字再度流行。虽然深度学习在许多机 ...
Become an O’Reilly member and get unlimited access to this title plus top books and audiobooks from O’Reilly and nearly 200 top publishers, thousands of courses curated by job role, 150+ live events each month,
and much more.

Read now

Unlock full access

More than 5,000 organizations count on O’Reilly

AirBnbBlueOriginElectronic ArtsHomeDepotNasdaqRakutenTata Consultancy Services

QuotationMarkO’Reilly covers everything we've got, with content to help us build a world-class technology community, upgrade the capabilities and competencies of our teams, and improve overall team performance as well as their engagement.
Julian F.
Head of Cybersecurity
QuotationMarkI wanted to learn C and C++, but it didn't click for me until I picked up an O'Reilly book. When I went on the O’Reilly platform, I was astonished to find all the books there, plus live events and sandboxes so you could play around with the technology.
Addison B.
Field Engineer
QuotationMarkI’ve been on the O’Reilly platform for more than eight years. I use a couple of learning platforms, but I'm on O'Reilly more than anybody else. When you're there, you start learning. I'm never disappointed.
Amir M.
Data Platform Tech Lead
QuotationMarkI'm always learning. So when I got on to O'Reilly, I was like a kid in a candy store. There are playlists. There are answers. There's on-demand training. It's worth its weight in gold, in terms of what it allows me to do.
Mark W.
Embedded Software Engineer

You might also like

数据驱动力:企业数据分析实战

数据驱动力:企业数据分析实战

Carl Anderson
Python应用开发指南

Python应用开发指南

Posts & Telecom Press, Ninad Sathaye
管理Kubernetes

管理Kubernetes

Brendan Burns, Craig Tracey

Publisher Resources

ISBN: 9787115475619