Python机器学习手册:从数据预处理到深度学习
by Chris Albon
July 2019
Intermediate to advanced
365 pages
8h 13m
Chinese
Publishing House of Electronics Industry
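Discussion
The size of k has a significant effect on the performance of a KNN classifier. In machine learning we are always trying to strike a balance between bias and variance, and the value of k affects that balance directly. If k = n (where n is the number of observations), bias is high and variance is low. If k = 1, bias is low but variance is high. The best KNN classifier comes from the value of k that balances the two. In the solution, we use GridSearchCV to run 5-fold cross-validation on KNN classifiers with different values of k. When the search finishes, we can read off the value of k that produces the best KNN classifier: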
# Best neighborhood size (k)
classifier.best_estimator_.get_params()["knn__n_neighbors"]
6
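The solution this discussion refers to is not part of this excerpt. The sketch below shows one way such a search could be set up, so that the `knn__n_neighbors` key has some context. The pipeline step names, the use of the iris data, and the candidate k values are assumptions chosen for illustration, not the book's exact code.

```python
from sklearn import datasets
from sklearn.model_selection import GridSearchCV
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Load data (assumed: the iris dataset)
iris = datasets.load_iris()
features, target = iris.data, iris.target

# Standardize, then classify. The step name "knn" is what makes the
# parameter key "knn__n_neighbors" valid.
pipe = Pipeline([
    ("standardizer", StandardScaler()),
    ("knn", KNeighborsClassifier()),
])

# Candidate values of k (an assumption for this sketch)
search_space = {"knn__n_neighbors": [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]}

# 5-fold cross-validation over every candidate k
classifier = GridSearchCV(pipe, search_space, cv=5).fit(features, target)

# Best neighborhood size (k)
best_k = classifier.best_estimator_.get_params()["knn__n_neighbors"]
print(best_k)
```

The value printed depends on the candidate grid, the data, and the scikit-learn version, so it may not match the value shown above.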
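15.4 Creating a Radius-Based Nearest Neighbor Classifier
Problem
Given an observation of unknown class, you need to predict its class from the classes of all observations within a certain distance.
Solution
Use RadiusNeighborsClassifier: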
# Load libraries
from sklearn.neighbors import RadiusNeighborsClassifier
from sklearn.preprocessing import StandardScaler
from sklearn import datasets

# Load data
iris = datasets.load_iris()
features = iris.data
target = iris.target ...
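The preview cuts the solution off at this point. As a rough illustration of how a radius-based classifier can be fitted and queried, here is a self-contained sketch. The radius of 0.5, the `outlier_label` of -1, and the new observation are assumptions made for this example and are not taken from the book.

```python
from sklearn import datasets
from sklearn.neighbors import RadiusNeighborsClassifier
from sklearn.preprocessing import StandardScaler

# Load data
iris = datasets.load_iris()
features = iris.data
target = iris.target

# Standardize features so that the radius means the same in every dimension
standardizer = StandardScaler()
features_standardized = standardizer.fit_transform(features)

# Train a radius-based classifier.
# radius=0.5 is an assumed value; outlier_label=-1 is the label returned
# for observations that have no training neighbors inside the radius.
rnn = RadiusNeighborsClassifier(radius=0.5, outlier_label=-1)
rnn.fit(features_standardized, target)

# A hypothetical new observation, scaled with the same standardizer
new_observation = standardizer.transform([[5.0, 3.4, 1.5, 0.2]])

# Predict its class from all neighbors within the radius
prediction = rnn.predict(new_observation)
print(prediction)
```

Unlike `KNeighborsClassifier`, the number of neighbors that vote here varies by observation, which is why a fallback label for observations with no neighbors in range is useful.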
Publisher Resources

ISBN: 9787121369629