Python机器学习手册:从数据预处理到深度学习

by Chris Albon
July 2019
Intermediate to advanced
365 pages
8h 13m
Chinese
Publishing House of Electronics Industry
Content preview from Python机器学习手册:从数据预处理到深度学习
Chapter 10: Dimensionality Reduction Using Feature Selection
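... ANOVA F-value. When a numerical feature is grouped by the classes of the target vector, this value indicates how much the feature's mean differs between the groups. For example, given a binary target vector (gender) and a numerical feature (test score), the ANOVA F-value indicates whether the average score for men is the same as the average score for women. If it is, the test score does not help us predict gender, so the feature is irrelevant to the target vector.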
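To make the gender and test-score example concrete, here is a minimal sketch using scikit-learn's `f_classif` and `SelectKBest`. The toy data are hypothetical and not from the book: column 0 plays the role of a test score whose mean differs between the two classes, while column 1 has the same mean in both classes.

```python
import numpy as np
from sklearn.feature_selection import SelectKBest, f_classif

# Hypothetical toy data (an assumption for illustration only):
# column 0 = test score, column 1 = a feature unrelated to the target
features = np.array([
    [60.0, 1.0],
    [62.0, 2.0],
    [64.0, 3.0],
    [66.0, 4.0],
    [80.0, 1.0],
    [82.0, 2.0],
    [84.0, 3.0],
    [86.0, 4.0],
])
# Binary target vector (e.g. an encoded gender label)
target = np.array([0, 0, 0, 0, 1, 1, 1, 1])

# ANOVA F-value and p-value of each feature against the target
f_values, p_values = f_classif(features, target)
print("F-values:", f_values)
print("p-values:", p_values)

# Keep only the single feature with the highest F-value
selector = SelectKBest(f_classif, k=1)
features_kbest = selector.fit_transform(features, target)
print("Selected columns:", selector.get_support())
```

Because the second column has identical group means, its F-value is zero and the selector keeps only the first column.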
10.5 Recursively Eliminating Features
Problem
You want to automatically select the best features to keep.
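Solution
Use scikit-learn's RFECV class to perform recursive feature elimination (RFE) with cross-validation (CV). This method trains the model repeatedly, removing one feature each time, until model performance (for example, accuracy) gets worse. The features that remain are the best features.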
# Load libraries
import warnings
from sklearn.datasets import make_regression
from sklearn.feature_selection import RFECV
from sklearn import datasets, linear_model

# Suppress an annoying but harmless warning
warnings.filterwarnings(action="ignore", module="scipy",
                        message="^internal gelsd")

# Generate the feature matrix, target vector, and true coefficients
features, ...
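
The listing above is cut off after `features, ...`. As a rough sketch of how a recipe of this kind could proceed (not the book's exact code), the following generates a small regression dataset, wraps a linear regression in `RFECV`, and inspects which features survive. The dataset sizes, noise level, `random_state`, scoring metric, and number of folds are all assumptions chosen for illustration.

```python
import warnings
from sklearn.datasets import make_regression
from sklearn.feature_selection import RFECV
from sklearn import linear_model

# Suppress an annoying but harmless warning
warnings.filterwarnings(action="ignore", module="scipy",
                        message="^internal gelsd")

# Assumed illustrative dataset: 10 features, only 3 of them informative
features, target = make_regression(n_samples=200,
                                   n_features=10,
                                   n_informative=3,
                                   noise=10.0,
                                   random_state=1)

# Linear regression is the model that gets retrained at every elimination step
ols = linear_model.LinearRegression()

# Drop one feature per step; score each feature count with 5-fold CV
rfecv = RFECV(estimator=ols,
              step=1,
              scoring="neg_mean_squared_error",
              cv=5)
rfecv.fit(features, target)

# Keep only the columns RFECV selected
features_reduced = rfecv.transform(features)

print("Number of features kept:", rfecv.n_features_)
print("Mask of kept features:", rfecv.support_)
print("Feature ranking (1 = kept):", rfecv.ranking_)
```

After fitting, `n_features_` reports how many features were kept, `support_` is a boolean mask over the original columns, and `ranking_` assigns 1 to every kept feature and larger numbers to features eliminated earlier.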

Publisher Resources

ISBN: 9787121369629