Python机器学习手册:从数据预处理到深度学习
by Chris Albon
July 2019
Intermediate to advanced
365 pages
8h 13m
Chinese
Publishing House of Electronics Industry
19 Clustering
19.0 Introduction
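Most of this book discusses supervised learning, where we have access to both
features and a target. Unfortunately, that is not always the case in the real
world; we frequently encounter situations where we know only the features. For
example, suppose we have sales records from a department store and want to split
them into two groups according to whether the shopper is a member of the discount
club. Supervised learning is impossible here, because we have no target with which
to train and evaluate a model. We do have another option, however: unsupervised
learning. If discount club members and nonmembers really do behave differently in
the store, then the average difference in behavior between two members will be
smaller than the average difference between a member and a nonmember. In other
words, the observations fall into two groups (clusters).[2]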
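The goal of a clustering algorithm is to find these latent groupings of
observations; done well, this lets us predict the group of an observation even
without a target vector. There are many clustering algorithms, and they take a
variety of approaches to identifying clusters in data. In this chapter we show how
to implement several clustering algorithms with scikit-learn and how to apply them
in practice.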
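The intuition behind the discount-club example can be checked numerically. The sketch below is illustrative only: the two features (visits per month, average basket value), the group means, and the sample sizes are all made-up assumptions, not data from the book. It compares the average distance between two members with the average distance between a member and a nonmember.

```python
import random
from itertools import combinations, product
from math import dist
from statistics import mean

random.seed(0)

# Hypothetical shoppers described by (visits per month, average basket value).
# The means and spreads below are invented purely for illustration.
members = [(random.gauss(8, 1), random.gauss(60, 5)) for _ in range(30)]
nonmembers = [(random.gauss(2, 1), random.gauss(20, 5)) for _ in range(30)]

# Average Euclidean distance between two shoppers who are both members.
within = mean(dist(a, b) for a, b in combinations(members, 2))

# Average Euclidean distance between a member and a nonmember.
between = mean(dist(a, b) for a, b in product(members, nonmembers))

print(f"member-member: {within:.2f}, member-nonmember: {between:.2f}")
```

When the two groups really do behave differently, the within-group average comes out smaller than the between-group average, which is the structure a clustering algorithm tries to recover without ever seeing the membership labels.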
19.1 Clustering Using K-Means
Problem
You want to group observations into k groups.
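[2] Editor's note: in this chapter, the terms “分组” (group), “分类” (class), and “聚类” (cluster) have the same meaning; all of them refer to a cluster. For the sake of fluency and readability, they were not all standardized to “聚类”.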
Solution
Use K-Means
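As a minimal sketch of what using K-Means can look like with scikit-learn (the library this chapter names), the example below clusters a small synthetic feature matrix into k = 2 groups. The data, the feature meanings, and the parameter choices (`n_init=10`, `random_state=0`, standardizing first) are assumptions made for illustration; they are not taken from the book's own solution.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.preprocessing import StandardScaler

# Hypothetical feature matrix: two well-separated groups of shoppers,
# described by (visits per month, average basket value).
rng = np.random.default_rng(0)
group_a = rng.normal(loc=[2.0, 20.0], scale=[0.5, 3.0], size=(50, 2))
group_b = rng.normal(loc=[8.0, 60.0], scale=[0.5, 3.0], size=(50, 2))
features = np.vstack([group_a, group_b])

# Standardize so both features contribute comparably to the distances.
features_std = StandardScaler().fit_transform(features)

# K-Means needs the number of clusters up front; here k = 2.
model = KMeans(n_clusters=2, n_init=10, random_state=0)
model.fit(features_std)

labels = model.labels_            # cluster index assigned to each observation
centers = model.cluster_centers_  # one row per cluster, in standardized units

# Assign a new (hypothetical) observation to the nearest cluster center.
new_observation = np.array([[7.5, 58.0]])
new_label = model.predict(StandardScaler().fit(features).transform(new_observation))
```

Note that the cluster indices themselves (0 or 1) are arbitrary: K-Means recovers the grouping, not the meaning of each group, so any interpretation such as "member" or "nonmember" has to come from inspecting the clusters afterward.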
Publisher Resources

ISBN: 9787121369629