Hands-On Machine Learning with Scikit-Learn, Keras, and TensorFlow, 2nd Edition (Chinese edition: 机器学习实战：基于Scikit-Learn、Keras 和TensorFlow（原书第2 版）)

by Aurélien Géron

October 2020

Intermediate to advanced

693 pages

16h 26m

Chinese

China Machine Press

Chapter 3

Classification

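Chapter 1 mentioned that the most common supervised learning tasks are regression (predicting values) and classification (predicting classes). Chapter 2 explored a regression task, predicting housing prices, using various algorithms such as linear regression, decision trees, and random forests (all of which are explained further in later chapters). This chapter turns to classification systems.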

3.1 MNIST

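This chapter uses the MNIST dataset, a set of 70,000 images of digits handwritten by US high school students and Census Bureau employees. Each image is labeled with the digit it represents. The dataset is so widely used that it is often called the "Hello World" of machine learning: whenever someone comes up with a new classification algorithm, they want to see how it performs on MNIST. Anyone who studies machine learning will face MNIST sooner or later.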

Scikit-Learn provides many helper functions for downloading popular datasets, and MNIST is one of them. The following code fetches the MNIST dataset (note 1):

>>> from sklearn.datasets import fetch_openml
>>> mnist = fetch_openml('mnist_784', version=1)
>>> mnist.keys()
dict_keys(['data', 'target', 'feature_names', 'DESCR', 'details',
           'categories', 'url'])

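Datasets loaded by Scikit-Learn generally have a similar dictionary structure, including:

• A DESCR key, describing the dataset.

• A data key, containing an array with one row per instance and one column per feature. ...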
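As a minimal sketch of this dictionary-like structure, the example below inspects a dataset object and pulls out its feature array and labels. It is an illustration under stated assumptions, not the book's code: it uses Scikit-Learn's small bundled digits dataset (load_digits) as a stand-in, because that dataset ships with the library and needs no download. It is not MNIST, and its images are 8×8 pixels. The variable names X and y are chosen here for illustration.

```python
from sklearn.datasets import load_digits

# Assumption: load_digits is a stand-in for MNIST so the example runs offline.
digits = load_digits()

# The returned object behaves like a dictionary.
print(sorted(digits.keys()))

# "data" holds one row per instance and one column per feature.
X = digits["data"]
# "target" holds one label per instance.
y = digits["target"]

print(X.shape)  # (number of instances, number of features)
print(y.shape)  # (number of instances,)

# "DESCR" is a plain-text description of the dataset.
print(digits["DESCR"][:200])
```

The same key-based access (for example mnist["data"] and mnist["target"]) should apply to the object returned by fetch_openml, although the exact set of keys and the container type of the data can differ between Scikit-Learn versions.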

Publisher Resources

ISBN: 9787111665977