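机器学习实战：基于Scikit-Learn、Keras和TensorFlow（原书第2版） (Chinese translation of Hands-On Machine Learning with Scikit-Learn, Keras, and TensorFlow, 2nd Edition)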

by Aurélien Géron

October 2020

Intermediate to advanced

693 pages

Chinese

China Machine Press

Chapter 6

Decision Trees

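Like SVMs, decision trees are versatile machine learning algorithms that can perform both classification and regression tasks, and even multioutput tasks. They are powerful algorithms, capable of fitting complex datasets. For example, in Chapter 2 you trained a DecisionTreeRegressor model on the California housing dataset, fitting it perfectly (actually, overfitting it).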
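The overfitting behavior mentioned above can be illustrated with a minimal sketch. It does not reproduce the Chapter 2 experiment: the data here is a synthetic noisy quadratic (an assumption chosen so the example needs no download), and the split sizes and random seeds are arbitrary.

```python
import numpy as np
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeRegressor

# Synthetic stand-in data (an assumption, not the California housing set):
# a noisy quadratic function of a single feature.
rng = np.random.default_rng(42)
X = rng.uniform(-3, 3, size=(400, 1))
y = 0.5 * X[:, 0] ** 2 + rng.normal(scale=1.0, size=400)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=42
)

# No depth limit: the tree keeps splitting until each leaf holds
# a single training target value.
tree_reg = DecisionTreeRegressor(random_state=42)
tree_reg.fit(X_train, y_train)

train_mse = mean_squared_error(y_train, tree_reg.predict(X_train))
test_mse = mean_squared_error(y_test, tree_reg.predict(X_test))
print(f"train MSE: {train_mse:.4f}")
print(f"test MSE:  {test_mse:.4f}")
```

An unconstrained tree memorizes the training targets, so the training error is essentially zero while the error on held-out data stays well above it. That gap is the signature of overfitting.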

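Decision trees are also the fundamental components of random forests (see Chapter 7), which are among the most powerful machine learning algorithms available today.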

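In this chapter we will start by discussing how to train, visualize, and make predictions with decision trees. Then we will go through the CART training algorithm used by Scikit-Learn, and we will discuss how to regularize trees and use them for regression tasks. Finally, we will discuss some of the limitations of decision trees.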

6.1 Training and Visualizing a Decision Tree

To understand decision trees, let's build one and take a look at how it makes predictions. The following code trains a DecisionTreeClassifier on the iris dataset (see Chapter 4):

from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier

iris = load_iris()
X = iris.data[:, 2:]  # petal length and width
y = iris.target

tree_clf = DecisionTreeClassifier(max_depth=2)
tree_clf.fit(X, y)
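As a quick check of how a tree like this makes predictions, the sketch below refits the same classifier and queries it for one flower. The sample measurements and the random_state value are illustrative assumptions, not part of the code above.

```python
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier

iris = load_iris()
X = iris.data[:, 2:]  # petal length and width
y = iris.target

# random_state is an added assumption so that repeated runs build the same tree.
tree_clf = DecisionTreeClassifier(max_depth=2, random_state=42)
tree_clf.fit(X, y)

# Hypothetical flower: petals 5 cm long and 1.5 cm wide.
sample = [[5.0, 1.5]]

# Class proportions among the training instances in the leaf the sample reaches.
proba = tree_clf.predict_proba(sample)
# Index of the class with the highest estimated probability.
predicted = tree_clf.predict(sample)

print(proba.round(3))
print(iris.target_names[predicted[0]])
```

The probabilities are simply the class ratios of the training instances in the leaf, so every sample that lands in the same leaf receives the same estimate.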

To visualize the decision tree, first use export_graphviz() ...
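The preview cuts off here, so the following is only a sketch of one way export_graphviz() can be called, not necessarily the call the book uses. Passing out_file=None makes the function return the Graphviz DOT source as a string; the styling options and the random_state value are assumptions chosen for illustration.

```python
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier, export_graphviz

iris = load_iris()
X = iris.data[:, 2:]  # petal length and width
y = iris.target

# random_state is an added assumption so that repeated runs build the same tree.
tree_clf = DecisionTreeClassifier(max_depth=2, random_state=42).fit(X, y)

# With out_file=None the DOT description is returned instead of written to disk.
dot_source = export_graphviz(
    tree_clf,
    out_file=None,
    feature_names=list(iris.feature_names[2:]),
    class_names=list(iris.target_names),
    rounded=True,  # draw nodes with rounded corners
    filled=True,   # color nodes by majority class
)

print(dot_source[:200])
```

To render an image, the string can be saved to a file (the name iris_tree.dot is hypothetical) and converted with the Graphviz command-line tool, for example dot -Tpng iris_tree.dot -o iris_tree.png, assuming Graphviz is installed.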


Publisher Resources

ISBN: 9787111665977