R for Data Science, 2nd Edition

by Hadley Wickham, Mine Cetinkaya-Rundel, Garrett Grolemund
May 2025
Intermediate to advanced
578 pages
8h 9m
Chinese
O'Reilly Media, Inc.

Chapter 11. Communication

This work has been translated using AI. We welcome your feedback and comments: translation-feedback@oreilly.com

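Introduction

In Chapter 10, you learned how to use plots as tools for exploration. When you make an exploratory plot, you know which variables it will display even before you look at it. Each plot is made for a purpose: you look at it quickly and then move on to the next one. In the course of most analyses, you will produce tens or hundreds of plots, most of which are immediately thrown away.

Now that you understand your data, you need to communicate that understanding to others. Your audience will likely not share your background knowledge and will not be deeply invested in the data. To help others quickly build a good mental model of the data, you need to invest considerable effort in making your plots as self-explanatory as possible. In this chapter, you will learn some of the tools that ggplot2 provides for doing so.

This chapter focuses on the tools you need to create good graphics. We assume that you know what you want and just need to know how to do it. For that reason, we highly recommend pairing this chapter with a good general visualization book. We particularly like The Truthful Art by Alberto Cairo (New Riders). It does not teach the mechanics of creating visualizations; instead, it focuses on what you need to think about in order to create effective graphics.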

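Prerequisites

In this chapter, we will focus once again on ggplot2. We will also use a little dplyr for data manipulation; scales to override the default breaks, labels, transformations, and palettes; and a few ggplot2 extension packages, including ggrepel by Kamil Slowikowski and patchwork by Thomas Lin Pedersen. Don't forget that you need to install these packages with install.packages() if you don't already have them.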

library(tidyverse)
library(scales)
library(ggrepel)
library(patchwork)
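
If you are not sure which of these packages are already installed, a quick check can help before calling install.packages(). The following is only a minimal sketch: the names pkgs, is_installed, and missing_pkgs are illustrative and do not come from the book, and the install call is left commented out so that nothing is downloaded unless you choose to run it.

```r
# Packages that this chapter loads with library().
pkgs <- c("tidyverse", "scales", "ggrepel", "patchwork")

# requireNamespace() returns FALSE when a package cannot be loaded,
# which usually means it is not installed. It does not attach the package.
is_installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
missing_pkgs <- pkgs[!is_installed]

if (length(missing_pkgs) > 0) {
  message("Not installed: ", paste(missing_pkgs, collapse = ", "))
  # Uncomment to install them from your configured CRAN mirror:
  # install.packages(missing_pkgs)
}
```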

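Labels

The easiest place to start when turning an exploratory graphic into an expository graphic is with good labels. You add labels with the labs() function: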

ggplot(mpg, aes(x = displ, y = hwy)) +
  geom_point(aes(color = class)) +
  geom_smooth(se = FALSE) +
  labs(
    x = "Engine displacement (L)",
    y = "Highway fuel economy (mpg)",
    color = "Car type",
    title = "Fuel efficiency generally decreases with engine size",
    subtitle = "Two seaters (sports cars) are an exception because of their light weight",
    caption = "Data from fueleconomy.gov"
  )
[Figure: Scatterplot of highway fuel economy versus engine displacement, with points colored by car class and a smooth curve overlaid. The axes, legend, title, subtitle, and caption show the text passed to labs().]

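The purpose of a plot title is to summarize the main finding. Avoid titles that merely describe what the plot is, such as "A scatterplot of engine displacement vs. fuel economy".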
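
To make the contrast concrete, the sketch below builds the same scatterplot twice, once with a title that only restates what the plot is and once with a title that states the finding. The object names base_plot, descriptive, and finding are illustrative and do not come from the book.

```r
library(ggplot2)

# Shared base plot: engine displacement against highway fuel economy.
base_plot <- ggplot(mpg, aes(x = displ, y = hwy)) +
  geom_point(aes(color = class))

# A title that only restates what the plot is (the style to avoid).
descriptive <- base_plot +
  labs(title = "A scatterplot of engine displacement vs. fuel economy")

# A title that states the main finding instead.
finding <- base_plot +
  labs(title = "Fuel efficiency generally decreases with engine size")
```

Printing descriptive or finding at the console renders the corresponding plot. Only the second title tells the reader what to take away from it.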

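If you need to add more text, there are two other useful labels: subtitle adds additional detail in a smaller font beneath the title, and caption adds text at the bottom right of the plot, often to describe the source of the data. You can also use labs() to replace the axis and legend titles. It is usually a good idea to replace short variable names with more detailed descriptions and to include the units. ...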

Publisher Resources

ISBN: 9798341657304