R for Data Science, 2nd Edition

by Hadley Wickham, Mine Cetinkaya-Rundel, Garrett Grolemund
May 2025
Intermediate to advanced
578 pages
8h 9m
Chinese
O'Reilly Media, Inc.

Chapter 1. Data Visualization

This work has been translated using AI. Your feedback and comments are welcome: translation-feedback@oreilly.com

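Introduction

"The simple graph has brought more information to the data analyst's mind than any other device." (John Tukey)

ggplot2 implements the grammar of graphics, a coherent system for describing and building graphs. With ggplot2, you can do more, and do it faster, by learning one system and applying it in many places.

This chapter will teach you how to visualize your data using ggplot2. We will start by creating a simple scatterplot and use it to introduce aesthetic mappings and geometric objects, the fundamental building blocks of ggplot2. We will then walk you through visualizing distributions of single variables, as well as visualizing relationships between two or more variables. We will finish with saving your plots and troubleshooting tips.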

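Prerequisites

This chapter focuses on ggplot2, one of the core packages in the tidyverse. To access the datasets, help pages, and functions used in this chapter, load the tidyverse by running: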

library(tidyverse)
#> ── Attaching core tidyverse packages ───────────────────── tidyverse 2.0.0 ──
#> ✔ dplyr     1.1.0.9000     ✔ readr     2.1.4     
#> ✔ forcats   1.0.0          ✔ stringr   1.5.0     
#> ✔ ggplot2   3.4.1          ✔ tibble    3.1.8     
#> ✔ lubridate 1.9.2          ✔ tidyr     1.3.0     
#> ✔ purrr     1.0.1          
#> ── Conflicts ─────────────────────────────────────── tidyverse_conflicts() ──
#> ✖ dplyr::filter() masks stats::filter()
#> ✖ dplyr::lag()    masks stats::lag()
#> ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all 
#>   conflicts to become errors

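That one line of code loads the core tidyverse, the packages that you will use in almost every data analysis. It also tells you which functions from the tidyverse conflict with functions in base R (or with functions from other packages you might have loaded).

If you run this code and get the error message there is no package called 'tidyverse', you will need to install it first, then run library() once again: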
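As a small illustration of the conflict report, the sketch below calls each version of filter() with an explicit package prefix, so it does not matter which one is masked. The mtcars data frame and the moving-average weights are illustrative choices, not part of the original text.

```r
library(dplyr)

# dplyr::filter() keeps the rows of a data frame that match a condition.
four_cyl <- dplyr::filter(mtcars, cyl == 4)
nrow(four_cyl)

# stats::filter() is still reachable through its namespace; here it computes
# a centered 3-point moving average over a numeric vector.
smoothed <- stats::filter(1:5, rep(1 / 3, 3))
as.numeric(smoothed)
```

Writing the prefix is only needed when you want the masked function; a plain filter() call after loading the tidyverse refers to the dplyr version.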

install.packages("tidyverse")
library(tidyverse)

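You only need to install a package once, but you need to load it every time you start a new session.

In addition to the tidyverse, we will also use the palmerpenguins package, which includes the penguins dataset containing body measurements for penguins on three islands in the Palmer Archipelago, and the ggthemes package, which offers a colorblind safe color palette.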

library(palmerpenguins)
library(ggthemes)
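Before plotting, it can help to check what the dataset contains. The following is a minimal sketch using only base R inspection functions on the penguins data frame from palmerpenguins; the choice of which properties to inspect is ours, not the book's.

```r
library(palmerpenguins)

# Size of the data frame: rows are penguins, columns are variables.
dim(penguins)

# Names of the recorded variables.
names(penguins)

# The species and islands are stored as factors.
levels(penguins$species)
levels(penguins$island)
```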

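First Steps

Do penguins with longer flippers weigh more or less than penguins with shorter flippers? You probably already have an answer, but try to make it precise. What does the relationship between flipper length and body mass look like? Is it positive? Negative? Linear? Nonlinear? Does the relationship vary by the species of the penguin? How about by the island where the penguin lives? Let's create visualizations that we can use to answer these questions.

The penguins Data Frame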
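One way to start on these questions is a scatterplot of body mass against flipper length, colored by species. This is a minimal sketch rather than the chapter's own worked example: it assumes the palmerpenguins column names flipper_length_mm, body_mass_g, and species, and the object name p and the added correlation check are illustrative.

```r
library(ggplot2)
library(palmerpenguins)
library(ggthemes)

# Map flipper length to x, body mass to y, and species to color,
# then draw one point per penguin.
p <- ggplot(
  data = penguins,
  mapping = aes(x = flipper_length_mm, y = body_mass_g, color = species)
) +
  geom_point(na.rm = TRUE) +  # silently drop rows with missing measurements
  scale_color_colorblind()    # colorblind safe palette from ggthemes

p  # printing the object draws the plot

# A numeric companion to the plot: the linear correlation between the two
# variables, computed on complete observations only.
r <- cor(penguins$flipper_length_mm, penguins$body_mass_g, use = "complete.obs")
r
```

Replacing color = species with color = island in aes() gives the corresponding view for the island question.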

Publisher Resources

ISBN: 9798341657304