R for Data Science, 2nd Edition

by Hadley Wickham, Mine Cetinkaya-Rundel, Garrett Grolemund
May 2025
Intermediate to advanced
578 pages
8h 9m
Chinese
O'Reilly Media, Inc.

Chapter 20. Spreadsheets

This work has been translated using AI. Feedback and comments are welcome: translation-feedback@oreilly.com

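Introduction

In Chapter 7 you learned how to import data from plain-text files such as .csv and .tsv. Now it's time to learn how to get data out of a spreadsheet, either an Excel spreadsheet or a Google Sheet. This chapter builds on much of what you learned in Chapter 7, but we will also discuss additional considerations and complexities that arise when working with spreadsheet data.

If you or your collaborators use spreadsheets to organize data, we strongly recommend reading the paper "Data Organization in Spreadsheets" by Karl Broman and Kara Woo. The best practices presented in this paper will save you a lot of trouble when you import data from a spreadsheet into R for analysis and visualization.

Excel

Microsoft Excel is a widely used spreadsheet program in which data are organized in worksheets inside spreadsheet files.

Prerequisites

In this section, you'll learn how to load data from Excel spreadsheets into R with the readxl package. This package is a non-core tidyverse package, so you need to load it explicitly, but it is installed automatically when you install the tidyverse package. Later, we'll also use the writexl package, which allows us to create Excel spreadsheets.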

library(readxl)
library(tidyverse)
library(writexl)

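Getting Started

Most of readxl's functions allow you to load Excel spreadsheets into R:

  • read_xls() reads Excel files in the XLS format.
  • read_xlsx() reads Excel files in the XLSX format.
  • read_excel() can read files in both the XLS and XLSX formats. It guesses the file type based on the input.

These functions have syntax similar to that of the functions we introduced earlier for reading other types of files, such as read_csv() and read_table(). For the rest of the chapter, we will focus on using read_excel().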
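
The relationship between the three readers can be sketched with the sample workbooks that ship with readxl. This is an illustrative check rather than part of the chapter's own example: it assumes the bundled files datasets.xlsx and datasets.xls, located with readxl_example(), and uses excel_format() to show which format readxl detects.

```r
library(readxl)

# Paths to sample workbooks bundled with readxl (not the chapter's data)
xlsx_path <- readxl_example("datasets.xlsx")
xls_path  <- readxl_example("datasets.xls")

# excel_format() reports the format that readxl detects for each file
excel_format(xlsx_path)
excel_format(xls_path)

# read_excel() picks the appropriate reader for each file
from_xlsx <- read_excel(xlsx_path)
from_xls  <- read_excel(xls_path)

# The format-specific readers should return the same data
same_xlsx <- identical(from_xlsx, read_xlsx(xlsx_path))
same_xls  <- identical(from_xls, read_xls(xls_path))
```

If you already know the format of a file, calling read_xlsx() or read_xls() directly skips the guessing step; otherwise read_excel() is usually the more convenient choice.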

Reading Excel Spreadsheets

Figure 20-1 shows what the spreadsheet we're going to read into R looks like in Excel.

Figure 20-1. The spreadsheet called students.xlsx in Excel. It contains information on 6 students: their ID, full name, favorite food, meal plan, and age.

The first argument to read_excel() is the path to the file to read.

students <- read_excel("data/students.xlsx")

read_excel() will read the file in as a tibble.

students
#> # A tibble: 6 × 5
#>   `Student ID` `Full Name`      favourite.food     mealPlan            AGE  
#>          <dbl> <chr>            <chr>              <chr>               <chr>
#> 1            1 Sunil Huffmann   Strawberry yoghurt Lunch only          4    
#> 2            2 Barclay Lynn     French fries       Lunch only          5    
#> 3            3 Jayendra Lyne    N/A                Breakfast and lunch 7    
#> 4            4 Leon Rossini     Anchovies          Lunch only          <NA> 
#> 5 5 Chidiegwu Dunkel Pizza ...
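
If you don't have data/students.xlsx at hand, the same behavior can be tried with a small round trip. The sketch below is an assumption-laden stand-in, not the chapter's data: the toy data frame and its column names are hypothetical, and the file is written to a temporary path with write_xlsx() from writexl before being read back with read_excel().

```r
library(readxl)
library(writexl)
library(tibble)

# Hypothetical stand-in data; NOT the contents of students.xlsx
toy <- data.frame(
  id   = c(1, 2, 3),
  name = c("A", "B", "C"),
  age  = c("4", "five", NA)  # mixed entries, so the column is stored as text
)

# Write the data to a temporary .xlsx file, then read it back
path <- tempfile(fileext = ".xlsx")
write_xlsx(toy, path)
toy_in <- read_excel(path)

# The result is a tibble; inspect the guessed column types
is_tibble(toy_in)
sapply(toy_in, class)
```

As in the students output above, a column that mixes numbers and text comes back as a character column, and the empty cell is read as NA.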

ISBN: 9798341657304