R for Data Science, 2nd Edition

by Hadley Wickham, Mine Cetinkaya-Rundel, Garrett Grolemund
May 2025
Intermediate to advanced
578 pages
8h 9m
Chinese
O'Reilly Media, Inc.

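Part IV. Import

In this part of the book, you'll learn how to import a wider range of data into R, and how to get it into a form that is useful for analysis. Sometimes this is just a matter of calling a function from the appropriate data import package. In more complex cases, though, the data may need both tidying and transformation before you arrive at the tidy rectangle you want to work with.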

Our data science model with import highlighted in blue.
Figure IV-1. Data import is the beginning of the data science process; without data, you can't do data science!

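In this part of the book, you'll learn how to access data stored in the following ways:

  • In Chapter 20, you'll learn how to import data from Excel spreadsheets and Google Sheets.

  • In Chapter 21, you'll learn how to get data out of a database and into R (and also how to get data out of R and into a database).

  • In Chapter 22, you'll learn about Arrow, a powerful tool for working with out-of-memory data, particularly data stored in the parquet format.

  • In Chapter 23, you'll learn how to work with hierarchical data, including the deeply nested lists produced by data stored in the JSON format.

  • In Chapter 24, you'll learn web scraping, the art and science of extracting data from web pages.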

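There are two important tidyverse packages that we don't discuss here: haven and xml2. If you're working with data from SPSS, Stata, or SAS files, check out the haven package. If you're working with XML data, check out the xml2 package. For any other format, you'll need to do some research to figure out which package to use; Google is your friend here.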


Publisher Resources

ISBN: 9798341657304