高效R语言编程

by Colin Gillespie, Robin Lovelace
August 2018
Intermediate to advanced
227 pages
4h 16m
Chinese
China Electric Power Press Ltd.
Efficient input/output
... way to handle big data is to use a database; this is covered in the "Working with databases" section of Chapter 6.
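Binary file formats
Plain-text files have limitations: even the trusty CSV format is "restricted to tabular data, lacks type-safety, and has limited precision for numeric values" (Eddelbuettel, Stokely, and Ooms 2016). Once you have read in the raw data (for example, from a plain-text file) and tidied it (see the next chapter), the next step is usually to save it for future use. Saving immediately after tidying avoids having to tidy the data from scratch every time. We recommend saving large tidied datasets in a binary format, which reduces read/write times as well as file sizes and makes your data more portable.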
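Unlike plain-text files, data saved in a binary format cannot be read by humans. This allows the data to be compressed to save space, but it also means that the files are tied to R. External programs such as Python or LibreOffice Calc have difficulty loading R's native file formats, such as .Rds.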
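As a minimal sketch of the compression point above, the following base-R snippet writes the same data frame to .Rds with and without compression and compares the resulting file sizes. The data frame `df` and the temporary file paths are hypothetical, invented for illustration; `compress` is a documented argument of `saveRDS()`.

```r
# Hypothetical data: a repetitive data frame that compresses well
df <- data.frame(
  id    = rep(1:1000, times = 100),
  group = rep(c("a", "b", "c", "d"), times = 25000)
)

# Temporary files, so nothing is written into the working directory
f_gzip <- tempfile(fileext = ".Rds")
f_none <- tempfile(fileext = ".Rds")

saveRDS(df, f_gzip)                    # default: compress = TRUE (gzip)
saveRDS(df, f_none, compress = FALSE)  # same data, no compression

# File sizes in bytes
size_gzip <- file.size(f_gzip)
size_none <- file.size(f_none)
print(c(gzip = size_gzip, none = size_none))

# The compressed file still restores the same object
df_back <- readRDS(f_gzip)
```

How much compression helps depends on the data: repetitive columns like these shrink considerably, while high-entropy numeric data shrinks far less.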
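This section provides an overview of binary formats in R, with benchmarks comparing binary files against the plain-text .csv format introduced in the previous section.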
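The kind of comparison described here can be sketched with base R alone. This is not the book's benchmark: the data frame `df`, its size, and the use of `system.time()` are assumptions chosen for illustration. Timings vary by machine, so no expected output is shown.

```r
set.seed(1)

# Hypothetical data set: two double columns and one integer column
n  <- 2e5
df <- data.frame(x = rnorm(n), y = rnorm(n),
                 z = sample(1:10, n, replace = TRUE))

f_csv <- tempfile(fileext = ".csv")
f_rds <- tempfile(fileext = ".Rds")

# Time writing and reading in each format
t_write_csv <- system.time(write.csv(df, f_csv, row.names = FALSE))
t_write_rds <- system.time(saveRDS(df, f_rds))
t_read_csv  <- system.time(df_csv <- read.csv(f_csv))
t_read_rds  <- system.time(df_rds <- readRDS(f_rds))

# Elapsed seconds per format and operation
timings <- rbind(
  csv = c(write = t_write_csv[["elapsed"]], read = t_read_csv[["elapsed"]]),
  rds = c(write = t_write_rds[["elapsed"]], read = t_read_rds[["elapsed"]])
)
print(timings)

# File sizes in bytes
sizes <- c(csv = file.size(f_csv), rds = file.size(f_rds))
print(sizes)

# The .Rds round trip is exact; the CSV round trip is compared with
# all.equal(), which allows a small numeric tolerance
rds_exact <- identical(df, df_rds)
csv_close <- isTRUE(all.equal(df, df_csv))
```

For repeatable measurements, a dedicated benchmarking package that runs each expression several times would be a better choice than a single `system.time()` call.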
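R's native binary formats: Rdata and Rds
.Rds and .RData are R's native binary file formats. Both are optimized for speed and compression ratio. But how do they differ? The following code demonstrates the main differences between them: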
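save(df_co2, file = "extdata/co2.RData")   # saves the object together with its name
saveRDS(df_co2, "extdata/co2.Rds")         # saves a single object, without its name
load("extdata/co2.RData")                  # restores df_co2 under its original name
df_co2_rds = readRDS("extdata/co2.Rds")    # returns the object, to be assigned to any name
...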
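A self-contained sketch of the same contrast follows. It assumes a small stand-in data frame (the name `df_co2` is reused from the listing above, but its contents here are invented) and temporary files instead of the `extdata/` directory.

```r
# Stand-in data; the real df_co2 is not shown in this excerpt
df_co2   <- data.frame(year = 2000:2004, value = c(1.5, 2.5, 3.5, 4.5, 5.5))
original <- df_co2

f_rdata <- tempfile(fileext = ".RData")
f_rds   <- tempfile(fileext = ".Rds")

save(df_co2, file = f_rdata)  # stores the object and its name
saveRDS(df_co2, f_rds)        # stores a single object, no name attached

rm(df_co2)                    # remove it to show what each reader restores

# load() recreates `df_co2` in the current environment and
# (invisibly) returns the names of the objects it restored
restored_names <- load(f_rdata)
print(restored_names)

# readRDS() returns the object itself, so the caller picks the name
df_co2_rds <- readRDS(f_rds)
```

Because `load()` restores objects under their saved names, it replaces any existing object with the same name; `readRDS()` leaves that choice to the caller.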



Publisher Resources

ISBN: 9787519820855