© Thomas Mailund 2022
T. Mailund, Beginning Data Science in R 4, https://doi.org/10.1007/978-1-4842-8155-0_5

5. Working with Large Data Sets

Thomas Mailund
Aarhus, Denmark

The concept of Big Data refers to enormous data sets, of sizes where you need data warehouses to store them and where you typically need sophisticated algorithms and distributed computations to get anywhere with them. At the very least, we are talking about many gigabytes of data, but often terabytes or exabytes.

Dealing with Big Data is also part of data science, but it is beyond the scope of this book. This chapter is about large data sets and how to deal with data that slows down your analysis, but it is not about data sets so large that you cannot analyze them on your desktop ...
