The concept of Big Data refers to very large datasets: sets of such size that you need data warehouses to store them, typically need sophisticated algorithms to handle them, and need distributed computations to get anywhere with them. At the very least, we are talking about many gigabytes of data, and often terabytes or exabytes.
Dealing with Big Data is also part of data science, but it is beyond the scope of this book. This chapter is on large datasets and how to deal with data that slows down your analysis, but it is not about datasets so large that you cannot analyze them ...