Chapter 2

Exploring Data Engineering Pipelines and Infrastructure

In This Chapter

Defining big data

Looking at some sources of big data

Distinguishing between data science and data engineering

Exploring solutions for big data problems

Checking out a real-world data engineering project

There’s a lot of hype around big data these days, but most people don’t really understand what it is or how they can use it to improve their lives and livelihoods. This chapter defines the term big data, explains where big data comes from and how it’s used, and outlines the roles that data engineers and data scientists play in the big data ecosystem. Along the way, I introduce the fundamental big data concepts that you need in order to start generating your own ideas and plans for using big data and data science to improve your life and business workflow.

Defining Big Data by Its Four Vs

Big data is data that exceeds the processing capacity of conventional database systems because ...

