Chapter 2
Exploring Data Engineering Pipelines and Infrastructure
IN THIS CHAPTER
Defining big data
Looking at some sources of big data
Distinguishing between data science and data engineering
Hammering down on Hadoop
Exploring solutions for big data problems
Checking out a real-world data engineering project
There’s a lot of hype around big data these days, but most people don’t really understand what it is or how they can use it to improve their lives and livelihoods. This chapter defines the term big data, explains where big data comes from and how it’s used, and outlines the roles that data engineers and data scientists play in the big data ecosystem. I also introduce the fundamental big data concepts that you need in order to start generating your own ideas and plans for using big data and data science to improve your lifestyle and business workflow ...