CHAPTER 2
Introduction to Data Engineering
As organizations started working with more and more data, they ran into some big challenges, such as how to scale their data systems, keep the data clean and reliable, and turn their raw data into something useful for analytics, business insights, or machine learning initiatives. Behind all of these challenges was one common question: How can we actually collect, store, process, and manage all this data efficiently?
In the last chapter, we looked at how data engineering is helping the healthcare industry become more efficient. In this chapter, we’re going to dig deeper into how data engineering really works, what the main building blocks are, and how the systems behind the scenes are put together.
WHAT YOU WILL LEARN IN THIS CHAPTER:
- The definition of data engineering and its evolution
- Data engineering explained using an oil refinery model
- The role of a data engineer in an organization
- An overview of the data engineering life cycle
- Navigating project requirements and stakeholders, and delivering business value as a data engineer
- The current state and importance of data engineering
Data engineering can be defined in many ways, and these definitions reflect the diverse experiences and viewpoints of various professionals in the industry. This variety in definitions makes sense because data engineering is a complex field with many different aspects.
By weaving these definitions together, we can see some similarities. Data engineering can be defined as the ...