Learning to tackle and optimize data engineering problems can be challenging because each problem can take on many dimensions. At the outset of each new problem, you must think about data discovery, wrangling, ingestion, transformation, and data accountability. Data accountability is an umbrella term that covers data contracts (strictly defined data definitions) as well as the need to optimize the data ingestion footprint, since data at scale can easily eat into operational costs. There are additional ...
9. A Gentle Introduction to Stream Processing
Get Modern Data Engineering with Apache Spark: A Hands-On Guide for Building Mission-Critical Streaming Applications now with the O’Reilly learning platform.