S. Haines, Modern Data Engineering with Apache Spark, https://doi.org/10.1007/978-1-4842-7452-1_7

7. Data Pipelines and Structured Spark Applications

Scott Haines, San Jose, CA, USA

A central processing paradigm sits behind the scenes and can help connect just about everything you build as a data engineer. Known as the data pipeline, it is both a physical and a mental model for moving and processing data effectively. We first touched on the data pipeline in Chapter 1, while introducing the history and common components driving the modern data stack. This chapter will teach you how to write, test, and compile reliable ...


