Data Engineering with Databricks Cookbook

by Pulkit Chadha
May 2024
Beginner to intermediate
438 pages
9h 41m
English
Packt Publishing

4

Ingesting Streaming Data

Apache Spark Structured Streaming is a stream processing engine, built on the Spark SQL engine, that handles large-scale data streams reliably. You write a streaming computation with the same syntax you would use for a batch computation on static data. The Spark SQL engine then runs that computation incrementally and continuously, keeping the final result up to date as new streaming data arrives. The system also ensures end-to-end fault tolerance through checkpointing and write-ahead logs.
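To make the incremental model concrete, here is a toy, pure-Python sketch of the idea. It is not Spark's API or implementation: `count_by_key`, `run_stream`, and the JSON checkpoint file are hypothetical names chosen for illustration. The same aggregation function serves both the batch case and the micro-batch case, and a checkpoint written after each micro-batch lets a restarted run resume where the previous one stopped.

```python
import json
import os
import tempfile
from collections import Counter


def count_by_key(rows, totals=None):
    """Fold rows into per-key counts.

    Called with totals=None this is the plain batch computation;
    called with existing totals it updates them incrementally.
    """
    totals = Counter() if totals is None else totals
    for key in rows:
        totals[key] += 1
    return totals


def run_stream(source, checkpoint_path, batch_size=2, max_batches=None):
    """Process `source` in micro-batches, checkpointing after each one.

    Assumption: `source` is a replayable list and the checkpoint is a
    small JSON file holding the next offset and the running totals.
    """
    offset, totals = 0, Counter()
    if os.path.exists(checkpoint_path):
        # Recover progress from a previous (possibly interrupted) run.
        with open(checkpoint_path) as f:
            saved = json.load(f)
        offset, totals = saved["offset"], Counter(saved["totals"])

    batches = 0
    while offset < len(source) and (max_batches is None or batches < max_batches):
        batch = source[offset:offset + batch_size]
        count_by_key(batch, totals)  # same logic as the batch computation
        offset += len(batch)

        # Write to a temporary file, then rename, so a crash never
        # leaves a half-written checkpoint behind.
        tmp_path = checkpoint_path + ".tmp"
        with open(tmp_path, "w") as f:
            json.dump({"offset": offset, "totals": totals}, f)
        os.replace(tmp_path, checkpoint_path)
        batches += 1
    return totals
```

Calling `run_stream(events, path, max_batches=1)` and then `run_stream(events, path)` against the same checkpoint file simulates a stop and restart; the final totals match `count_by_key(events)` run once over the full list. Spark's real checkpointing tracks source offsets and state in a far more involved way, so treat this only as a mental model.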

Apache Spark Structured Streaming is favored for real-time data processing due to its high-level, unified API that seamlessly ...


ISBN: 9781837633357