Data Engineering with Databricks Cookbook

by Pulkit Chadha
May 2024
Beginner to intermediate
438 pages
9h 41m
English
Packt Publishing

5

Processing Streaming Data

Streaming data is data that is continuously generated and updated in real time, such as sensor readings, weblogs, social media posts, and online transactions. It can provide valuable insights into the current state and trends of domains such as e-commerce, finance, health care, gaming, and the Internet of Things (IoT). However, streaming data also poses challenges for ingestion and processing, including scalability, reliability, fault tolerance, latency, and consistency.

Apache Spark is a popular open-source framework for large-scale distributed data processing. Apache Spark Structured Streaming is an extension of Spark SQL that enables scalable and fault-tolerant processing of streaming ...

ISBN: 9781837633357