Data Engineering with Apache Spark, Delta Lake, and Lakehouse

by Manoj Kukreja
October 2021
Content level: Beginner to intermediate
480 pages
9h 18m
English
Packt Publishing

Overview

Data Engineering with Apache Spark, Delta Lake, and Lakehouse is a comprehensive guide packed with practical knowledge for building robust and scalable data pipelines. Throughout this book, you will explore the core concepts and applications of Apache Spark and Delta Lake, and learn how to design and implement efficient data engineering workflows using real-world examples.

What this book will help me do

  • Master the core concepts and components of Apache Spark and Delta Lake.
  • Create scalable and secure data pipelines for efficient data processing.
  • Learn best practices and patterns for building enterprise-grade data lakes.
  • Discover how to operationalize data models into production-ready pipelines.
  • Gain insights into deploying and monitoring data pipelines effectively.

Author(s)

Manoj Kukreja is a seasoned data engineer with over a decade of experience working with big data platforms. He specializes in implementing efficient and scalable data solutions to meet the demands of modern analytics and data science. Writing with clarity and a practical approach, he aims to provide actionable insights that professionals can apply to their projects.

Who is it for?

This book is tailored for aspiring data engineers and data analysts who wish to delve deeper into building scalable data platforms. It is suitable for readers who have basic knowledge of Python, Spark, and SQL and who want to learn Delta Lake and advanced data engineering concepts. Readers should be eager to develop practical skills for tackling real-world data engineering challenges.

Publisher Resources

ISBN: 9781801077743