Apache Spark 3 for Data Engineering and Analytics with Python
on-demand course

with David Mngadi
August 2021
Beginner to intermediate
8h 30m
English
Packt Publishing
Closed Captioning available in English

Overview

In this 8.5-hour course, you will explore the fundamentals of Apache Spark 3 using Python for data engineering and analytics. It moves from the essentials of PySpark to building data solutions in Databricks, and teaches you to process and analyze data at scale.

What I will be able to do after this course

  • Understand and utilize Spark's structured and RDD APIs for data transformations and actions.
  • Set up and configure your own local PySpark environment for effective Spark development.
  • Understand how Spark executes jobs and how it plans them as a Directed Acyclic Graph (DAG).
  • Use Spark SQL and DataFrames for data manipulation and queries.
  • Develop dashboards and visualizations on Databricks for insightful analytics.

Course Instructor(s)

David Mngadi is an experienced data engineer and instructor, proficient in working with technologies like Python and Apache Spark. With a passion for teaching, he constructs engaging, hands-on courses that empower learners to achieve confidence in data analysis. David's approachable style ensures principles are understood clearly and practically.

Who is it for?

This course is ideal for Python developers seeking to branch into data engineering and analytics with PySpark. Aspiring data professionals and analysts with foundational programming knowledge will also benefit, as will data scientists who want to scale their analysis to big data workloads. Anyone interested in engineering work on distributed systems should find it engaging.


Publisher Resources

ISBN: 9781803244303