In this Introduction to PySpark training course, expert author Alex Robbins will teach you everything you need to know about the Spark Python API. This course is designed for users who already have a basic working knowledge of Python.
You will start by learning how to install Spark, then move on to the Spark fundamentals. From there, Alex will teach you about transformations, including filter, pipe, repartition, and distinct. This video tutorial also covers actions, input and output, performance, and running on a cluster. Finally, you will learn advanced topics, including Spark Streaming, DataFrames and SQL, and MLlib.
Once you have completed this computer-based training course, you will have learned everything you need to know about PySpark. Working files are included, allowing you to follow along with the author throughout the lessons.
Table of contents
- Installing Spark
- Spark Fundamentals
- Key-Value Pair RDDs
- Input And Output
- Running On A Cluster
- Advanced Spark
Product information
- Title: Introduction to PySpark
- Release date: December 2015
- Publisher(s): Infinite Skills
- ISBN: 9781771375535