In this Introduction to PySpark training course, expert author Alex Robbins will teach you everything you need to know about the Spark Python API. This course is designed for users who already have a basic working knowledge of Python.
You will start by learning how to install Spark, then move on to the Spark fundamentals. From there, Alex will teach you about transformations, including filter, pipe, repartition, and distinct. This video tutorial also covers actions, input and output, performance, and running on a cluster. Finally, you will learn advanced topics, including Spark Streaming, DataFrames and SQL, and MLlib.
Once you have completed this computer-based training course, you will have learned everything you need to know about PySpark. Working files are included, allowing you to follow along with the author throughout the lessons.
Table of Contents
- Installing Spark
- Spark Fundamentals
- Key-Value Pair RDDs
- Input And Output
- Running On A Cluster
- Advanced Spark
Product Information
- Title: Introduction to PySpark
- Release date: December 2015
- Publisher(s): Infinite Skills
- ISBN: 9781771375535