In this Introduction to PySpark training course, expert author Alex Robbins will teach you everything you need to know about the Spark Python API. This course is designed for users who already have a basic working knowledge of Python.
You will start by learning how to install Spark, then jump into the Spark fundamentals. From there, Alex will teach you about transformations, including filter, pipe, repartition, and distinct. This video tutorial also covers actions, input and output, performance, and running on a cluster. Finally, you will learn advanced topics, including Spark Streaming, DataFrames and SQL, and MLlib.
Once you have completed this computer-based training course, you will have learned everything you need to know about PySpark. Working files are included, allowing you to follow along with the author throughout the lessons.
Table of contents
- Installing Spark
- Spark Fundamentals
- Key-Value Pair RDDs
- Input And Output
- Running On A Cluster
- Advanced Spark
Product information
- Title: Introduction to PySpark
- Release date: December 2015
- Publisher(s): Infinite Skills
- ISBN: 9781771375535