What is a resilient distributed dataset?

Alex Robbins guides you through an in-depth look at the Python API for Apache Spark. In this segment, he explores resilient distributed datasets (RDDs), the central abstraction in Spark and essential knowledge for anyone working with the system.

By Alex Robbins
May 13, 2016
Stonehenge (source: garethwiscombe)

In “Introduction to PySpark,” Alex Robbins guides you through an in-depth look at the Python API for Apache Spark. Check out the full training video here.

Post topics: Big Data Tools and Pipelines