Spark transformations are operations that take an RDD as input and produce one or more RDDs as output. All transformations are lazy: calling one only extends the logical execution plan, which Spark builds as a directed acyclic graph (DAG). Actual execution happens only when an action is called.
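The lazy behaviour described above can be illustrated with a small pure-Python sketch. This is a toy model, not Spark's API: the `LazyDataset` class and the `double` helper are hypothetical names invented for the illustration, and the "plan" here is a simple linear list of steps rather than a real DAG.

```python
class LazyDataset:
    """Toy stand-in for an RDD (hypothetical, not part of Spark)."""

    def __init__(self, data, plan=()):
        self._data = list(data)
        self._plan = tuple(plan)  # recorded lineage of (kind, function) steps

    def map(self, fn):
        # Transformation: returns a new dataset with a longer plan; nothing runs yet.
        return LazyDataset(self._data, self._plan + (("map", fn),))

    def filter(self, pred):
        # Transformation: also only recorded, not executed.
        return LazyDataset(self._data, self._plan + (("filter", pred),))

    def collect(self):
        # Action: only now is the recorded plan executed, step by step.
        rows = self._data
        for kind, fn in self._plan:
            if kind == "map":
                rows = [fn(x) for x in rows]
            else:
                rows = [x for x in rows if fn(x)]
        return rows


calls = []

def double(x):
    calls.append(x)  # side effect used to observe when the work actually happens
    return x * 2

ds = LazyDataset([1, 2, 3, 4]).map(double).filter(lambda x: x > 4)
calls_before_action = len(calls)  # still 0: no element has been processed
result = ds.collect()             # the action triggers execution
```

Chaining `map` and `filter` only grows the recorded plan; `double` is not invoked until `collect()` runs, which mirrors how Spark defers work until an action is called.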
Transformations can be classified as narrow transformations and wide transformations.
| Narrow transformations | Wide transformations |
| --- | --- |
| Each partition of the child RDD is computed from a single partition of the parent RDD. Examples are map() and filter(). | Records in a single partition of the child RDD can be computed using data across parent ... |
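The difference in partition dependencies can be sketched in plain Python. This is a toy model under stated assumptions, not Spark code: an "RDD" is modelled as a list of partitions, `narrow_map` and `wide_group_by_key` are hypothetical helpers, and the partitioner is a deliberately simple deterministic function. In Spark itself, key-grouping operations such as `groupByKey()` are the usual examples of wide transformations.

```python
from collections import defaultdict

# Parent "RDD" modelled as a list of partitions holding (key, value) records.
parent = [[("a", 1), ("b", 2)], [("a", 3), ("c", 4)]]

def narrow_map(partitions, fn):
    # Narrow: each child partition is built from exactly one parent partition.
    return [[fn(rec) for rec in part] for part in partitions]

def wide_group_by_key(partitions, num_partitions):
    # Wide: records with the same key may sit in different parent partitions,
    # so a child partition may need data from all of them (a shuffle).
    children = [defaultdict(list) for _ in range(num_partitions)]
    for part in partitions:
        for key, value in part:
            # Toy deterministic partitioner (assumption, not Spark's hash partitioner).
            target = sum(key.encode()) % num_partitions
            children[target][key].append(value)
    return [sorted(child.items()) for child in children]

mapped = narrow_map(parent, lambda kv: (kv[0], kv[1] * 10))
grouped = wide_group_by_key(parent, 2)
```

In `mapped`, the partition boundaries of `parent` are preserved, so each output partition could be computed independently. In `grouped`, the values for key `"a"` come from both parent partitions, which is why a wide transformation requires moving data between partitions.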