January 2019
Beginner to intermediate
154 pages
4h 31m
English
Narrow transformations process data without any shuffle. They work on a per-partition basis; that is to say, each element of the output RDD can be computed without involving any elements from different partitions. This leads to an important point: for element-wise transformations such as map() and filter(), the new RDD has the same number of partitions as its parent RDD, and a lost partition can be rebuilt from the corresponding parent partition alone. That is why narrow transformations are easy to recompute in the case of failure. Let's understand this with the following example:

So, we have an RDD-A and we perform a narrow transformation, such as map() or filter(), and we get a new RDD-B with ...
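The per-partition behavior described above can be sketched in plain Python. This is only a toy model, not the Spark API: an "RDD" is assumed to be a list of partitions, and the names `narrow_map`, `narrow_filter`, and `recompute_partition` are hypothetical helpers introduced for illustration. In PySpark, the corresponding calls would be `rdd.map()`, `rdd.filter()`, and `rdd.getNumPartitions()`.

```python
from typing import Callable, List

# Toy stand-in for an RDD (an assumption for illustration, not Spark itself):
# a list of partitions, where each partition is a list of elements.
Partitions = List[List[int]]


def narrow_map(parts: Partitions, f: Callable[[int], int]) -> Partitions:
    # Each output partition is built only from the matching input partition.
    return [[f(x) for x in part] for part in parts]


def narrow_filter(parts: Partitions, pred: Callable[[int], bool]) -> Partitions:
    # Elements may be dropped, but no element moves to another partition.
    return [[x for x in part if pred(x)] for part in parts]


def times_ten(x: int) -> int:
    return x * 10


def divisible_by_20(x: int) -> bool:
    return x % 20 == 0


# RDD-A with three partitions.
rdd_a: Partitions = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]

# RDD-B and RDD-C are derived with narrow transformations only.
rdd_b = narrow_map(rdd_a, times_ten)
rdd_c = narrow_filter(rdd_b, divisible_by_20)


def recompute_partition(index: int) -> List[int]:
    # Rebuild one lost partition of rdd_c from the single parent partition
    # of rdd_a at the same index; no other partition is read.
    parent = [rdd_a[index]]
    return narrow_filter(narrow_map(parent, times_ten), divisible_by_20)[0]


print(len(rdd_a), len(rdd_b), len(rdd_c))  # partition counts stay equal
print(recompute_partition(1) == rdd_c[1])
```

In this model the partition count never changes, and rebuilding partition 1 of the final result touches only partition 1 of RDD-A, which mirrors why narrow transformations are cheap to recover after a failure.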