July 2017
Intermediate to advanced
796 pages
18h 55m
English
When an RDD can be derived from another RDD using a simple one-to-one transformation such as filter(), map(), or flatMap(), the child RDD is said to depend on the parent RDD on a one-to-one basis. This is known as a narrow dependency: each child partition can be computed on the same node that holds the corresponding parent partition, so no data has to be transferred over the wire between executors.
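As a minimal sketch of how this can be observed in practice, the following Scala program builds a child RDD with filter() and map() and inspects the dependencies Spark records for it. The object name, the helper `dependsNarrowly`, the local master setting, and the sample data are assumptions made for illustration; only `dependencies`, `NarrowDependency`, `getNumPartitions`, and `toDebugString` come from the Spark API.

```scala
import org.apache.spark.{NarrowDependency, SparkConf, SparkContext}
import org.apache.spark.rdd.RDD

object NarrowDependencySketch {

  // Hypothetical helper: true when every direct dependency of the RDD is narrow.
  def dependsNarrowly(rdd: RDD[_]): Boolean =
    rdd.dependencies.forall(_.isInstanceOf[NarrowDependency[_]])

  def main(args: Array[String]): Unit = {
    // Local master and application name are illustrative assumptions.
    val conf = new SparkConf()
      .setMaster("local[2]")
      .setAppName("narrow-dependency-sketch")
    val sc = new SparkContext(conf)

    try {
      // Parent RDD spread over four partitions.
      val parent: RDD[Int] = sc.parallelize(1 to 10, numSlices = 4)

      // filter() and map() work partition by partition, so no shuffle is needed.
      val child: RDD[Int] = parent.filter(_ % 2 == 0).map(_ * 10)

      println(s"child depends narrowly on its parent: ${dependsNarrowly(child)}")

      // The partition count is carried over unchanged from parent to child.
      println(s"parent partitions: ${parent.getNumPartitions}")
      println(s"child partitions:  ${child.getNumPartitions}")

      // The lineage shows the chain of RDDs without a stage boundary.
      println(child.toDebugString)
    } finally {
      sc.stop()
    }
  }
}
```

Running this with a Spark dependency on the classpath should report that the child RDD has only narrow dependencies and the same number of partitions as its parent, which is what allows Spark to pipeline these transformations within a single stage.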
The following diagram illustrates how a narrow dependency transforms one RDD into another by applying a one-to-one transformation to the RDD's elements: