Basics of RDD operation
Let's now go through some basics of RDD operations. The best way to understand what a function does is to read its documentation, which gives us a rigorous description of its behavior.
This matters because the documentation is the authoritative source for how a function is defined and how it is intended to be used. Reading it keeps our understanding as close to the source as possible. The relevant documentation is at https://spark.apache.org/docs/latest/rdd-programming-guide.html.
Let's start with the map function. The map function returns a new RDD by applying the function f to each element of this RDD. In other words, it works ...
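To make the element-wise behavior of map concrete, here is a minimal sketch in plain Python that mimics it on an ordinary list rather than on a real RDD. The helper name map_elements and the sample data are hypothetical and chosen only for illustration. The PySpark call shown in the closing comment assumes an existing SparkContext named sc.

```python
from typing import Callable, Iterable, List, TypeVar

T = TypeVar("T")
U = TypeVar("U")


def map_elements(f: Callable[[T], U], elements: Iterable[T]) -> List[U]:
    """Hypothetical helper: apply f to every element and return the results in order."""
    return [f(x) for x in elements]


# Sample data, invented for this illustration.
numbers = [1, 2, 3, 4]

# Each input element produces exactly one output element.
squared = map_elements(lambda x: x * x, numbers)
print(squared)  # [1, 4, 9, 16]

# With PySpark and an existing SparkContext named `sc` (an assumption here),
# the corresponding call would be:
#   sc.parallelize(numbers).map(lambda x: x * x).collect()
```

The input list is left unchanged and a new list is returned, which mirrors how map on an RDD returns a new RDD rather than modifying the original.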