Understanding the Spark Programming Model

This chapter begins by working through the word count problem in Apache Spark using Java, setting up an IDE along the way. We then explain common RDD transformations and actions. Finally, we touch on Spark's built-in support for caching and persisting RDDs, which can improve the performance of Spark jobs.
