It is interesting to trace the evolution of Apache Spark from an abstract perspective. Spark started out as a fast engine for big data processing: fast both to run and to write code for. The original value proposition of Spark was fast in-memory computation graphs, compatibility with the Hadoop ecosystem, and clean, very usable APIs in Scala, Java, and Python. RDDs ruled the world. The focus was on iterative and interactive applications that operated on the same data multiple times, a workload Hadoop handled poorly.
The evolution didn't stop there. As Matei pointed out in his talk at MIT, users wanted more, and the Spark programming model evolved to include the following capabilities: