The entry point to SparkR is the SparkSession object, which represents the connection to the Spark cluster. The node on which R is running becomes the driver, and any local objects created by the R program reside on this node. At the moment, R cannot be used to manipulate Spark's RDDs directly, so for all practical purposes the R API for Spark has access only to the Spark SQL abstractions.
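
As a minimal sketch of what this looks like in practice, the following R snippet starts a session, moves a small local data frame into Spark, and queries it through Spark SQL. The `local[*]` master, the application name, the sample data, and the view name `items` are all illustrative assumptions, and the snippet presumes a Spark 2.x or later installation with the SparkR package available.

```r
library(SparkR)

# Start (or reuse) the SparkSession; the R process running this becomes the driver.
# master and appName are assumptions chosen for a local test run.
sparkR.session(master = "local[*]", appName = "SparkRExample")

# An ordinary R data.frame: it lives only in the driver's R process.
local_df <- data.frame(name = c("a", "b", "c"), value = c(1, 2, 3))

# Convert it to a SparkDataFrame, a Spark SQL abstraction managed by Spark.
spark_df <- createDataFrame(local_df)

# Register a temporary view (hypothetical name) and query it with Spark SQL.
createOrReplaceTempView(spark_df, "items")
result <- sql("SELECT name, value FROM items WHERE value > 1")

# collect() brings the result back to the driver as a local R data.frame.
collected <- collect(result)
print(collected)

sparkR.session.stop()
```

Everything here goes through SparkDataFrames and SQL; no RDD is touched at any point, which is consistent with the restriction described above.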