The starting point of writing any Spark program is SparkContext (or JavaSparkContext in Java). SparkContext is initialized with an instance of a SparkConf object, which contains various Spark cluster-configuration settings (for example, the URL of the master node).
It is the main entry point for Spark functionality. A SparkContext represents a connection to a Spark cluster and can be used to create RDDs, accumulators, and broadcast variables on that cluster.
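As a minimal sketch of this initialization (assuming a local-mode master URL and an arbitrary application name, and a Spark dependency on the classpath):

```scala
import org.apache.spark.{SparkConf, SparkContext}

// SparkConf holds cluster-configuration settings such as the master URL.
// "local[*]" runs Spark locally on all available cores -- an assumption
// for this sketch; on a real cluster you would pass the master's URL instead.
val conf = new SparkConf()
  .setAppName("MyFirstSparkApp")
  .setMaster("local[*]")

// SparkContext is the connection to the cluster and the entry point
// for creating RDDs, accumulators, and broadcast variables.
val sc = new SparkContext(conf)
```

In Java, the equivalent entry point is `JavaSparkContext`, constructed from the same kind of `SparkConf` object.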
Only one SparkContext may be active per JVM. You must call stop() on the active SparkContext before creating a new one.
Once initialized, we use the various methods of the SparkContext object to create and manipulate distributed datasets and shared variables.
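A short sketch of those methods in use, assuming `sc` is an already-initialized SparkContext as above (the accumulator name and the data are illustrative):

```scala
// Create a distributed dataset (RDD) from a local collection.
val rdd = sc.parallelize(1 to 5)

// A broadcast variable: a read-only value shipped once to each executor.
val factor = sc.broadcast(10)

// An accumulator: a write-only shared variable for aggregating across tasks.
val count = sc.longAccumulator("count")

// Use all three in a distributed computation.
val scaled = rdd.map { x =>
  count.add(1)
  x * factor.value
}.collect()
// scaled is Array(10, 20, 30, 40, 50)

// Release cluster resources; required before creating
// another SparkContext in this JVM.
sc.stop()
```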