Bivariate analysis examines the relationship between two variables. Here, we test whether the two variables are associated or not at a predefined significance level. The analysis can be performed for any combination of categorical and continuous variables: both variables categorical, one categorical and one continuous, or both continuous.
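As a minimal sketch of two of these combinations, the following uses Spark MLlib's `Statistics.corr` for a continuous-continuous pair and `Statistics.chiSqTest` for a categorical-categorical contingency table. The object name `BivariateSketch`, the helper names, the sample values, the `local[2]` master, and the 0.05 significance level are all assumptions made for illustration; they are not taken from the recipe itself.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.mllib.linalg.Matrices
import org.apache.spark.mllib.stat.Statistics

// Hypothetical helper object, for illustration only.
object BivariateSketch {

  // Continuous vs. continuous: Pearson correlation coefficient.
  // Both RDDs use the same number of partitions because Statistics.corr
  // pairs the two series element by element.
  def pearson(sc: SparkContext, xs: Seq[Double], ys: Seq[Double]): Double = {
    val x = sc.parallelize(xs, 2)
    val y = sc.parallelize(ys, 2)
    Statistics.corr(x, y, "pearson")
  }

  // Categorical vs. categorical: chi-square test of independence on a
  // contingency table whose counts are given in column-major order.
  def chiSquarePValue(rows: Int, cols: Int, countsColumnMajor: Array[Double]): Double =
    Statistics.chiSqTest(Matrices.dense(rows, cols, countsColumnMajor)).pValue

  def main(args: Array[String]): Unit = {
    // Local mode is assumed here; any of the supported cluster modes would do.
    val conf = new SparkConf().setAppName("BivariateSketch").setMaster("local[2]")
    val sc = new SparkContext(conf)
    try {
      // Made-up sample data.
      val r = pearson(sc, Seq(1.0, 2.0, 3.0, 4.0, 5.0), Seq(2.1, 3.9, 6.2, 8.1, 9.8))
      println(s"Pearson correlation: $r")

      // Made-up 2x2 table: columns are (30, 10) and (20, 40).
      val p = chiSquarePValue(2, 2, Array(30.0, 10.0, 20.0, 40.0))
      val alpha = 0.05 // assumed significance level
      println(s"Chi-square p-value: $p, associated at alpha=$alpha: ${p < alpha}")
    } finally {
      sc.stop()
    }
  }
}
```

A p-value below the chosen significance level suggests that the two categorical variables are associated; the categorical-continuous case (for example, a t-test or ANOVA) is not covered by this sketch.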
To step through this recipe, you will need a running Spark cluster in one of the supported modes: local, standalone, YARN, or Mesos. To install Spark on a standalone cluster, please refer to http://spark.apache.org/docs/latest/spark-standalone.html. Also, include the Spark MLlib package in the
build.sbt file so that ...