This section explains the reasoning behind the installation process for Python, Anaconda, and Spark.
- Because Spark runs on the Java virtual machine (JVM), the Java Software Development Kit (SDK) must be installed before Spark can run on an Ubuntu virtual machine.
Whether Spark runs on a local machine or in a cluster, it requires Java 6 or later.
- Ubuntu recommends installing Java with sudo apt install, as this ensures that the downloaded packages are up to date.
- Please note that if Java is not currently installed, the output in the terminal will show the following message:
The program 'java' can be found in the following packages:
 * default-jre
 * gcj-5-jre-headless
 * openjdk-8-jre-headless
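The check described above can also be scripted. The following is a minimal Python sketch, not part of the original recipe: the function names parse_java_major and installed_java_major are hypothetical, and it assumes that java -version prints a banner containing version "..." (releases before Java 9 report themselves as 1.x, for example 1.8.0_151).

```python
import re
import shutil
import subprocess


def parse_java_major(version_output):
    """Return the major Java version found in `java -version` output, or None.

    Hypothetical helper for illustration; assumes the banner contains
    a fragment such as: version "1.8.0_151" or version "11.0.2".
    """
    match = re.search(r'version "(\d+)(?:\.(\d+))?', version_output)
    if match is None:
        return None
    first = int(match.group(1))
    second = match.group(2)
    # Releases before Java 9 use the 1.x scheme, so 1.8 means Java 8.
    if first == 1 and second is not None:
        return int(second)
    return first


def installed_java_major():
    """Return the major version of the `java` on PATH, or None if it is absent."""
    if shutil.which("java") is None:
        return None
    # `java -version` writes its banner to stderr, so read both streams.
    result = subprocess.run(["java", "-version"], capture_output=True, text=True)
    return parse_java_major(result.stderr + result.stdout)
```

Calling installed_java_major() returns None when Java is missing, which corresponds to the terminal message shown above. Otherwise the returned number can be compared against the minimum version that Spark requires.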