- Go to the UCLA website to download the file:
https://stats.idre.ucla.edu/stat/r/examples/asa/hmohiv.csv
The dataset is the actual data from the book Applied Survival Analysis: Regression Modeling of Time to Event Data by David W. Hosmer and Stanley Lemeshow (1999). The data came from an HMO-HIV+ study and contains the following fields:
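The field names can also be read straight from the downloaded file. The following is a minimal sketch rather than part of the recipe itself: it assumes the URL above is still reachable, that the first line of the CSV is a comma-separated header row, and that saving to `hmohiv.csv` in the working directory is acceptable. The object name `DownloadHmohiv` and the helpers `download` and `headerFields` are hypothetical names introduced only for illustration.

```scala
import java.net.URL
import java.nio.file.{Files, Path, Paths, StandardCopyOption}
import scala.io.Source

object DownloadHmohiv {
  // URL taken from the step above; it may have moved since publication.
  val dataUrl = "https://stats.idre.ucla.edu/stat/r/examples/asa/hmohiv.csv"

  // Copy the remote file to a local path, overwriting any existing copy.
  def download(url: String, target: Path): Path = {
    val in = new URL(url).openStream()
    try Files.copy(in, target, StandardCopyOption.REPLACE_EXISTING)
    finally in.close()
    target
  }

  // Split the first line into column names.
  // Assumption: a header row with plain comma separators; surrounding
  // double quotes and whitespace are stripped from each name.
  def headerFields(lines: Iterator[String]): Seq[String] =
    if (lines.hasNext)
      lines.next().split(",").map(_.trim.stripPrefix("\"").stripSuffix("\"")).toSeq
    else Seq.empty

  def main(args: Array[String]): Unit = {
    // Hypothetical local target: hmohiv.csv in the working directory.
    val local = download(dataUrl, Paths.get("hmohiv.csv"))
    val src = Source.fromFile(local.toFile)
    try println(headerFields(src.getLines()).mkString(", "))
    finally src.close()
  }
}
```

Running `main` prints the column names found in the file's first line, which can then be checked against the field list for the dataset.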
- Start a new project in IntelliJ or in an IDE of your choice. Make sure the necessary JAR files are included.
- Set up the package location where the program will reside:
package spark.ml.cookbook.chapter5
- Import the necessary packages for SparkSession to gain access to ...