In this sub-section, we will perform some basic feature engineering and dataset preparation so that the resulting data can be fed into the MLP classifier. Let's start by creating a SparkSession, which is the gateway for accessing Spark:
SparkSession spark = SparkSession
        .builder()
        .master("local[*]")
        .config("spark.sql.warehouse.dir", "/tmp/spark")
        .appName("SurvivalPredictionMLP")
        .getOrCreate();
Then let's read the training set and take a glimpse of it:
Dataset<Row> df = spark.sqlContext()
        .read()
        .format("com.databricks.spark.csv")
        .option("header", "true")
        .option("inferSchema", "true")
        .load("data/train.csv");
df.show();
A snapshot of the dataset can be seen as follows:
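Since df.show() only prints the first 20 rows, it can also be useful to inspect the inferred schema and a few summary statistics programmatically before engineering any features. The following is a minimal sketch that reuses the df created above; the column names (Survived, Pclass, Age, Fare) are assumed here to follow the standard Kaggle Titanic train.csv layout:

// Print the schema that Spark inferred from the CSV header and data.
df.printSchema();

// Summary statistics (count, mean, stddev, min, max) for a few columns.
// Column names assume the standard Kaggle Titanic train.csv layout.
df.describe("Survived", "Pclass", "Age", "Fare").show();

// Count rows with a missing Age value, a typical check before
// preparing the data for the MLP classifier.
long missingAge = df.filter(df.col("Age").isNull()).count();
System.out.println("Rows with missing Age: " + missingAge);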