January 2018
Intermediate to advanced
470 pages
11h 9m
English
We can also ask which instances in our test data were considered outliers or anomalies. Using the autoencoder model trained earlier, the input data is reconstructed, and for each instance the mean squared error (MSE) between the actual values and their reconstruction is calculated. I also calculate the mean MSE for each of the two class labels:
test_dim_score.add("Class", test.vec("Class"))
val testDF = asDataFrame(test_dim_score).rdd.zipWithIndex.map(r => Row.fromSeq(r._1.toSeq :+ r._2))
val schema = StructType(Array(
  StructField("Reconstruction-MSE", DoubleType, nullable = false),
  StructField("Class", ByteType, nullable = false),
  StructField("idRow", LongType, nullable = false)))
val dffd = spark.createDataFrame(testDF, schema)
dffd.show()
...
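To make the two quantities described above concrete, here is a minimal plain-Scala sketch that does not depend on Spark or H2O. It assumes the per-instance error is the squared difference averaged over the features, and that each instance carries its class label. The names Scored, ReconstructionError, mse, and meanMseByClass are hypothetical and do not come from the book's code.

```scala
// Hypothetical record: one test instance with its class label, its
// original feature vector, and the autoencoder's reconstruction of it.
final case class Scored(label: Int, actual: Vector[Double], reconstructed: Vector[Double])

object ReconstructionError {
  // Per-instance MSE: squared differences averaged over the features
  // (an assumption about how the reconstruction error is defined).
  def mse(actual: Vector[Double], reconstructed: Vector[Double]): Double = {
    require(actual.nonEmpty && actual.length == reconstructed.length,
      "vectors must be non-empty and of equal length")
    actual.zip(reconstructed).map { case (a, r) => (a - r) * (a - r) }.sum / actual.length
  }

  // Mean of the per-instance MSE values, grouped by class label.
  def meanMseByClass(rows: Seq[Scored]): Map[Int, Double] =
    rows.groupBy(_.label).map { case (label, group) =>
      label -> group.map(s => mse(s.actual, s.reconstructed)).sum / group.size
    }
}
```

If the autoencoder has learned the structure of the normal class, the mean MSE for that class would be expected to be lower than for the anomalous class, which is why comparing the two per-class means is a useful sanity check.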