August 2019
Intermediate to advanced
560 pages
13h 41m
English
According to the Spark MLlib guide (see https://spark.apache.org/docs/latest/ml-guide.html), as of Spark 2.0 the RDD-based APIs in the spark.mllib package have entered maintenance mode, and the DataFrame-based API in the spark.ml package is now the primary machine learning API for Spark. In this project, we import several classes from the spark.ml package to build the anomaly detection model. The following code block shows a few lines from the AnomalyDetection class in the com.example.esanalytics.spark.mllib package:
import org.apache.spark.ml.clustering.KMeansModel;
import org.apache.spark.ml.feature.VectorAssembler;
import org.apache.spark.ml.clustering.KMeans;
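To make the role of these classes more concrete, the following is a minimal sketch in plain Java, with no Spark dependency, of the general idea they support: numeric columns are assembled into one feature vector (the job VectorAssembler does), k-means finds cluster centers (the job KMeans does), and a point's distance to its nearest center serves as an anomaly score. Everything in it is an assumption made for illustration: the class name KMeansAnomalySketch, its method names, the choice of the first k points as initial centers, and distance to the nearest center as the scoring rule. It is not the book's AnomalyDetection class and does not call the Spark API.

```java
import java.util.Arrays;

// Illustrative only: plain Java with hypothetical names, not Spark code and
// not the AnomalyDetection class from the book.
final class KMeansAnomalySketch {

    // Conceptually what VectorAssembler does: merge separate numeric
    // columns into a single feature vector.
    static double[] assemble(double... columns) {
        return Arrays.copyOf(columns, columns.length);
    }

    static double squaredDistance(double[] a, double[] b) {
        double sum = 0.0;
        for (int i = 0; i < a.length; i++) {
            double d = a[i] - b[i];
            sum += d * d;
        }
        return sum;
    }

    // Index of the closest center, analogous to the cluster prediction
    // that a fitted k-means model assigns to a row.
    static int nearestCentroid(double[] point, double[][] centroids) {
        int best = 0;
        for (int c = 1; c < centroids.length; c++) {
            if (squaredDistance(point, centroids[c])
                    < squaredDistance(point, centroids[best])) {
                best = c;
            }
        }
        return best;
    }

    // Lloyd's algorithm with a fixed number of iterations. Assumption: the
    // first k points seed the centers so that the result is deterministic.
    static double[][] fit(double[][] points, int k, int iterations) {
        int dim = points[0].length;
        double[][] centroids = new double[k][];
        for (int c = 0; c < k; c++) {
            centroids[c] = points[c].clone();
        }
        for (int it = 0; it < iterations; it++) {
            double[][] sums = new double[k][dim];
            int[] counts = new int[k];
            for (double[] p : points) {
                int c = nearestCentroid(p, centroids);
                counts[c]++;
                for (int j = 0; j < dim; j++) {
                    sums[c][j] += p[j];
                }
            }
            for (int c = 0; c < k; c++) {
                if (counts[c] == 0) {
                    continue; // an empty cluster keeps its previous center
                }
                for (int j = 0; j < dim; j++) {
                    centroids[c][j] = sums[c][j] / counts[c];
                }
            }
        }
        return centroids;
    }

    // Assumed scoring rule: Euclidean distance to the nearest center.
    // Larger values mean the point sits farther from every cluster.
    static double anomalyScore(double[] point, double[][] centroids) {
        int nearest = nearestCentroid(point, centroids);
        return Math.sqrt(squaredDistance(point, centroids[nearest]));
    }
}
```

A caller would build vectors with assemble, pass them to fit, and then compare anomalyScore values against a threshold of its own choosing. In a Spark job, the assembling and fitting would be done by VectorAssembler and KMeans on a DataFrame rather than on arrays, and the resulting KMeansModel exposes the cluster centers.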
The following diagram helps us to learn about the steps to ...