Estimators

An estimator is an abstraction of a learning algorithm that fits a model on a dataset.

An estimator implements a fit() method that takes a DataFrame and produces a model. An example of a learning algorithm is LogisticRegression.

In a nutshell, the estimator is: DataFrame =[fit]=> Model.

In the following example, PipelineComponentExample introduces the concepts of transformers and estimators:

import org.apache.spark.ml.classification.LogisticRegression import org.apache.spark.ml.linalg.{Vector, Vectors} import org.apache.spark.ml.param.ParamMap import org.apache.spark.sql.Row import org.utils.StandaloneSpark object PipelineComponentExample {   def main(args: Array[String]): Unit = {  val spark = StandaloneSpark.getSparkInstance() ...

Get Machine Learning with Spark - Second Edition now with the O’Reilly learning platform.

O’Reilly members experience books, live events, courses curated by job role, and more from O’Reilly and nearly 200 top publishers.