Estimators

An estimator is an abstraction of a learning algorithm that fits a model on a dataset.

An estimator implements a fit() method that takes a DataFrame and produces a model. An example of a learning algorithm is LogisticRegression.

In a nutshell, the estimator is: DataFrame =[fit]=> Model.

In the following example, PipelineComponentExample introduces the concepts of transformers and estimators:

import org.apache.spark.ml.classification.LogisticRegression import org.apache.spark.ml.linalg.{Vector, Vectors} import org.apache.spark.ml.param.ParamMap import org.apache.spark.sql.Row import org.utils.StandaloneSpark object PipelineComponentExample {   def main(args: Array[String]): Unit = {  val spark = StandaloneSpark.getSparkInstance() ...

Get Machine Learning with Spark - Second Edition now with O’Reilly online learning.

O’Reilly members experience live online training, plus books, videos, and digital content from 200+ publishers.