A transformer is an abstraction that covers both feature transformers and learned models. Every transformer implements a transform() method, which converts one DataFrame into another.
A feature transformer takes a DataFrame, reads a column (e.g., text), maps it to a new column (e.g., feature vectors), and outputs a new DataFrame with the mapped column appended.
A learned model takes a DataFrame, reads the column containing feature vectors, predicts a label for each feature vector, and outputs a new DataFrame with the predicted labels appended.
A custom transformer must:
- Implement the transform() method.
- Specify inputCol and outputCol.
- Accept a DataFrame as input and return a DataFrame as output.
In a nutshell, a transformer is: DataFrame =[transform]=> DataFrame.
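The pattern above can be sketched in plain Python. This is a toy stand-in, not Spark's actual API: the SquareTransformer class and the list-of-dicts "DataFrame" are hypothetical illustrations of the transform() contract, chosen so the example runs without a Spark cluster.

```python
class SquareTransformer:
    """A toy custom feature transformer: reads inputCol, appends outputCol.

    Hypothetical illustration of the transformer pattern; real Spark
    transformers extend pyspark.ml.Transformer instead.
    """

    def __init__(self, inputCol, outputCol):
        self.inputCol = inputCol    # name of the column to read
        self.outputCol = outputCol  # name of the column to append

    def transform(self, df):
        # DataFrame in, DataFrame out: build new rows with the extra
        # column rather than mutating the input.
        return [{**row, self.outputCol: row[self.inputCol] ** 2} for row in df]


# A list of dicts stands in for a DataFrame here.
df = [{"x": 1}, {"x": 2}, {"x": 3}]
squarer = SquareTransformer(inputCol="x", outputCol="x_squared")
out = squarer.transform(df)
# out now carries the original column plus the new "x_squared" column.
```

The design point the sketch mirrors is that transform() never modifies its input; it returns a new DataFrame with columns appended, which is what lets transformers be chained into a pipeline.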