StopWordsRemover

StopWordsRemover is a Transformer that takes a String array of words and returns a String array with all the defined stop words removed. Stop words are words such as I, you, my, and, and or, which occur very frequently in the English language. You can override or extend the set of stop words to suit your use case. Without this cleansing step, subsequent algorithms might be biased by these common words.
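
As a minimal sketch of overriding or extending the stop word set, the snippet below loads the default English list that ships with Spark ML and then builds two removers: one with the defaults plus extra terms, and one with a fully custom list. The extra terms and the custom list are hypothetical placeholders chosen only for illustration.

```scala
import org.apache.spark.ml.feature.StopWordsRemover

// Load the default English stop word list bundled with Spark ML.
val defaultStopWords: Array[String] = StopWordsRemover.loadDefaultStopWords("english")

// Extend the defaults with domain-specific terms (hypothetical examples).
val extendedStopWords: Array[String] = defaultStopWords ++ Array("spark", "scala")
val extendedRemover = new StopWordsRemover().setStopWords(extendedStopWords)

// Override the defaults entirely with a custom list (hypothetical examples).
val customRemover = new StopWordsRemover().setStopWords(Array("foo", "bar"))

// Inspect how many stop words each remover will filter out.
println(s"default: ${defaultStopWords.length}, extended: ${extendedRemover.getStopWords.length}")
```

Extending the default list is usually the safer choice, because a full override also drops the common English words the defaults already cover.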

To use StopWordsRemover, you need to import the following class:

import org.apache.spark.ml.feature.StopWordsRemover

First, you need to initialize a StopWordsRemover, specifying the input column and the output column. Here, we are choosing the words column created by the ...
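
The initialization step can be sketched as follows. This is an illustration under stated assumptions, not the book's own listing: the small hand-built DataFrame stands in for whatever upstream step produces the words column, and the output column name filteredWords, the local SparkSession settings, and the sample rows are all made up for the example.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.ml.feature.StopWordsRemover

// Reuse the active session (for example in spark-shell) or start a local one.
val spark = SparkSession.builder()
  .master("local[*]")
  .appName("StopWordsRemoverSketch") // hypothetical application name
  .getOrCreate()

// Hypothetical input: each row already holds an array of words.
val wordsDF = spark.createDataFrame(Seq(
  (1, Seq("i", "use", "spark", "and", "scala")),
  (2, Seq("you", "or", "my", "team"))
)).toDF("id", "words")

// Initialize the remover with its input and output columns.
val remover = new StopWordsRemover()
  .setInputCol("words")
  .setOutputCol("filteredWords") // assumed output column name

// transform appends the output column and leaves the input column in place.
val filteredDF = remover.transform(wordsDF)
filteredDF.show(false)
```

Because StopWordsRemover is a Transformer, no fitting step is required; calling transform on the DataFrame is enough to produce the filtered column.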
