StandardScaler standardizes the features of a data set by scaling them to unit variance and, optionally, removing the mean, using column summary statistics computed on the samples in the training set.
This process is a very common pre-processing step.
Standardization can improve the convergence rate of the optimization process. It also prevents features with large variances from exerting an overly large influence during model training.
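To make the arithmetic concrete, here is a minimal sketch in plain Scala (no Spark dependency) of standardizing a single feature column. The object and function names are hypothetical, and two details are assumptions of this sketch rather than statements from the text above: the use of the unbiased (n - 1) sample standard deviation, and mapping a zero-variance column to 0.0.

```scala
object StandardizeSketch {
  // Standardizes one feature column: optionally subtracts the column mean,
  // then optionally divides by the column's sample standard deviation.
  def standardize(column: Seq[Double], withMean: Boolean, withStd: Boolean): Seq[Double] = {
    val n = column.size
    val mean = column.sum / n
    // Assumption: unbiased sample variance with an (n - 1) denominator.
    val variance =
      if (n > 1) column.map(x => (x - mean) * (x - mean)).sum / (n - 1) else 0.0
    val std = math.sqrt(variance)
    column.map { x =>
      val centered = if (withMean) x - mean else x
      if (!withStd) centered
      else if (std != 0.0) centered / std
      else 0.0 // Assumption: a constant column is mapped to 0.0.
    }
  }

  def main(args: Array[String]): Unit = {
    // Center and scale a small column of three samples.
    println(standardize(Seq(1.0, 2.0, 3.0), withMean = true, withStd = true))
  }
}
```

With withMean set to false the values are only divided by the standard deviation, so zeros stay zero and sparse data stays sparse.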
The StandardScaler class has the following constructor parameters:
new StandardScaler(withMean: Boolean, withStd: Boolean)
- withMean: False by default. Centers the data with the mean before scaling. Centering builds a dense output, so it does not work on sparse input and will raise an exception.
- withStd: True by default. ...
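The constructor above might be used as in the following sketch. It assumes the class is the RDD-based `org.apache.spark.mllib.feature.StandardScaler` (inferred from the constructor signature shown), a local Spark installation, and dense input so that withMean can be enabled. The object name, application name, and sample values are hypothetical.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.mllib.feature.StandardScaler
import org.apache.spark.mllib.linalg.{Vector, Vectors}

object StandardScalerUsage {
  // Fits a scaler on a tiny dense data set and returns the scaled vectors.
  def scaleSample(sc: SparkContext): Array[Vector] = {
    // Hypothetical training data: three samples, two features.
    val data = sc.parallelize(Seq(
      Vectors.dense(1.0, 10.0),
      Vectors.dense(2.0, 20.0),
      Vectors.dense(3.0, 30.0)))

    // fit() computes the column summary statistics on the training set.
    val model = new StandardScaler(withMean = true, withStd = true).fit(data)

    // transform() applies the learned mean and standard deviation per column.
    model.transform(data).collect()
  }

  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setAppName("StandardScalerUsage").setMaster("local[*]")
    val sc = new SparkContext(conf)
    try scaleSample(sc).foreach(println)
    finally sc.stop()
  }
}
```

The fitted model is kept separate from the data so that the same training-set statistics can later be applied to unseen samples.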