Machine Learning with Spark - Second Edition by Nick Pentreath, Manpreet Singh Ghotra, Rajdeep Dua

Using ML for feature normalization

Spark provides some built-in functions for feature scaling and standardization in its machine learning library. These include StandardScaler, which applies the standard normal transformation, and Normalizer, which applies the same feature vector normalization we showed you in our preceding example code.
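To make the difference between the two transformations concrete, here is a minimal plain-Python sketch of what each one computes. It does not use Spark, and the function names l2_normalize and standardize are illustrative, not part of any library. The sketch assumes the L2 norm for normalization, and for standardization it assumes that we both subtract the mean and divide by the sample standard deviation (Spark's StandardScaler exposes these two steps as the withMean and withStd options).

```python
import math

def l2_normalize(vector):
    """Scale one feature vector (a row) so that its L2 norm is 1."""
    norm = math.sqrt(sum(v * v for v in vector))
    if norm == 0.0:
        return list(vector)  # an all-zero vector is returned unchanged
    return [v / norm for v in vector]

def standardize(column):
    """Center one feature (a column) to zero mean and unit sample std."""
    n = len(column)
    mean = sum(column) / n
    variance = sum((v - mean) ** 2 for v in column) / (n - 1)
    std = math.sqrt(variance)
    if std == 0.0:
        return [0.0 for _ in column]  # a constant feature carries no spread
    return [(v - mean) / std for v in column]

row = [3.0, 4.0]
print(l2_normalize(row))     # [0.6, 0.8]

column = [1.0, 2.0, 3.0]
print(standardize(column))   # [-1.0, 0.0, 1.0]
```

Note that the two operate along different axes: normalization rescales each individual feature vector, while standardization rescales each feature across the whole dataset.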

We will explore the use of these methods in the upcoming chapters, but for now, let's simply compare the output of MLlib's Normalizer with our own results:

from pyspark.mllib.feature import Normalizer
normalizer = Normalizer()
vector = sc.parallelize([x])

After importing the required class, we ...
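One way such a comparison could be carried out is sketched below. This is an assumption about how the steps might look, not the book's own code: the helper names manual_l2_normalize, mllib_l2_normalize, and max_abs_difference are hypothetical, sc is taken to be an existing SparkContext, and the example vector x is made up for illustration. The sketch relies on Normalizer using the L2 norm by default and on its transform method accepting an RDD of vectors.

```python
import math

def manual_l2_normalize(x):
    """Our own L2 normalization of a single feature vector."""
    norm = math.sqrt(sum(v * v for v in x))
    return [v / norm for v in x]

def mllib_l2_normalize(sc, x):
    """Normalize x with MLlib's Normalizer; `sc` is an existing SparkContext."""
    # Imported lazily so the manual part runs without Spark installed.
    from pyspark.mllib.feature import Normalizer
    normalizer = Normalizer()            # default p is 2.0, the L2 norm
    vector = sc.parallelize([x])         # a single-element RDD holding x
    normalized = normalizer.transform(vector).first()
    return normalized.toArray().tolist()

def max_abs_difference(a, b):
    """Largest element-wise gap between two equal-length vectors."""
    return max(abs(p - q) for p, q in zip(a, b))

x = [1.0, 2.0, 2.0]                      # illustrative vector, not from the book
ours = manual_l2_normalize(x)
print(ours)

# With a live SparkContext available as `sc`, the comparison would be:
# theirs = mllib_l2_normalize(sc, x)
# print(max_abs_difference(ours, theirs))
```

If the two implementations agree, the printed difference should be zero or within floating-point rounding error.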
