O'Reilly logo

Machine Learning with Spark - Second Edition by Nick Pentreath, Manpreet Singh Ghotra, Rajdeep Dua

Stay ahead with the world's most comprehensive technology and business learning platform.

With Safari, you learn the way you learn best. Get unlimited access to videos, live online training, learning paths, books, tutorials, and more.

Start Free Trial

No credit card required

Vectors in Spark

Spark MLlib uses Breeze and JBlas for internal linear algebraic operations. It uses its own class to represent a vector defined using the org.apache.spark.mllib.linalg.Vector factory. A local vector has integer-typed and 0-based indices. Its values are stored as double-typed. A local vector is stored on a single machine, and cannot be distributed. Spark MLlib supports two types of local vectors, dense and sparse, created using factory methods.

The following code snippet shows how to create basic sparse and dense vectors in Spark:

val dVectorOne: Vector = Vectors.dense(1.0, 0.0, 2.0) println("dVectorOne:" + dVectorOne) //  Sparse vector (1.0, 0.0, 2.0, 3.0) // corresponding to nonzero entries. val sVectorOne: Vector = Vectors.sparse(4, ...

With Safari, you learn the way you learn best. Get unlimited access to videos, live online training, learning paths, books, interactive tutorials, and more.

Start Free Trial

No credit card required