Machine Learning with Spark - Second Edition by Nick Pentreath, Manpreet Singh Ghotra, Rajdeep Dua

Matrix in Spark

A local matrix in Spark has integer-typed row and column indices and double-typed values, all of which are stored on a single machine. MLlib supports the following local matrix types:

  • Dense matrices: Matrices whose entry values are stored in a single double array in column-major order.
  • Sparse matrices: Matrices whose non-zero entry values are stored in the Compressed Sparse Column (CSC) format, in column-major order.

For example, the following dense matrix of size (3, 2) is stored column by column in the one-dimensional array [2.0, 4.0, 4.0, 3.0, 1.0, 5.0]:
2.0 3.0
4.0 1.0
4.0 5.0
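
To make the column-major layout concrete, here is a minimal sketch in plain Scala that does not depend on Spark. The names `ColumnMajorSketch`, `toColumnMajor`, and `at` are hypothetical helpers introduced only for illustration; they are not part of MLlib. The sketch assumes that entry (i, j) of a matrix with numRows rows sits at position i + j * numRows of the flat array, which is what column-major order means.

```scala
object ColumnMajorSketch {
  // Flatten a matrix given row by row into one array in column-major order:
  // all entries of column 0 first, then all entries of column 1, and so on.
  def toColumnMajor(rows: Array[Array[Double]]): Array[Double] = {
    val numRows = rows.length
    val numCols = if (numRows == 0) 0 else rows(0).length
    val out = new Array[Double](numRows * numCols)
    for (j <- 0 until numCols; i <- 0 until numRows) {
      out(i + j * numRows) = rows(i)(j)
    }
    out
  }

  // Look up entry (i, j) in a column-major array with numRows rows.
  def at(values: Array[Double], numRows: Int, i: Int, j: Int): Double =
    values(i + j * numRows)

  def main(args: Array[String]): Unit = {
    // The (3, 2) matrix from the text, written row by row.
    val m = Array(Array(2.0, 3.0), Array(4.0, 1.0), Array(4.0, 5.0))
    val flat = toColumnMajor(m)
    println(flat.mkString("[", ", ", "]")) // [2.0, 4.0, 4.0, 3.0, 1.0, 5.0]
    println(at(flat, 3, 1, 1))             // 1.0 (row 1, column 1)
  }
}
```

Reading the flat array back in chunks of three recovers the two columns (2.0, 4.0, 4.0) and (3.0, 1.0, 5.0).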

The following example creates a dense matrix and a sparse matrix:

       val dMatrix: Matrix = Matrices.dense(2, 2, Array(1.0, 2.0, 3.0, 4.0))
       println("dMatrix: \n" + dMatrix)
       val sMatrixOne: Matrix ...
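
The CSC format used for sparse matrices stores three arrays: column pointers, row indices, and the non-zero values. Spark's `Matrices.sparse` factory takes its arguments in that order, after the row and column counts. The following plain-Scala sketch shows how those three arrays relate to a column-major dense array. It does not use Spark, and `CscSketch`, `Csc`, and `fromColumnMajor` are hypothetical names chosen for illustration, not MLlib APIs.

```scala
import scala.collection.mutable.ArrayBuffer

object CscSketch {
  // Hypothetical container for the three CSC arrays.
  final case class Csc(numRows: Int, numCols: Int,
                       colPtrs: Array[Int],
                       rowIndices: Array[Int],
                       values: Array[Double])

  // Build the CSC arrays from a dense array laid out in column-major order.
  def fromColumnMajor(numRows: Int, numCols: Int, dense: Array[Double]): Csc = {
    require(dense.length == numRows * numCols,
      "array length must equal numRows * numCols")
    val colPtrs = new Array[Int](numCols + 1) // colPtrs(0) stays 0
    val rowIndices = ArrayBuffer.empty[Int]
    val values = ArrayBuffer.empty[Double]
    for (j <- 0 until numCols) {
      for (i <- 0 until numRows) {
        val v = dense(i + j * numRows)
        if (v != 0.0) {
          rowIndices += i // row of this non-zero entry
          values += v     // the non-zero value itself
        }
      }
      // Number of non-zeros seen up to and including column j.
      colPtrs(j + 1) = values.length
    }
    Csc(numRows, numCols, colPtrs, rowIndices.toArray, values.toArray)
  }

  def main(args: Array[String]): Unit = {
    // 3 x 2 matrix with rows (9, 0), (0, 8), (0, 6), given in column-major order.
    val csc = fromColumnMajor(3, 2, Array(9.0, 0.0, 0.0, 0.0, 8.0, 6.0))
    println(csc.colPtrs.mkString(", "))    // 0, 1, 3
    println(csc.rowIndices.mkString(", ")) // 0, 1, 2
    println(csc.values.mkString(", "))     // 9.0, 8.0, 6.0
  }
}
```

Column j owns the slice of `rowIndices` and `values` from `colPtrs(j)` up to, but not including, `colPtrs(j + 1)`, so an empty column simply repeats the previous pointer.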
