Feature vectors from images

The first challenge is to convert images into numerical feature vectors so that we can train our k-means clustering model. In our case, we will be using grayscale MRI scans. In general, a grayscale image can be thought of as a matrix of pixel-intensity values between 0 (black) and 1 (white), as illustrated in Figure 5.3:

Figure 5.3: Grayscale image mapped to a matrix of pixel-intensity values
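
To make this mapping concrete, here is a minimal sketch in plain Python. It assumes the raw image stores 8-bit pixel values (0 to 255), which is a common convention but an assumption here. The tiny 2 x 3 image and the helper name `to_intensity_matrix` are hypothetical and used only for illustration.

```python
# Hypothetical 2 x 3 grayscale image stored as 8-bit values
# (assumption: 0 = black, 255 = white).
raw_pixels = [
    [0, 128, 255],
    [64, 192, 32],
]


def to_intensity_matrix(pixels, max_value=255):
    """Scale raw pixel values into the range [0.0, 1.0]."""
    return [[value / max_value for value in row] for row in pixels]


# Each entry is now a pixel intensity between 0 (black) and 1 (white).
intensity = to_intensity_matrix(raw_pixels)
```

After scaling, the matrix keeps the shape of the image: one row per pixel row and one column per pixel column.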

The dimensions of the resulting matrix are equal to the height (m) and width (n) of the original image in pixels. The input into our k-means clustering model will therefore be (m x n) observations across one independent variable, the ...
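
The reshaping described above can be sketched as follows. This is an illustrative example and not necessarily the book's own code: it assumes each observation's single feature is the pixel's intensity value and that pixels are read in row-major order. The sample matrix and the helper name `to_feature_vectors` are hypothetical.

```python
def to_feature_vectors(intensity_matrix):
    """Flatten an m x n matrix into m * n single-feature observations.

    Pixels are visited row by row (row-major order); each observation is a
    one-element list holding that pixel's intensity.
    """
    return [[value] for row in intensity_matrix for value in row]


# Hypothetical 2 x 3 intensity matrix with values already scaled to [0, 1].
intensity_matrix = [
    [0.0, 0.5, 1.0],
    [0.25, 0.75, 0.125],
]

m = len(intensity_matrix)      # image height in pixels
n = len(intensity_matrix[0])   # image width in pixels

# m * n observations, each with exactly one feature.
features = to_feature_vectors(intensity_matrix)
```

In a Spark pipeline, each one-element list would typically be wrapped in a vector type (for example with `pyspark.ml.linalg.Vectors.dense`) before being passed to the clustering model.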
