R Data Analysis Cookbook - Second Edition by Kuntal Ganguly

How it works...

In this recipe, we use K-means clustering to group the protein intake data, setting k=4 so that four clusters are created. The output of the fitted model shows the size of each cluster, the means of the four clusters, the cluster vector (one cluster label per data point), the within-cluster sum of squares for each cluster, and the other available components.
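As a rough illustration of the components described above, here is a minimal sketch using base R's `kmeans`. The protein intake file is not reproduced here, so the built-in `USArrests` dataset stands in for it; the dataset, the scaling step, the seed, and the `nstart` value are all assumptions made for this example rather than the recipe's own settings.

```r
# Stand-in data: USArrests ships with R and is used here only because
# the recipe's protein intake data is not available (assumption).
data(USArrests)

# Scale the variables so that no single one dominates the distances
# (an assumption; the recipe may preprocess the data differently).
scaled <- scale(USArrests)

# Fix the seed because K-means starts from random centers.
set.seed(42)

# Fit K-means with k = 4; nstart = 25 tries several random starts
# and keeps the best solution.
fit <- kmeans(scaled, centers = 4, nstart = 25)

# Components discussed in the text:
fit$size       # number of data points in each cluster
fit$centers    # cluster means, one row per cluster
fit$cluster    # cluster label assigned to each data point
fit$withinss   # within-cluster sum of squares, one value per cluster

# Other available components include the totals:
fit$tot.withinss
fit$betweenss
```

Printing `fit` on its own gives a summary of the same information in one place.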

Before visualizing the clusters, note that the data contains more than two variables (that is, it is multidimensional), so it is not obvious which variables to use for the x and y coordinates of the scatter plot. A general solution in this scenario would be to perform PCA (discussed in the final recipe of this chapter) and plot data ...
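One common way to apply PCA for this purpose is to project the data onto its first two principal components and color the points by cluster. The sketch below does that with base R's `prcomp` and `plot`; it is not necessarily the recipe's own code. As before, `USArrests` is a stand-in for the protein intake data, and the scaling, seed, and plotting choices are assumptions.

```r
# Stand-in data and clustering (assumptions, see the note above).
data(USArrests)
scaled <- scale(USArrests)
set.seed(42)
fit <- kmeans(scaled, centers = 4, nstart = 25)

# PCA on the already-scaled data.
pca <- prcomp(scaled)

# Keep the scores on the first two principal components; these
# become the x and y coordinates of the scatter plot.
scores <- pca$x[, 1:2]

# Scatter plot of the data in PC space, colored by cluster label.
plot(scores,
     col  = fit$cluster,
     pch  = 19,
     xlab = "PC1",
     ylab = "PC2",
     main = "K-means clusters on the first two principal components")

# Label each point with its row name for readability.
text(scores, labels = rownames(USArrests), pos = 3, cex = 0.6)
```

Because the two components are chosen to capture as much variance as possible, the plot gives a reasonable two-dimensional view of how well the clusters separate, although some structure in the remaining components is necessarily lost.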