K-means cluster analysis of the Old Faithful data

Because the charts suggest two roughly spherical clusters, you could try k-means clustering with k tentatively set to 2.

Here is SPSS code for k-means analysis of the standardized input variables:

QUICK CLUSTER Zeruption Zwaiting
  /MISSING=LISTWISE
  /CRITERIA=CLUSTER(2) MXITER(10) CONVERGE(0)
  /METHOD=KMEANS(NOUPDATE)
  /SAVE CLUSTER
  /PRINT INITIAL ANOVA CLUSTER DISTAN.

Key elements of the SPSS code are:

  • The input variables, Zeruption and Zwaiting, are the standardized forms of eruption and waiting
  • /CRITERIA specifies two clusters and keeps the default values for the maximum number of iterations (10) and the convergence criterion (0)
  • /SAVE specifies that cluster membership should be saved in a new variable that will be added to the active file
  • /PRINT ...
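
To make the mechanics behind this command more concrete, here is a minimal Python sketch of batch k-means on two standardized variables. It is not SPSS's implementation. The function names, the starting centers, and the data values are all assumptions made for illustration, and the values are invented rather than taken from the Old Faithful measurements. The batch update is meant to mirror KMEANS(NOUPDATE), and max_iter and tol loosely correspond to MXITER and CONVERGE. SPSS selects initial centers and scales its convergence criterion by its own rules, which this sketch does not reproduce.

```python
import math
import statistics


def standardize(values):
    """Return z-scores using the sample standard deviation (n - 1)."""
    mean = statistics.mean(values)
    sd = statistics.stdev(values)
    return [(v - mean) / sd for v in values]


def kmeans(points, centers, max_iter=10, tol=0.0):
    """Batch k-means: assign every point first, then recompute all centers."""
    centers = [tuple(c) for c in centers]
    labels = []
    for _ in range(max_iter):
        # Assignment step: each point goes to its nearest center (Euclidean).
        labels = [
            min(range(len(centers)), key=lambda j: math.dist(p, centers[j]))
            for p in points
        ]
        # Update step: each center becomes the mean of its assigned points.
        new_centers = []
        for j, old in enumerate(centers):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                new_centers.append(
                    tuple(sum(coord) / len(members) for coord in zip(*members))
                )
            else:
                new_centers.append(old)  # an empty cluster keeps its old center
        # Stop once no center has moved by more than tol.
        shift = max(math.dist(a, b) for a, b in zip(centers, new_centers))
        centers = new_centers
        if shift <= tol:
            break
    return labels, centers


# Invented illustrative values, NOT the Old Faithful measurements.
eruption = [1.8, 2.0, 1.9, 2.2, 4.3, 4.5, 4.1, 4.6]
waiting = [54, 51, 55, 58, 80, 85, 78, 88]

points = list(zip(standardize(eruption), standardize(waiting)))

# Hypothetical starting centers: the first and last cases.
labels, centers = kmeans(points, [points[0], points[-1]])
print(labels)
print(centers)
```

Standardizing first puts both variables on the same scale, so the Euclidean distances are not dominated by waiting, which has the larger raw range in these invented values. The returned labels play the role of the membership variable that /SAVE CLUSTER adds to the active file.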
