Up to this point, we have seen how to cluster genetic variants. We have also used the Elbow method to find the optimal value of k, the tentative number of clusters. Now we should turn to the other task that we planned at the beginning: ethnicity prediction.
In the previous K-means section, we prepared a Spark DataFrame named schemaDF. It cannot be used with H2O directly, so an additional conversion is necessary. We use the asH2OFrame() method to convert the Spark DataFrame into an H2O frame:
val dataFrame = h2oContext.asH2OFrame(schemaDF)
One important thing to remember when using H2O is that if you do not convert the label column to categorical, it will treat the classification task ...