Using H2O for ethnicity prediction

Up to this point, we have seen how to cluster genetic variants. We have also used the Elbow method to find the optimal value of k, the tentative number of clusters. Now we turn to the other task we planned at the beginning: ethnicity prediction.

In the previous K-means section, we prepared a Spark DataFrame named schemaDF. H2O cannot use that DataFrame directly, so an additional conversion is necessary. We use the asH2OFrame() method to convert the Spark DataFrame into an H2O frame:

val dataFrame = h2oContext.asH2OFrame(schemaDF)
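The line above assumes that an h2oContext value already exists. As a minimal sketch of how the pieces might fit together, the following assumes Sparkling Water 2.x, where H2OContext.getOrCreate accepts a SparkSession (newer releases changed this entry point). The object name H2OConversionSketch, the helper toH2OFrame, and the tiny stand-in DataFrame with columns f1, f2, and label are hypothetical and only illustrate the conversion step:

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.h2o.{H2OContext, H2OFrame}

object H2OConversionSketch {

  // Hypothetical helper: starts (or reuses) the H2O context on top of the
  // Spark session and converts a Spark DataFrame into an H2O frame.
  def toH2OFrame(spark: SparkSession, df: DataFrame): H2OFrame = {
    // Assumption: Sparkling Water 2.x signature, taking a SparkSession.
    val h2oContext = H2OContext.getOrCreate(spark)
    h2oContext.asH2OFrame(df)
  }

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("H2OConversionSketch")
      .master("local[*]")
      .getOrCreate()
    import spark.implicits._

    // Hypothetical stand-in for schemaDF; the real one comes from the
    // K-means section and holds the genetic-variant features and the label.
    val schemaDF = Seq((0.0, 1.0, "A"), (1.0, 0.0, "B")).toDF("f1", "f2", "label")

    val dataFrame = toH2OFrame(spark, schemaDF)
    println(s"${dataFrame.numRows()} rows, ${dataFrame.numCols()} columns")

    spark.stop()
  }
}
```

Wrapping the call in a helper keeps the context creation and the conversion in one place, so the rest of the pipeline only deals with the resulting H2O frame.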

Now, one important thing you should remember while using H2O is that if you do not convert the label column into categorical, it will treat the classification task ...
