Apache Spark for Data Science Cookbook by Padma Priya Chitturi

Chapter 7. Working with Sparkling Water - H2O

In this chapter, you will learn the following recipes:

  • Working with H2O on Spark
      ◦ Downloading and installing H2O
      ◦ Using H2O API in Spark
  • Implementing k-means using H2O over Spark
  • Implementing spam detection with Sparkling Water
  • Deep learning with airlines and weather data
  • Implementing a crime detection application
  • Running SVM with H2O over Spark

Introduction

H2O is a fast, scalable, open-source machine learning and deep learning library for building smarter applications. Using in-memory compression, H2O can hold billions of data rows in memory, even on a small cluster. To support complete analytic workflows, H2O's platform includes interfaces for R, Python, Scala, Java, JSON and CoffeeScript/JavaScript flows, ...