O'Reilly logo

HBase High Performance Cookbook by Ruchir Choudhry

Stay ahead with the world's most comprehensive technology and business learning platform.

With Safari, you learn the way you learn best. Get unlimited access to videos, live online training, learning paths, books, tutorials, and more.

Start Free Trial

No credit card required

Real-time data analysis using Hbase and Mahout

The data, which is readily available using different data streams, needs to be parsed and converted to a meaningful format, which essentially creates value for the business. Hbase integration with Mahout provides the stream that allows us to do clustering in machine learning, which is a way of programmatically orchestrating similar sets of data in a more organized and meaningful way.

Let's say that you have a fruit tasting session where you have different types of fruits, and you want to group people who liked apples, bananas, watermelons, and so on. A small set of data will look staggered, so we have to create some sort of algorithm to do it programmatically. Here is how the data will look before ...

With Safari, you learn the way you learn best. Get unlimited access to videos, live online training, learning paths, books, interactive tutorials, and more.

Start Free Trial

No credit card required