Hadoop MapReduce v2 Cookbook - Second Edition by Thilina Gunarathne

Chapter 7. Hadoop Ecosystem II – Pig, HBase, Mahout, and Sqoop

In this chapter, we will cover the following topics:

  • Getting started with Apache Pig
  • Joining two datasets using Pig
  • Accessing a Hive table data in Pig using HCatalog
  • Getting started with Apache HBase
  • Data random access using Java client APIs
  • Running MapReduce jobs on HBase
  • Using Hive to insert data into HBase tables
  • Getting started with Apache Mahout
  • Running K-means with Mahout
  • Importing data to HDFS from a relational database using Apache Sqoop
  • Exporting data from HDFS to a relational database using Apache Sqoop

Introduction

The Hadoop ecosystem is a family of projects that are either built on top of Hadoop or work very closely with it. These projects have given rise to an ecosystem that focuses ...
