Hadoop MapReduce v2 Cookbook - Second Edition by Thilina Gunarathne

Performing a join with Hive

This recipe shows how to use Hive to perform a join across two datasets. The first is the book details dataset of the Book-Crossing database, and the second is the reviewer ratings for those books. The recipe uses Hive to find the authors with the largest number of ratings above 3 stars.
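The join described above could be sketched in HiveQL roughly as follows. This is a minimal illustration, not the recipe's own query: the table names (books, ratings) and column names (isbn, author, rating) are assumptions made for this sketch, and the LIMIT value is arbitrary.

```sql
-- Hypothetical schema, assumed for illustration only:
--   books(isbn STRING, author STRING, ...)
--   ratings(isbn STRING, rating INT, ...)

-- Join each rating to its book on the shared ISBN key,
-- keep only ratings above 3, then count them per author.
SELECT b.author AS author,
       COUNT(*) AS rating_count
FROM books b
JOIN ratings r ON (b.isbn = r.isbn)
WHERE r.rating > 3
GROUP BY b.author
ORDER BY rating_count DESC
LIMIT 10;
```

Because this is an inner join, books with no matching rating rows and ratings with no matching book are both dropped before the counts are computed.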

Getting ready

Complete the previous recipe, Hive batch mode – using a query file, before starting this one.

How to do it...

This section demonstrates how to perform a join using Hive. Proceed with the following steps:

  1. Start the Hive CLI and use the Book-Crossing database:
    $ hive
    hive> USE bookcrossing;
    
  2. Create the books and book ratings tables by executing the create-book-crossing.hql Hive query file after referring to the previous ...
