Apache Hive Cookbook

by Hanish Bansal, Saurabh Chauhan, Shrey Mehrotra
April 2016
Content level: Beginner
268 pages
5h 32m
English
Packt Publishing

Using a map-side join

In this recipe, you will learn how to use map-side joins in Hive.

When joining tables in Hive, it is common for one table to be small (in terms of rows) while the other is large. In this case, Hive can use a map-side join to produce the result efficiently. The smaller table is cached in memory while the larger table is streamed through the mappers. Each mapper performs the join itself, so no reduce phase is needed for the join. Avoiding the shuffle and reduce work usually improves performance considerably, provided the smaller table fits in memory.

How to do it…

There are two ways of using map-side joins in Hive.

One is to use the /*+ MAPJOIN(<table_name>) */ hint immediately after the SELECT keyword. table_name has to be the table that is smaller ...
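
As an illustration of the hint described above, here is a minimal sketch in HiveQL. The table names (sales, stores), their columns, and the aliases are hypothetical and not taken from the book. The SET line is an assumption about your environment: many Hive releases ignore join hints unless hive.ignore.mapjoin.hint is set to false.

```sql
-- Hypothetical tables: sales (large fact table) and stores (small lookup table).
-- Assumption: hints are ignored by default on this cluster, so ask Hive to honor them.
SET hive.ignore.mapjoin.hint=false;

-- The hint names the alias of the smaller table, which Hive caches in memory;
-- the larger table (sales) is streamed through the mappers.
SELECT /*+ MAPJOIN(st) */
       s.order_id,
       s.amount,
       st.store_name
FROM sales s
JOIN stores st
  ON s.store_id = st.store_id;
```

Prefixing the query with EXPLAIN is one way to check whether the plan uses a map join for this query rather than a reduce-side join.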

ISBN: 9781782161080