Apache Hive Cookbook

by Hanish Bansal, Saurabh Chauhan, Shrey Mehrotra
April 2016
Content level: Beginner
268 pages
5h 32m
English
Packt Publishing

HCatalog

In this recipe, you will learn how to define tables in HCatalog.

Getting ready

HCatalog is a table and storage management layer for Hadoop that lets frameworks other than Hive read and write data through a shared data model. HCatalog tables abstract the data format in HDFS, so frameworks such as Pig and MapReduce can use the data without being concerned about whether it is stored as RCFile, ORC, or text files.

HCatInputFormat and HCatOutputFormat, which implement Hadoop's InputFormat and OutputFormat, are the interfaces provided to MapReduce jobs. Pig reaches the same tables through HCatLoader and HCatStorer, which are built on top of these two classes.
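To make the idea of format abstraction concrete, here is a toy Python model. It is not HCatalog's API: the `CATALOG` and `STORAGE` dictionaries, the `read_table` function, and the tab-separated and JSON formats are all hypothetical stand-ins for the metastore, HDFS files, and formats such as RCFile or ORC. The point it sketches is that a caller asks for a table by name and gets uniform records back, whatever the storage format.

```python
import csv
import io
import json

# Hypothetical "metastore": table name -> column names and storage format.
CATALOG = {
    "sales_text": {"columns": ["id", "amount"], "format": "tsv"},
    "sales_json": {"columns": ["id", "amount"], "format": "json"},
}

# Hypothetical stored data, standing in for files in HDFS.
STORAGE = {
    "sales_text": "1\t9.50\n2\t12.00\n",
    "sales_json": '{"id": "1", "amount": "9.50"}\n{"id": "2", "amount": "12.00"}\n',
}


def read_table(name):
    """Return the table's rows as dicts, regardless of how they are stored."""
    meta = CATALOG[name]
    raw = STORAGE[name]
    if meta["format"] == "tsv":
        # Tab-separated text: pair each field with its column name.
        reader = csv.reader(io.StringIO(raw), delimiter="\t")
        return [dict(zip(meta["columns"], row)) for row in reader]
    if meta["format"] == "json":
        # One JSON object per line: keys already carry the column names.
        return [json.loads(line) for line in raw.splitlines() if line]
    raise ValueError(f"unsupported format: {meta['format']}")
```

Calling `read_table("sales_text")` and `read_table("sales_json")` yields the same list of records, which is the property the consuming framework relies on.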

How to do it…

Data is defined using the HCatalog CLI. Data is modeled as tables, and tables are stored in databases. A table can be partitioned on one or more keys.
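As a rough sketch of what such a definition might look like, the Python helper below assembles a Hive-style `CREATE TABLE` statement and the argument list for running it with the HCatalog CLI's `-e` option. The database, table, and column names (`sales_db`, `orders`, `id`, `amount`, `order_date`) and the helper functions themselves are hypothetical, chosen only for illustration; the sketch builds the command but does not run it, and it assumes `sales_db` already exists.

```python
import shlex


def create_table_ddl(database, table, columns, partition_keys=None, stored_as="TEXTFILE"):
    """Build a CREATE TABLE statement from (name, type) pairs."""
    cols = ", ".join(f"{name} {ctype}" for name, ctype in columns)
    ddl = f"CREATE TABLE {database}.{table} ({cols})"
    if partition_keys:
        # Partition keys are declared separately from the regular columns.
        parts = ", ".join(f"{name} {ctype}" for name, ctype in partition_keys)
        ddl += f" PARTITIONED BY ({parts})"
    ddl += f" STORED AS {stored_as};"
    return ddl


def hcat_command(ddl):
    """Return the argument list that passes the DDL to the hcat CLI with -e."""
    return ["hcat", "-e", ddl]


# Hypothetical example: an ORC table partitioned by order date.
ddl = create_table_ddl(
    "sales_db",
    "orders",
    [("id", "INT"), ("amount", "DOUBLE")],
    partition_keys=[("order_date", "STRING")],
    stored_as="ORC",
)
# Print a shell-quoted form that could be pasted into a terminal.
print(shlex.join(hcat_command(ddl)))
```

On a machine with HCatalog installed, the list returned by `hcat_command` could be handed to `subprocess.run`; keeping the DDL construction separate from execution makes the statement easy to inspect first.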

HCatalog DMLs ...

ISBN: 9781782161080