Storing and processing Hive data in the ORC file format

I'm sure that most of the time, you would have created Hive tables and stored data in a text format; in this recipe, we are going store data in ORC files.

Getting ready

To perform this recipe, you should have a running Hadoop cluster as well as the latest version of Hive installed on it. Here, I am going to use Hive 1.2.1.

How to do it...

Hive 1.2.1 supports different types of files, which help process data in a fast manner. In this recipe, we are going to use ORC files to store data in Hive. To store the data in ORC files, we first need to create a Hive table that stores the data in a textual format. We will use the same table that we created in the first recipe.

Creating a table to store ORCFILE ...

Get Hadoop: Data Processing and Modelling now with the O’Reilly learning platform.

O’Reilly members experience books, live events, courses curated by job role, and more from O’Reilly and nearly 200 top publishers.