Storing and processing Hive data in a sequential file format

Most of the Hive tables you have created so far probably store their data in a text format. In this recipe, we are going to store data in sequential files instead (Hive calls this format SequenceFile).

Getting ready

To perform this recipe, you should have a running Hadoop cluster with a recent version of Hive installed on it. Here, I am going to use Hive 1.2.1.

How to do it...

Hive 1.2.1 supports several file formats, some of which can make data processing faster than plain text. In this recipe, we are going to use sequential files to store data in Hive. To do so, we first need to create a Hive table that stores the data in a textual format:

create table employee( id int, name string) row format delimited ...
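
The statement above is cut off in this excerpt and is left as it is. As a rough sketch of the overall pattern only, the following HiveQL stages the data in a text table and then copies it into a table stored as a SequenceFile. The table names employee_text and employee_seq, the comma delimiter, and the path /tmp/employee.csv are assumptions made for illustration and do not come from the original recipe.

```sql
-- Staging table in plain text. The comma delimiter is an assumption.
CREATE TABLE employee_text (id INT, name STRING)
ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
STORED AS TEXTFILE;

-- Load a delimited file from the local file system (hypothetical path).
LOAD DATA LOCAL INPATH '/tmp/employee.csv' INTO TABLE employee_text;

-- Table with the same columns, stored as a SequenceFile.
CREATE TABLE employee_seq (id INT, name STRING)
STORED AS SEQUENCEFILE;

-- Copy the rows across. The INSERT runs a job that rewrites the data
-- in SequenceFile format.
INSERT OVERWRITE TABLE employee_seq
SELECT id, name FROM employee_text;

-- Quick check that the rows are readable from the new table.
SELECT * FROM employee_seq LIMIT 10;
```

The reason for the staging table is that LOAD DATA only moves files into the table's directory without converting them, so a text file loaded directly into employee_seq would not be readable as a SequenceFile. Going through INSERT ... SELECT lets Hive write the data in the target format.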

Hadoop: Data Processing and Modelling by Garry Turkington, Tanmay Deshpande, Sandeep Karanth
