Programming Hive by Jason Rutherglen, Dean Wampler, Edward Capriolo

Chapter 15. Customizing Hive File and Record Formats

Hive functionality can be customized in several ways. First, there are the variables and properties that we discussed in Variables and Properties. Second, you may extend Hive using custom UDFs, or user-defined functions, which were discussed in Chapter 13. Finally, you can customize the file and record formats, which we discuss now.

File Versus Record Formats

Hive draws a clear distinction between the file format, which is how records are encoded in a file, and the record format, which is how the stream of bytes for a given record is encoded in the record.

In this book we have been using text files, with the default STORED AS TEXTFILE in CREATE TABLE statements (see Text File Encoding of Data Values), where each line in the file is a record. Most of the time those records have used the default separators, with occasional examples of data that use commas or tabs as field separators. However, a text file could contain JSON or XML “documents.”
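As a minimal sketch of what those defaults look like when written out, the following statement spells out the default text file format and the default separators explicitly. The table name employees_sketch and its columns are hypothetical, chosen only for illustration; omitting the ROW FORMAT and STORED AS clauses should yield an equivalent table.

```sql
-- Hypothetical table; the name and columns are illustrative only.
CREATE TABLE employees_sketch (
  name         STRING,
  salary       FLOAT,
  subordinates ARRAY<STRING>,
  deductions   MAP<STRING, FLOAT>
)
-- Record format: how the bytes of one record are split into fields.
ROW FORMAT DELIMITED
  FIELDS TERMINATED BY '\001'            -- Ctrl-A between fields
  COLLECTION ITEMS TERMINATED BY '\002'  -- Ctrl-B between array or map entries
  MAP KEYS TERMINATED BY '\003'          -- Ctrl-C between a map key and its value
  LINES TERMINATED BY '\n'               -- one record per line
-- File format: how records are laid out in the file.
STORED AS TEXTFILE;
```

Changing only the FIELDS TERMINATED BY value to ',' or '\t' gives the comma-separated and tab-separated variants mentioned above.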

For Hive, the file format choice is orthogonal to the record format. We’ll first discuss options for file formats, then we’ll discuss different record formats and how to use them in Hive.

Demystifying CREATE TABLE Statements

Throughout the book we have shown examples of creating tables. You may have noticed that CREATE TABLE accepts a variety of clauses, such as STORED AS SEQUENCEFILE, ROW FORMAT DELIMITED, SERDE, INPUTFORMAT, and OUTPUTFORMAT. This chapter will cover much of this syntax and give examples, ...
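To make these clauses concrete, here is a hedged sketch of how they can appear in practice. The table names and columns are hypothetical. The first statement keeps a delimited record format while switching the file format to a sequence file, which illustrates that the two choices are independent. The second names the SerDe and the input and output format classes directly; the classes shown are the ones Hive conventionally uses for plain text tables, so it should behave much like STORED AS TEXTFILE.

```sql
-- Hypothetical table: comma-delimited records stored in a sequence file.
CREATE TABLE events_seq_sketch (
  event_id   INT,
  event_name STRING
)
ROW FORMAT DELIMITED
  FIELDS TERMINATED BY ','
STORED AS SEQUENCEFILE;

-- Hypothetical table: the same kind of table with the record format (SerDe)
-- and the file format (input and output format classes) named explicitly.
CREATE TABLE events_text_sketch (
  event_id   INT,
  event_name STRING
)
ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe'
STORED AS
  INPUTFORMAT  'org.apache.hadoop.mapred.TextInputFormat'
  OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat';
```

Running DESCRIBE EXTENDED on either table is one way to check which SerDe and format classes Hive actually recorded for it.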
