Writing a custom SerDe (Intermediate)

In addition to built-in support for several file formats, Hive allows users to write their own custom serialization and deserialization code.

While exploring complex types, we briefly discussed how reading a key-value file format into a map allows for a very flexible schema. One downside of that method is that the fields of each row are hidden inside the map as keys rather than exposed as columns. In this recipe, we will write the serialization and deserialization code necessary to view these maps as normal columns of a table. This SerDe will combine the flexibility of a schema-less file format with the readability of a columnar table.
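Before touching any Hive classes, it may help to see the core idea in isolation. The following is a minimal sketch in plain Java, not the recipe's actual SerDe and not the Hive API. The class name KeyValueRowSketch, the "key=value,key=value" line format, and the column names are all assumptions made for illustration. Deserialization projects the keys of each line onto a fixed list of columns, and serialization writes only the columns that have a value.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration only: no Hive interfaces are implemented here.
public class KeyValueRowSketch {

    // Parse a line such as "id=7,name=ann" into a map (the schema-less view).
    // The ',' and '=' separators are assumptions for this sketch.
    public static Map<String, String> parseRow(String line) {
        Map<String, String> fields = new LinkedHashMap<>();
        for (String pair : line.split(",")) {
            int idx = pair.indexOf('=');
            if (idx < 0) {
                continue; // skip empty or malformed pairs
            }
            fields.put(pair.substring(0, idx), pair.substring(idx + 1));
        }
        return fields;
    }

    // Deserialize: project the map onto the table's declared columns.
    // Keys missing from the line become null; keys that are not declared
    // columns are ignored, so the file itself stays schema-less.
    public static List<String> deserialize(String line, List<String> columns) {
        Map<String, String> fields = parseRow(line);
        List<String> row = new ArrayList<>(columns.size());
        for (String column : columns) {
            row.add(fields.get(column));
        }
        return row;
    }

    // Serialize: write only the non-null columns back as key=value pairs.
    public static String serialize(List<String> row, List<String> columns) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < columns.size(); i++) {
            String value = row.get(i);
            if (value == null) {
                continue;
            }
            if (out.length() > 0) {
                out.append(',');
            }
            out.append(columns.get(i)).append('=').append(value);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        List<String> columns = Arrays.asList("id", "name", "city");
        List<String> row = deserialize("name=ann,id=7,extra=x", columns);
        System.out.println(row);                     // [7, ann, null]
        System.out.println(serialize(row, columns)); // id=7,name=ann
    }
}
```

In a real SerDe the column list would come from the table definition rather than being hard-coded, and the values would be typed rather than plain strings. This sketch leaves both out to keep the mapping itself visible.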

How to do it...

We first need to create an implementation of the SerDe class for our new file format. ...
