Chapter 8. Extensibility Considerations

Although Hive has many built-in functions, users sometimes need capabilities beyond what those functions provide. For such cases, Hive can be extended in the following three main areas:

  • User-defined function (UDF): This extends Hive with an external function (usually written in Java) that can be evaluated in HQL
  • Streaming: This plugs users' own custom mapper and reducer programs into the data stream
  • SerDe: This stands for serializer and deserializer and provides a way to read and write data stored on HDFS in a custom file format
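
To make the streaming option concrete, here is a minimal sketch of a script that Hive could call through a TRANSFORM clause. It assumes Hive's default streaming behavior: each row arrives on standard input as one line of tab-separated columns, NULL is written as \N, and the script writes its result rows to standard output in the same form. The script name upper_first.py and the function transform_line are hypothetical names chosen for this illustration.

```python
#!/usr/bin/env python3
"""Hypothetical Hive streaming script: upper-cases the first column of each row."""
import sys


def transform_line(line):
    """Upper-case the first tab-separated field and leave the others untouched."""
    fields = line.rstrip("\n").split("\t")
    # Assumption: Hive writes NULL as the literal \N, so pass it through unchanged.
    if fields[0] != "\\N":
        fields[0] = fields[0].upper()
    return "\t".join(fields)


def main(stdin=sys.stdin, stdout=sys.stdout):
    # Hive feeds rows on stdin and reads the transformed rows back from stdout.
    for line in stdin:
        stdout.write(transform_line(line) + "\n")


if __name__ == "__main__":
    main()
```

Assuming a hypothetical table employee with columns name and age, the script could be registered with ADD FILE upper_first.py; and then used as SELECT TRANSFORM(name, age) USING 'python3 upper_first.py' AS (name, age) FROM employee;. Because the script only reads and writes text lines, it can be tested locally by piping a sample file through it before it runs on a cluster.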

In this chapter, we'll talk about each of these three areas in more detail.

User-defined functions

Hive defines ...
