Chapter 8. Extensibility Considerations

Although Hive has many built-in functions, users sometimes need capabilities beyond what those functions provide. For these cases, Hive can be extended in the following three main areas:

  • User-defined function (UDF): This extends Hive with an external function (mainly written in Java) that can be evaluated in HQL
  • Streaming: This plugs users' own customized mapper and reducer programs into the data stream
  • SerDe: This stands for serializer and deserializer, and provides a way to serialize or deserialize data stored on HDFS in a custom file format

In this chapter, we'll talk about each of them in more detail.
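As a first taste of the streaming option, here is a minimal sketch of a script that Hive could call through its TRANSFORM clause. It assumes the default streaming behavior, where Hive sends each row to the script on standard input as one line with tab-separated columns and reads rows back from standard output in the same form. The file name upper_first.py, the function names, and the choice to upper-case the first column are all hypothetical and used only for illustration.

```python
import sys


def transform_line(line):
    """Upper-case the first column of one tab-separated row (illustrative logic)."""
    fields = line.rstrip("\n").split("\t")
    fields[0] = fields[0].upper()
    return "\t".join(fields)


def main(stdin=sys.stdin, stdout=sys.stdout):
    # Assumption: rows arrive on stdin, one per line, with columns separated
    # by tabs, and rows are written back to stdout in the same layout.
    for line in stdin:
        stdout.write(transform_line(line) + "\n")


if __name__ == "__main__":
    main()
```

Assuming a hypothetical table named employee with columns name and age, the script could be registered with ADD FILE upper_first.py; and then called with SELECT TRANSFORM (name, age) USING 'python upper_first.py' AS (name, age) FROM employee;. Keeping the per-row logic in its own function makes it easy to check outside of Hive before plugging it into a query.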

User-defined functions

Hive defines ...
