UDF, UDAF, and UDTF

As in Pig, UDFs are one of the most important extensibility features in Hive. Writing a UDF is simpler in Hive, but the interfaces do not declare every method that must be implemented for the UDF to be complete. The reason is that a UDF can take any number of parameters, which makes a fixed interface difficult to provide. Instead, Hive uses Java reflection when executing the UDF to determine the parameter list for the function.
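
To make the reflection idea concrete, here is a minimal, self-contained sketch in plain Java. It is not Hive's actual resolution code and uses no Hive classes; the names ReflectiveUdfDemo, Concat, and call are hypothetical and exist only for illustration. It shows how a caller can pick one of several evaluate() overloads at run time, even though no interface declares that method.

```java
import java.lang.reflect.Method;

public class ReflectiveUdfDemo {

    // Hypothetical UDF-style class (an assumption for this sketch).
    // No interface declares evaluate(), so overloads with different
    // parameter lists can coexist.
    public static class Concat {
        public String evaluate(String a) { return a; }
        public String evaluate(String a, String b) { return a + b; }
        public String evaluate(String a, Integer n) { return a + n; }
    }

    // Look for a public evaluate() overload whose parameter types accept
    // the given arguments, then invoke it reflectively.
    public static Object call(Object udf, Object... args) throws Exception {
        for (Method m : udf.getClass().getMethods()) {
            if (!m.getName().equals("evaluate")) continue;
            Class<?>[] params = m.getParameterTypes();
            if (params.length != args.length) continue;
            boolean matches = true;
            for (int i = 0; i < params.length; i++) {
                // A null argument is treated as compatible with any parameter.
                if (args[i] != null && !params[i].isInstance(args[i])) {
                    matches = false;
                    break;
                }
            }
            if (matches) return m.invoke(udf, args);
        }
        throw new NoSuchMethodException("no matching evaluate() overload");
    }

    public static void main(String[] args) throws Exception {
        Concat udf = new Concat();
        System.out.println(call(udf, "a"));        // one String argument
        System.out.println(call(udf, "a", "b"));   // two String arguments
        System.out.println(call(udf, "a", 7));     // String and Integer
    }
}
```

The sketch matches overloads by argument count and run-time type only. A real engine would also need to handle type conversion and ambiguous matches, which are left out here.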

Hive has the following three kinds of UDFs:

  • Regular UDFs: These take a single row as input and produce a single row after applying the custom logic.
  • UDAFs: These are aggregators that take multiple rows as input but output a single row. SUM and COUNT are examples of built-in UDAFs.
  • UDTFs: These are ...
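
The difference in row cardinality between a regular UDF and a UDAF can be sketched in plain Java as follows. This is an illustration of the two shapes only, not Hive's API; the names UdfKindsDemo, upper, SumAggregator, applyUdf, and applyUdaf are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

public class UdfKindsDemo {

    // Regular-UDF shape: one input row produces one output row.
    public static String upper(String row) {
        return row == null ? null : row.toUpperCase(Locale.ROOT);
    }

    // UDAF shape: state is accumulated across many rows and a single
    // result is produced at the end (a SUM-like aggregator).
    public static class SumAggregator {
        private long total = 0;
        public void accept(long value) { total += value; }
        public long result() { return total; }
    }

    // Applying the regular UDF keeps the number of rows unchanged.
    public static List<String> applyUdf(List<String> rows) {
        List<String> out = new ArrayList<>();
        for (String row : rows) out.add(upper(row));
        return out;
    }

    // Applying the aggregator collapses all rows into one value.
    public static long applyUdaf(List<Long> rows) {
        SumAggregator agg = new SumAggregator();
        for (long value : rows) agg.accept(value);
        return agg.result();
    }

    public static void main(String[] args) {
        System.out.println(applyUdf(Arrays.asList("a", "b", "c")));  // three rows in, three rows out
        System.out.println(applyUdaf(Arrays.asList(1L, 2L, 3L)));    // three rows in, one value out
    }
}
```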
