As in Pig, UDFs are one of the most important extensibility features in Hive. Writing a UDF in Hive is simpler, but the interfaces do not declare every method that a complete UDF must implement. Because a UDF can take any number of parameters, a fixed interface is difficult to provide. Instead, Hive uses Java reflection when executing the UDF to determine the function's parameter list.
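
To make the reflection idea concrete, the following is a minimal sketch in plain Java. It does not use Hive's own classes or reproduce Hive's actual method resolver. The class names `ReflectiveUdfSketch`, `ToUpper`, and `Concat`, and the helper `call`, are hypothetical and exist only for illustration. The sketch assumes the convention that a UDF exposes one or more methods named `evaluate`, and shows how a caller can discover the matching parameter list at runtime without any interface declaring it.

```java
import java.lang.reflect.Method;

public class ReflectiveUdfSketch {

    // Hypothetical UDF-like class: no interface declares evaluate().
    public static class ToUpper {
        public String evaluate(String s) {
            return s == null ? null : s.toUpperCase();
        }
    }

    // Another hypothetical UDF-like class with a different parameter list.
    public static class Concat {
        public String evaluate(String a, String b) {
            return a + b;
        }
    }

    // Looks for a public method named "evaluate" whose parameter count and
    // types accept the supplied arguments, then invokes it reflectively.
    // Simplification: only reference-typed parameters are handled.
    public static Object call(Object udf, Object... args) {
        for (Method m : udf.getClass().getMethods()) {
            if (!m.getName().equals("evaluate")) {
                continue;
            }
            Class<?>[] params = m.getParameterTypes();
            if (params.length != args.length) {
                continue;
            }
            boolean matches = true;
            for (int i = 0; i < params.length; i++) {
                if (args[i] != null && !params[i].isInstance(args[i])) {
                    matches = false;
                    break;
                }
            }
            if (matches) {
                try {
                    return m.invoke(udf, args);
                } catch (ReflectiveOperationException e) {
                    throw new IllegalStateException("evaluate() failed", e);
                }
            }
        }
        throw new IllegalArgumentException(
            "no matching evaluate() on " + udf.getClass().getName());
    }

    public static void main(String[] args) {
        System.out.println(call(new ToUpper(), "hive"));
        System.out.println(call(new Concat(), "map", "reduce"));
    }
}
```

The two classes share no common method signature, yet the same `call` helper can drive both. This is the flexibility that a fixed interface would not offer, at the cost of moving signature errors from compile time to run time.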

Hive has the following three kinds of UDFs:

  • Regular UDFs: These take in a single row and produce a single row after applying the custom logic.
  • UDAFs: These are aggregators that take in multiple rows but output a single row. SUM and COUNT are examples of built-in UDAFs.
  • UDTFs: These are ...
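
The difference in row cardinality between the first two kinds can be sketched in plain Java, independent of Hive's API. The class `UdfKindsSketch` and its methods are hypothetical names chosen for illustration: `lengthPerRow` mimics the one-row-in, one-row-out shape of a regular UDF, and `sum` mimics the many-rows-in, one-row-out shape of a UDAF such as SUM.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

public class UdfKindsSketch {

    // Regular UDF shape: each input row yields exactly one output row.
    public static List<Integer> lengthPerRow(List<String> rows) {
        return rows.stream().map(String::length).collect(Collectors.toList());
    }

    // UDAF shape: many input rows collapse into a single output value.
    public static long sum(List<Integer> rows) {
        long total = 0;
        for (int value : rows) {
            total += value;
        }
        return total;
    }

    public static void main(String[] args) {
        List<Integer> lengths = lengthPerRow(Arrays.asList("pig", "hive"));
        System.out.println(lengths);      // one result per input row
        System.out.println(sum(lengths)); // one result for all rows
    }
}
```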
