Apache Hive Cookbook

by Hanish Bansal, Saurabh Chauhan, Shrey Mehrotra
April 2016
Content level: Beginner
268 pages
5h 32m
English
Packt Publishing

Using the built-in User-defined Aggregation Function (UDAF)

Hive provides a set of built-in functions for aggregating data. Each function operates on a group of rows and returns a single summarized result for that group.
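To make "operates on rows" concrete, here is a minimal sketch in HiveQL. The table `sales` and its columns `region` and `amount` are hypothetical names chosen for illustration; they do not come from the book.

```sql
-- Assumes a hypothetical table sales with at least these columns:
--   region STRING, amount DOUBLE

-- Without GROUP BY, all rows form a single group,
-- so the query returns exactly one row.
SELECT avg(amount) AS overall_avg
FROM sales;

-- With GROUP BY, the rows are split into one group per region,
-- and the aggregate is computed once for each group.
SELECT region, avg(amount) AS region_avg
FROM sales
GROUP BY region;
```

In both queries the aggregate collapses many input rows into one output value; the GROUP BY clause only decides how the rows are grouped first.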

How to do it…

The built-in functions can be used directly in a query. The following are some of the aggregate functions available in Hive:

Function Name     | Return Type | Description
------------------|-------------|------------------------------------------------------------------------------
avg(col)          | DOUBLE      | It is used to calculate the average of all values of a particular column.
avg(DISTINCT col) | DOUBLE      | It is used to calculate the average of unique values of a particular column.
collect_list(col) | ARRAY       | It will return a list of all values of a particular column in an array.
collect_set(col)  | ARRAY       | It will return ...
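The functions listed above can be combined in a single query. The following is a sketch rather than the book's own example: the table `sales`, its columns, and the sample rows are all hypothetical, and the `INSERT ... VALUES` form assumes Hive 0.14 or later.

```sql
-- Hypothetical table and data, used only for illustration.
CREATE TABLE IF NOT EXISTS sales (
  region  STRING,
  product STRING,
  amount  DOUBLE
);

INSERT INTO TABLE sales VALUES
  ('east', 'pen', 10.0),
  ('east', 'pen', 10.0),
  ('east', 'ink', 40.0),
  ('west', 'pen', 20.0);

-- One output row per region.
SELECT
  region,
  avg(amount)           AS avg_amount,          -- average over every row in the group
  avg(DISTINCT amount)  AS avg_distinct_amount, -- average over unique amounts only
  collect_list(product) AS all_products,        -- array that keeps duplicates
  collect_set(product)  AS unique_products      -- array with duplicates removed
FROM sales
GROUP BY region;
```

With this sample data, the `east` group has an `avg_amount` of 20.0 (three rows: 10, 10, 40) and an `avg_distinct_amount` of 25.0 (two unique values: 10, 40). The order of elements inside the returned arrays is not guaranteed.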


Publisher Resources

ISBN: 9781782161080