Grouping and counting column values with Hive
Once a schema is defined in Hive, many different ad hoc queries can be run against it. For each submitted query, Hive generates a plan that may consist of one or more MapReduce jobs. This recipe shows how to group and count the values of a specified column using Hive.
How to do it...
- Insert some entries, ensuring that the 'favorite_movie' column is populated:
  [default@ks33] set cf33['ed']['favorite_movie']='memento';
  [default@ks33] set cf33['stacey']['favorite_movie']='drdolittle';
  [default@ks33] set cf33['bob']['favorite_movie']='memento';
- Create an HQL query that will count values of the favorite_movie column and then order the counts in ascending order:
  hive> SELECT favorite_movie,count(1) as ...
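To make the expected result of the group-and-count step concrete, the following is a minimal Python sketch of the same logic applied to the three entries inserted in the first step. It is only an illustration of what the query computes, not how Hive executes it; the `rows` list and the `group_and_count` function are hypothetical names introduced here, and the tie-breaking by movie name is an assumption added to keep the output deterministic.

```python
from collections import Counter

# Rows mirror the three entries set in the first step: (row key, favorite_movie).
rows = [
    ("ed", "memento"),
    ("stacey", "drdolittle"),
    ("bob", "memento"),
]


def group_and_count(rows):
    """Count rows per favorite_movie value, then sort by count ascending."""
    counts = Counter(movie for _key, movie in rows)
    # Sort by count; ties are broken by movie name so the order is stable.
    return sorted(counts.items(), key=lambda item: (item[1], item[0]))


result = group_and_count(rows)
for movie, count in result:
    print(movie, count)
# prints:
# drdolittle 1
# memento 2
```

With the sample data, 'drdolittle' appears once and 'memento' twice, so ordering the counts in ascending order lists 'drdolittle' first.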