Another aspect of manipulating data in Hive is ordering or sorting data or result sets so that important facts, such as the top N values, the maximum, and the minimum, are easy to identify.
The following keywords are used in Hive to order and sort data:
ORDER BY (ASC|DESC): This is similar to the RDBMS ORDER BY statement. It maintains a sorted order across the output of all reducers by performing a global sort with only one reducer, so it takes longer to return the result. Using LIMIT with ORDER BY is strongly recommended. When hive.mapred.mode = strict is set (the default is hive.mapred.mode = nonstrict) and LIMIT is not specified, the query fails with an exception. ORDER BY can be used as follows:
jdbc:hive2://> SELECT name ...
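As a minimal sketch of the strict-mode behavior described above, the following HiveQL contrasts an ORDER BY without LIMIT against one with LIMIT. The table employee_demo and the limit of 10 are hypothetical and chosen only for illustration; the sketch also assumes a Hive version in which hive.mapred.mode is still honored.

```sql
-- Hypothetical table name (employee_demo); not taken from the original example.
-- Switch the session to strict mode for this demonstration.
SET hive.mapred.mode=strict;

-- Rejected in strict mode because ORDER BY has no LIMIT:
-- SELECT name FROM employee_demo ORDER BY name DESC;

-- Accepted in strict mode: LIMIT caps the number of rows returned
-- by the globally sorted result.
SELECT name
FROM employee_demo
ORDER BY name DESC
LIMIT 10;
```

Pairing ORDER BY with LIMIT fits the top N use case mentioned earlier, since only the first few rows of the global sort are needed.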