Apache Hive Cookbook

by Hanish Bansal, Saurabh Chauhan, Shrey Mehrotra
April 2016
Content level: Beginner
268 pages
5h 32m
English
Packt Publishing

Chapter 8. Statistics in Hive

In previous chapters, you learned about the different types of joins in Hive and the optimizations available for them.

In this chapter, we will cover the following recipes in detail:

  • Bringing statistics in to Hive
  • Table and partition statistics in Hive
  • Column statistics in Hive
  • Top K statistics in Hive

Bringing statistics in to Hive

Statistics such as the number of records in a table or partition, or the histogram of a column, are important in Hive, chiefly because they help in query optimization. Statistical data is an input to the cost functions of the query optimizer, which use it to compare different execution plans. Statistics also help users directly: answers to some of the most frequently queried data can be stored, which avoids running a long execution plan each time a query is ...
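As a rough sketch of how such statistics are typically gathered and used in HiveQL, consider the statements below. The table `sales`, its partition column `ds`, and the columns `customer_id` and `amount` are hypothetical names chosen for illustration and do not come from the book; exact behavior depends on the Hive version and configuration.

```sql
-- Hypothetical table, assumed to exist already:
--   sales(customer_id INT, amount DOUBLE) PARTITIONED BY (ds STRING)

-- Gather basic statistics (row count, number of files, total size)
-- for the table and every one of its partitions.
ANALYZE TABLE sales PARTITION (ds) COMPUTE STATISTICS;

-- Gather column-level statistics for selected columns of one partition.
ANALYZE TABLE sales PARTITION (ds='2016-04-01')
  COMPUTE STATISTICS FOR COLUMNS customer_id, amount;

-- Inspect the stored statistics (for example numRows and totalSize)
-- in the partition parameters.
DESCRIBE FORMATTED sales PARTITION (ds='2016-04-01');

-- Let Hive try to answer simple aggregates from stored statistics
-- instead of scanning the data.
SET hive.compute.query.using.stats=true;
SELECT COUNT(*) FROM sales;
```

With up-to-date statistics and `hive.compute.query.using.stats` enabled, a query such as the final `COUNT(*)` may be answered from the metastore without launching a job, which is the kind of saving the paragraph above describes.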


ISBN: 9781782161080