Table and partition statistics in Hive

The first milestone in Hive's statistics support was the collection of table-level and partition-level statistics. Like other metadata, these statistics are stored in the configured metastore. They are supported for both existing and newly created tables. The following statistics are currently supported for tables and partitions:

  • The number of rows
  • The number of files
  • Size in bytes
  • Max, min, and average row sizes
  • Max, min, and average file sizes
  • The number of partitions (in the case of tables)
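
As a minimal sketch of how such statistics can be gathered and inspected, the following HiveQL uses a hypothetical partitioned table named sales (the table, its columns, and the partition value are assumptions made for illustration and do not come from this recipe). It relies on the hive.stats.autogather property, the ANALYZE TABLE ... COMPUTE STATISTICS statement, and DESCRIBE FORMATTED. INSERT ... VALUES assumes Hive 0.14 or later.

```sql
-- Hypothetical table, used only for illustration.
CREATE TABLE IF NOT EXISTS sales (id INT, amount DOUBLE)
PARTITIONED BY (ds STRING);

-- Ask Hive to gather basic statistics automatically on INSERT.
SET hive.stats.autogather=true;

-- Load a couple of rows so that the partition exists.
INSERT INTO TABLE sales PARTITION (ds='2015-01-01')
VALUES (1, 9.99), (2, 19.50);

-- Explicitly (re)compute statistics for one partition of an existing table.
ANALYZE TABLE sales PARTITION (ds='2015-01-01') COMPUTE STATISTICS;

-- Compute statistics for every partition of the table.
ANALYZE TABLE sales PARTITION (ds) COMPUTE STATISTICS;

-- Show the stored statistics; values such as numFiles, numRows, and
-- totalSize are typically listed under the partition parameters.
DESCRIBE FORMATTED sales PARTITION (ds='2015-01-01');
```

The statements can be run from either the Hive CLI or Beeline. The exact parameters shown by DESCRIBE FORMATTED may vary with the Hive version.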

Getting ready

This recipe requires a working Hive installation, set up as described in the Installing Hive recipe of Chapter 1, Developing Hive. You will also need the Hive CLI or the Beeline client to run the commands.

How to do it…

For ...
