O'Reilly logo

Programming Hive by Jason Rutherglen, Dean Wampler, Edward Capriolo

Stay ahead with the world's most comprehensive technology and business learning platform.

With Safari, you learn the way you learn best. Get unlimited access to videos, live online training, learning paths, books, tutorials, and more.

Start Free Trial

No credit card required

Chapter 9. Schema Design

Hive looks and acts like a relational database. Users have a familiar nomenclature such as tables and columns, as well as a query language that is remarkably similar to SQL dialects they have used before. However, Hive is implemented and used in ways that are very different from conventional relational databases. Often, users try to carry over paradigms from the relational world that are actually Hive anti-patterns. This section highlights some Hive patterns you should use and some anti-patterns you should avoid.

Table-by-Day

Table-by-day is a pattern where a table named supply is appended with a timestamp such as supply_2011_01_01, supply_2011_01_02, etc. Table-by-day is an anti-pattern in the database world, but due to common implementation challenges of ever-growing data sets, it is still widely used:

hive> CREATE TABLE supply_2011_01_02 (id int, part string, quantity int);

hive> CREATE TABLE supply_2011_01_03 (id int, part string, quantity int);

hive> CREATE TABLE supply_2011_01_04 (id int, part string, quantity int);

hive> .... load data ...

hive> SELECT part,quantity supply_2011_01_02
    > UNION ALL
    > SELECT part,quantity from supply_2011_01_03
    > WHERE quantity < 4;

With Hive, a partitioned table should be used instead. Hive uses expressions in the WHERE clause to select input only from the partitions needed for the query. This query will run efficiently, and it is clean and easy on the eyes:

hive> CREATE TABLE supply (id int, part string, quantity int)
    

With Safari, you learn the way you learn best. Get unlimited access to videos, live online training, learning paths, books, interactive tutorials, and more.

Start Free Trial

No credit card required