The data model

Hive organizes data into databases. A database is a logical collection of Hive tables and provides a namespace for them. A table that is not created in a named database belongs to the default database. Creating a database also creates an HDFS directory that holds the files of its tables; this directory is the physical counterpart of the namespace. For example, the CREATE DATABASE MasteringHadoop command creates a database named MasteringHadoop. Listing the HDFS directory structure shows the directory created for this database (Hive lowercases the name and appends .db):

drwxr-xr-x   - sandeepkaranth supergroup          0 2014-05-15 08:55 /user/hive/warehouse/masteringhadoop.db
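
The mapping from database name to HDFS directory seen in the listing can be sketched in a few lines of Python. This is only an illustration of the default naming convention, not Hive's own code: the function names are hypothetical, the warehouse root is assumed to be /user/hive/warehouse as in the listing (it is configurable through hive.metastore.warehouse.dir), and the sales table is invented for the example.

```python
import posixpath

# Assumption: the warehouse root shown in the listing above.
# On a real cluster this comes from hive.metastore.warehouse.dir.
DEFAULT_WAREHOUSE_DIR = "/user/hive/warehouse"


def database_directory(database_name, warehouse_dir=DEFAULT_WAREHOUSE_DIR):
    """Return the default HDFS directory for a Hive database (hypothetical helper)."""
    name = database_name.lower()  # Hive stores database names in lowercase
    if name == "default":
        # Tables of the default database sit directly under the warehouse root.
        return warehouse_dir
    return posixpath.join(warehouse_dir, name + ".db")


def table_directory(database_name, table_name, warehouse_dir=DEFAULT_WAREHOUSE_DIR):
    """Return the default HDFS directory for a managed table (hypothetical helper)."""
    return posixpath.join(
        database_directory(database_name, warehouse_dir), table_name.lower()
    )


if __name__ == "__main__":
    print(database_directory("MasteringHadoop"))
    print(table_directory("MasteringHadoop", "sales"))  # "sales" is a made-up table
    print(table_directory("default", "sales"))
```

A database or table created with an explicit LOCATION clause does not follow this convention, so the sketch applies only to the defaults.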

A table is the basic unit of data storage, as in a traditional RDBMS. It ...
