Apache Hive is a data warehouse framework for querying and managing large datasets stored in Hadoop distributed filesystems (HDFS) . Hive also provides a SQL-like query language called HiveQL . The HiveQL queries may be run in the Hive CLI shell . By default, Hive stores data in the HDFS, but also supports the Amazon S3 filesystem.
Hive stores data in tables. A Hive table is an abstraction and the metadata for a Hive table is stored in an embedded Derby database called a Derby metastore. Other databases such as MySQL and Oracle Database could also be configured as the Hive metastore ...