Data Lake for Enterprises

by Vivek Mishra, Tomcy John, Pankaj Misra
May 2017
Beginner to intermediate
596 pages
15h 2m
English
Packt Publishing

Apache Hive

Apache Hive was created at Facebook and provides data warehouse capabilities on top of Hadoop. Its main strengths are data summarization and ad hoc query execution over data stored in Hadoop.

Hive has two main components:

  • Hive Command Line: An interface used to execute HiveQL statements
  • JDBC (Java Database Connectivity)/ODBC (Open Database Connectivity) driver: Lets client applications establish connectivity to the data stored in Hive
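As a rough illustration of the two entry points listed above, the following sketch builds (but does not run) a Hive command-line invocation and a JDBC connection string. It assumes a HiveServer2 instance reachable on its usual default port 10000; the host name `hive.example.com`, the helper function names, and the sample query are hypothetical and not taken from the book.

```python
import shlex
from typing import List


def hive_cli_command(query: str) -> List[str]:
    """Build an argv list that runs one HiveQL statement via the Hive CLI (`hive -e`)."""
    return ["hive", "-e", query]


def hiveserver2_jdbc_url(host: str, port: int = 10000, database: str = "default") -> str:
    """Build a JDBC URL of the form used by the HiveServer2 JDBC driver."""
    return f"jdbc:hive2://{host}:{port}/{database}"


def beeline_command(jdbc_url: str, query: str) -> List[str]:
    """Build an argv list for Beeline, the JDBC-based command-line client."""
    return ["beeline", "-u", jdbc_url, "-e", query]


if __name__ == "__main__":
    # Hypothetical host and query, used only for illustration.
    url = hiveserver2_jdbc_url("hive.example.com")
    print(shlex.join(hive_cli_command("SHOW TABLES")))
    print(shlex.join(beeline_command(url, "SHOW TABLES")))
```

The argv lists could be handed to `subprocess.run` on a machine where the Hive clients are installed; they are only constructed here so the sketch runs without a cluster.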

Queries are written in Hive Query Language (HQL or HiveQL), which is very similar to SQL. Hive compiles these queries into jobs that run on Hadoop, so it is best suited to batch processing. Features such as indexing (available in the Hive releases current when this book was published) can speed up queries, but latencies are typically far higher than those of a true real-time system.
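To give a feel for how SQL-like a HiveQL summarization query is, here is a minimal sketch that runs a `GROUP BY` aggregation. It deliberately uses Python's built-in `sqlite3` module as a stand-in engine so the example runs without a Hadoop cluster; it is not Hive. The `page_views` table, its columns, and the sample rows are hypothetical. Only the `SELECT` statement is written in a form that should also be accepted as HiveQL; the `CREATE TABLE` statement uses SQLite types.

```python
import sqlite3

# Summarization query in the SQL-like style that HiveQL shares.
SUMMARY_QUERY = """
SELECT country, COUNT(*) AS visits, SUM(bytes_sent) AS total_bytes
FROM page_views
GROUP BY country
ORDER BY country
"""


def summarize(rows):
    """Load (country, bytes_sent) rows into an in-memory table and aggregate them."""
    conn = sqlite3.connect(":memory:")  # stand-in engine, not Hive
    try:
        conn.execute("CREATE TABLE page_views (country TEXT, bytes_sent INTEGER)")
        conn.executemany("INSERT INTO page_views VALUES (?, ?)", rows)
        return conn.execute(SUMMARY_QUERY).fetchall()
    finally:
        conn.close()


if __name__ == "__main__":
    sample = [("IN", 120), ("US", 300), ("IN", 80)]
    for row in summarize(sample):
        print(row)
```

On a real cluster the same `SELECT` would be submitted through the Hive command line or a JDBC/ODBC client and executed as one or more Hadoop jobs rather than in process.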

Similar to Apache Pig, Hive also allows you to write User Defined Function ...


ISBN: 9781787281349