Vinit Yadav, Processing Big Data with Azure HDInsight, 10.1007/978-1-4842-2869-2_4

4. Querying Data with Hive

Vinit Yadav¹

(1) Ahmedabad, Gujarat, India

Hive is probably the most widely used tool in the Hadoop ecosystem. Working with Hadoop data directly means writing MapReduce jobs, which are inconvenient for ad hoc queries. Hive comes to the rescue by providing a SQL-like query language and internally translating each query into MapReduce jobs. In HDInsight, Hive sits on top of data in Azure Blob storage and provides interactive queries over that data. Hive can work with both structured and semi-structured data. It runs on top of the YARN layer and relies on YARN for resource negotiation. Internally, it uses MapReduce, Tez, ...
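To make the idea of a SQL-like query turning into MapReduce work more concrete, here is a minimal conceptual sketch in Python. It is not Hive's actual execution plan or API. It only imitates, in memory, the map, shuffle, and reduce steps that a grouped count would need. The table name `logs`, its columns, the sample rows, and all function names are hypothetical and chosen purely for illustration.

```python
from collections import defaultdict

# Hypothetical sample rows standing in for a table `logs(level, message)`.
ROWS = [
    ("ERROR", "disk full"),
    ("INFO", "job started"),
    ("ERROR", "timeout"),
    ("WARN", "slow response"),
    ("INFO", "job finished"),
]

# The query being imitated (HiveQL-style, illustrative only):
#   SELECT level, COUNT(*) FROM logs GROUP BY level;


def map_phase(rows):
    """Emit one (key, 1) pair per row, keyed by the GROUP BY column."""
    for level, _message in rows:
        yield level, 1


def shuffle(pairs):
    """Collect emitted values by key, as happens between map and reduce."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(grouped):
    """Sum the values for each key, which corresponds to COUNT(*)."""
    return {key: sum(values) for key, values in grouped.items()}


def count_by_level(rows):
    """Run the three steps in order and return the per-level counts."""
    return reduce_phase(shuffle(map_phase(rows)))


if __name__ == "__main__":
    print(count_by_level(ROWS))
```

Writing these three functions by hand for every question is the inconvenience the paragraph above describes. With a SQL-like language, the single query in the comment expresses the same intent.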


Processing Big Data with Azure HDInsight: Building Real-World Big Data Systems on Azure HDInsight Using the Hadoop Ecosystem by Vinit Yadav
