©  Vinit Yadav 2017

Vinit Yadav, Processing Big Data with Azure HDInsight, 10.1007/978-1-4842-2869-2_3

3. Working with Data in HDInsight

Vinit Yadav

(1) Ahmedabad, Gujarat, India

Azure Blob storage is the default and preferred way to store data in HDInsight. HDInsight supports the Hadoop Distributed File System (HDFS) as well as Azure Blob storage for storing data. This chapter covers uploading data to Blob storage and executing MapReduce jobs on it. It starts with several command-line utilities for uploading data and then looks at a couple of graphical clients. You’ll create your first MapReduce job and execute it using PowerShell. You’ll also use the .NET SDK to create and execute jobs on HDInsight. Finally, you’ll learn about Avro serialization. ...
