Azure Blob storage is the default and preferred way to store data in HDInsight, which also supports the Hadoop Distributed File System (HDFS). This chapter covers uploading data to Blob storage and executing MapReduce jobs on it. It starts with several command-line utilities for uploading data and then looks at a couple of graphical clients. You’ll create your first MapReduce job and execute it using PowerShell. You’ll also use the .NET SDK to create and execute jobs on HDInsight. Finally, you’ll learn about Avro serialization. ...
© Vinit Yadav 2017
Vinit Yadav, Processing Big Data with Azure HDInsight, 10.1007/978-1-4842-2869-2_3
3. Working with Data in HDInsight
Vinit Yadav, Ahmedabad, Gujarat, India