Apache Pig is a platform for analyzing large data sets using a procedural language called Pig Latin. One challenge with MapReduce is that complex processing requires you to write multiple MapReduce jobs and chain them together, which is hard to get right and hard to maintain when requirements change often. Pig instead represents transformations as a data flow: you write transformations one after another to reach the desired result. Apache Pig is mainly used in data manipulation operations, ...
© Vinit Yadav 2017
Vinit Yadav, Processing Big Data with Azure HDInsight, 10.1007/978-1-4842-2869-2_5
5. Using Pig with HDInsight
Vinit Yadav (1)
(1) Ahmedabad, Gujarat, India