The advanced Pig operators
In this section, we will examine some of the advanced features and hints available in Pig operators.
The advanced FOREACH operator
The FOREACH operator is primarily used to transform each record of the input relation into an output record by applying a list of expressions. In some situations, the FOREACH operator can also increase the number of output records. These situations are discussed in the following sections.
The FLATTEN operator
The FLATTEN keyword is an operator, although its syntax makes it look like a UDF. It is used to un-nest nested tuples and bags. However, the semantics of un-nesting differ depending on whether it is applied to a tuple or to a bag.
FLATTEN on a nested tuple yields ...
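The difference between flattening a tuple and flattening a bag can be sketched in plain Python. This is only a model of the semantics, not Pig code: the helper names `flatten_tuple` and `flatten_bag` and the sample record are hypothetical, and the Pig statements in the comments assume a relation `A` with fields `id`, `t` (a tuple), and `bg` (a bag).

```python
from typing import Any, List, Tuple

Record = Tuple[Any, ...]


def flatten_tuple(record: Record, index: int) -> Record:
    """Model FLATTEN on a tuple field: the nested tuple's fields are
    spliced into the record in place, so one record in gives one record out."""
    nested = record[index]
    return record[:index] + tuple(nested) + record[index + 1:]


def flatten_bag(record: Record, index: int) -> List[Record]:
    """Model FLATTEN on a bag field: each tuple in the bag produces its own
    output record, combined with the remaining fields of the input record."""
    bag = record[index]
    return [record[:index] + tuple(item) + record[index + 1:] for item in bag]


# Hypothetical record: (id, nested tuple t, bag bg of single-field tuples)
row = (1, ("a", "b"), [("x",), ("y",), ("z",)])

# Roughly: B = FOREACH A GENERATE id, FLATTEN(t), bg;
tuple_result = flatten_tuple(row, 1)

# Roughly: C = FOREACH A GENERATE id, t, FLATTEN(bg);
bag_result = flatten_bag(row, 2)

print(tuple_result)
for out in bag_result:
    print(out)
```

In this model, flattening the tuple keeps the record count at one, while flattening the three-element bag turns one input record into three. This is how a FOREACH statement can emit more records than it reads. An empty bag yields no output records at all.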