Programming Pig

by Alan Gates
October 2011
Content level: Intermediate to advanced
220 pages
6h 25m
English
O'Reilly Media, Inc.
Chapter 4. Pig’s Data Model

Before we take a look at the operators that Pig Latin provides, we first need to understand Pig’s data model. This includes Pig’s data types, how it handles concepts such as missing data, and how you can describe your data to Pig.

Types

Pig’s data types can be divided into two categories: scalar types, which contain a single value, and complex types, which contain other types.

Scalar Types

Pig’s scalar types are simple types that appear in most programming languages. With the exception of bytearray, they are all represented in Pig interfaces by java.lang classes, making them easy to work with in UDFs:

int

An integer. Ints are represented in interfaces by java.lang.Integer. They store a four-byte signed integer. Constant integers are expressed as integer numbers, for example, 42.

long

A long integer. Longs are represented in interfaces by java.lang.Long. They store an eight-byte signed integer. Constant longs are expressed as integer numbers with an L appended, for example, 5000000000L.
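The size difference between int and long can be checked directly against the java.lang classes named above. The following is a minimal sketch in plain Java, not Pig code; the class name PigIntLongSketch and the helper fitsInInt are hypothetical names chosen for illustration. It shows why a value such as 5000000000 needs the L suffix: it does not fit in a four-byte signed integer.

```java
public class PigIntLongSketch {
    // Hypothetical helper: true if the value fits in a four-byte signed
    // integer, the range that java.lang.Integer can hold.
    public static boolean fitsInInt(long value) {
        return value >= Integer.MIN_VALUE && value <= Integer.MAX_VALUE;
    }

    public static void main(String[] args) {
        Integer answer = 42;       // the constant 42 from the text
        Long big = 5000000000L;    // the constant 5000000000L from the text

        // Storage sizes of the two types, in bytes.
        System.out.println("int:  " + Integer.BYTES + " bytes, max " + Integer.MAX_VALUE);
        System.out.println("long: " + Long.BYTES + " bytes, max " + Long.MAX_VALUE);

        // 42 fits in an int; 5000000000 exceeds Integer.MAX_VALUE.
        System.out.println(fitsInInt(answer));
        System.out.println(fitsInInt(big));
    }
}
```

Any value for which fitsInInt returns false has to be carried as a long.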

float

A floating-point number. Floats are represented in interfaces by java.lang.Float and use four bytes to store their value. You can find the range of values representable by Java’s Float type at http://java.sun.com/docs/books/jls/third_edition/html/typesValues.html#4.2.3. Note that because this is a floating-point number, in some calculations it will lose precision. For calculations that require no loss of precision, you should use an int or long instead. Constant floats are ...
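The precision loss mentioned for float can be seen with a small experiment. This is a sketch in plain Java rather than Pig Latin, and the class and method names are hypothetical. It relies on one property of four-byte floats: above 2^24 (16,777,216) they can no longer represent every whole number, so adding 1 can leave the value unchanged, while the same arithmetic on an int stays exact.

```java
public class PigFloatPrecisionSketch {
    // 2^24: the point past which a four-byte float cannot represent
    // every consecutive whole number.
    public static final int LIMIT = 16_777_216;

    // Adds one using four-byte float arithmetic.
    public static float incrementAsFloat(float value) {
        return value + 1.0f;
    }

    // Adds one using four-byte integer arithmetic.
    public static int incrementAsInt(int value) {
        return value + 1;
    }

    public static void main(String[] args) {
        float asFloat = incrementAsFloat((float) LIMIT);
        int asInt = incrementAsInt(LIMIT);

        // The float result rounds back to LIMIT, so the increment is lost.
        System.out.println(asFloat == (float) LIMIT);
        // The int result is exactly LIMIT + 1.
        System.out.println(asInt == LIMIT + 1);
    }
}
```

Counters and other values that must stay exact are therefore safer in an int or long, as the text recommends.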

ISBN: 9781449317881