Chapter 3. Data Types and File Formats
Hive supports many of the primitive data types you find in relational databases, as well as three collection data types that are rarely found in relational databases, for reasons we’ll discuss shortly.
A related concern is how these types are represented in text files, along with the alternatives to text storage that address performance and other concerns. A unique feature of Hive, compared to most databases, is the flexibility it provides in how data is encoded in files. Most databases take total control of the data, governing both how it is persisted to disk and its life cycle. By letting you control these aspects yourself, Hive makes it easier to manage and process data with a variety of tools.
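As a rough illustration of this flexibility, the following HiveQL sketch declares a table over comma-delimited text files that already exist in the filesystem. The table name, column names, and the `/data/page_views` path are hypothetical and chosen only for this example.

```sql
-- Hypothetical table over existing text files; names and path are illustrative.
-- EXTERNAL tells Hive not to delete the underlying files when the table is dropped,
-- so the data's life cycle stays under your control.
CREATE EXTERNAL TABLE IF NOT EXISTS page_views (
  user_id   BIGINT,
  url       STRING,
  view_time STRING
)
-- You choose how records and fields are encoded in the files.
ROW FORMAT DELIMITED
  FIELDS TERMINATED BY ','
  LINES TERMINATED BY '\n'
-- You choose the storage format as well.
STORED AS TEXTFILE
-- You choose where the files live.
LOCATION '/data/page_views';
```

Because the files remain ordinary delimited text at a location you picked, other tools can read and write them without going through Hive.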
Primitive Data Types
Hive supports several sizes of integer and floating-point types, a Boolean type, and character strings of arbitrary length. Hive v0.8.0 added types for timestamps and binary fields.
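A minimal sketch of how these types might appear in a table definition follows. The table and column names are hypothetical, and the `TIMESTAMP` and `BINARY` columns assume Hive v0.8.0 or later.

```sql
-- Hypothetical table with one column per primitive type; names are illustrative.
CREATE TABLE IF NOT EXISTS type_samples (
  tiny_col   TINYINT,    -- 1 byte signed integer
  small_col  SMALLINT,   -- 2 byte signed integer
  int_col    INT,        -- 4 byte signed integer
  big_col    BIGINT,     -- 8 byte signed integer
  bool_col   BOOLEAN,    -- true or false
  float_col  FLOAT,      -- single precision floating point
  double_col DOUBLE,     -- double precision floating point
  string_col STRING,     -- character string of arbitrary length
  ts_col     TIMESTAMP,  -- requires Hive v0.8.0 or later
  bin_col    BINARY      -- requires Hive v0.8.0 or later
);
```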
Table 3-1 lists the primitive types supported by Hive.
Table 3-1. Primitive data types
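|Type||Size||Literal syntax examples|
|TINYINT||1 byte signed integer.||20|
|SMALLINT||2 byte signed integer.||20|
|INT||4 byte signed integer.||20|
|BIGINT||8 byte signed integer.||20|
|BOOLEAN||Boolean true or false.||TRUE|
|FLOAT||Single precision floating point.||3.14159|
|DOUBLE||Double precision floating point.||3.14159|
|STRING||Sequence of characters. The character set can be specified. Single or double quotes can be used.||'Now is the time', "for all good men"|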
Integer, float, ...