Serialization and deserialization formats and data types
Serialization and deserialization formats are popularly known as SerDes. A SerDe lets Hive read or write data in a particular format: it parses the structured or unstructured bytes stored in HDFS according to the schema definition of the Hive table. Hive provides a set of built-in SerDes and also allows users to create custom SerDes based on their own data definitions. The built-in SerDes are as follows:
LazySimpleSerDe
RegexSerDe
AvroSerDe
OrcSerde
ParquetHiveSerDe
JSONSerDe
CSVSerDe
How to do it…
You can use different SerDes to read or write data in a particular format.
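As a minimal sketch of how a SerDe is selected, the HiveQL below names the SerDe class explicitly in the table definition using ROW FORMAT SERDE. The table names, column names, delimiter, and regular expression are hypothetical and chosen only for illustration; the class names and the "field.delim" and "input.regex" properties are the ones shipped with Hive, but check them against your Hive version.

```sql
-- Hypothetical table: comma-delimited text read with LazySimpleSerDe,
-- named explicitly here instead of relying on the default.
CREATE TABLE employees_sketch (
  id   INT,
  name STRING
)
ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe'
WITH SERDEPROPERTIES ("field.delim" = ",")
STORED AS TEXTFILE;

-- Hypothetical table: each line is split by a regular expression.
-- Every capture group maps, in order, to one column of the table.
CREATE TABLE access_log_sketch (
  host    STRING,
  user_id STRING,
  request STRING
)
ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.RegexSerDe'
WITH SERDEPROPERTIES ("input.regex" = "([^ ]*) ([^ ]*) (.*)")
STORED AS TEXTFILE;
```

Running DESCRIBE FORMATTED on either table should show the configured class under "SerDe Library", which is a quick way to confirm which SerDe Hive will use for reads and writes.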
LazySimpleSerDe
This is the default SerDe of Hive. When a user creates a table in Hive without ...