Apache Parquet is an efficient, structured, column-oriented (also called columnar storage), compressed, binary file format. Parquet supports several compression codecs, including Snappy, GZIP, deflate, and BZIP2. Snappy is the default. Structured file formats such as RCFile, Avro, SequenceFile, and Parquet offer better performance with compression support, which reduces the size of the data on the disk and consequently the I/O and CPU resources required to deserialize data.
In columnar storage , all the column data for a particular column is stored adjacently instead of by row. Columnar ...