November 2018
Intermediate to advanced
360 pages
9h 36m
English
This is by far the most experimental recipe in this book, but for very demanding datasets, Parquet is probably one of the most efficient formats available, and it has native support in Spark. In the medium to long term, we will probably see further developments in this space, so you should expect other ways of converting data into the Parquet format to appear. If you decide to use Parquet, be sure to check the different indexing strategies that the format supports.