74 Large Scale and Big Data
intermediate sorting step. Tenzing also implements a block-based shuffle
mechanism that combines many small rows into compressed blocks, each of which
is treated as one row. This avoids reducer-side sorting and some of the
overhead associated with row serialization and deserialization in the underlying
MapReduce framework code.
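The block-based idea can be illustrated with a minimal sketch. This is not Tenzing's implementation: the function names `pack_blocks` and `unpack_blocks`, the `rows_per_block` parameter, and the use of `pickle` and `zlib` are all assumptions chosen for illustration. The sketch only shows how many small rows can travel through a shuffle as a few opaque compressed records, so that serialization is paid once per block rather than once per row.

```python
import pickle
import zlib
from typing import Iterable, Iterator, List, Tuple

Row = Tuple  # hypothetical row type: any picklable tuple


def pack_blocks(rows: Iterable[Row], rows_per_block: int = 1000) -> Iterator[bytes]:
    """Group rows into fixed-size batches and emit each batch as one compressed blob."""
    block: List[Row] = []
    for row in rows:
        block.append(row)
        if len(block) == rows_per_block:
            # One serialize + compress call covers the whole batch of rows.
            yield zlib.compress(pickle.dumps(block))
            block = []
    if block:
        # Flush the final, possibly smaller, batch.
        yield zlib.compress(pickle.dumps(block))


def unpack_blocks(blocks: Iterable[bytes]) -> Iterator[Row]:
    """Reverse of pack_blocks: expand each blob back into its rows, in order."""
    for blob in blocks:
        yield from pickle.loads(zlib.decompress(blob))
```

In this sketch the framework would see each blob as a single record, so 2,500 rows packed with `rows_per_block=1000` cross the shuffle as three records instead of 2,500.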
2.4.5 Cheetah
The Cheetah system [35] is a custom data warehouse solution
built on top of the MapReduce framework. In particular, it defines a
virtual view on top of the common star or snowflake data warehouse schema and
applies a stack of optimization techniques on top of the MapReduce framework
including: data compression, optimized access methods, multiquery opti ...