54 Large Scale and Big Data
shared, then clearly only one stream needs to be generated. Even if only
some filters are common in both jobs, it is possible to share parts of the map
functions.
In practice, sharing scans and sharing map-output yield I/O savings while sharing
map functions (or parts of them) would yield additional CPU savings.
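The sharing opportunities described above can be illustrated with a small sketch. It is not MRShare's implementation; it only assumes two hypothetical jobs (`job1`, `job2`) that read the same input and share one filter predicate. The names `shared_map`, `split_by_job`, `common_filter`, and `job_filters` are invented for the illustration. The input is scanned once, the common filter is evaluated once per record, and each surviving record is tagged with the jobs it belongs to.

```python
from collections import defaultdict


def shared_map(records, common_filter, job_filters):
    """Scan the input once on behalf of several jobs.

    common_filter: predicate shared by all jobs, evaluated once per record.
    job_filters:   hypothetical mapping job_id -> job-specific predicate.
    Returns the tagged map output and the number of records scanned.
    """
    scanned = 0
    tagged = []
    for rec in records:
        scanned += 1                      # one scan serves every job
        if not common_filter(rec):        # shared part of the map functions
            continue
        for job_id, extra_filter in job_filters.items():
            if extra_filter(rec):         # job-specific part
                tagged.append((job_id, rec))
    return tagged, scanned


def split_by_job(tagged):
    """Route the tagged map output back to the job that owns each record."""
    per_job = defaultdict(list)
    for job_id, rec in tagged:
        per_job[job_id].append(rec)
    return dict(per_job)


# Toy input: both jobs want successful requests, but with different size bounds.
records = [
    {"url": "/a", "status": 200, "bytes": 120},
    {"url": "/b", "status": 404, "bytes": 0},
    {"url": "/c", "status": 200, "bytes": 900},
    {"url": "/d", "status": 200, "bytes": 40},
]
job_filters = {
    "job1": lambda r: r["bytes"] > 100,
    "job2": lambda r: r["bytes"] < 500,
}

tagged, scanned = shared_map(records, lambda r: r["status"] == 200, job_filters)
per_job = split_by_job(tagged)
```

Run separately, the two jobs would read the four records twice and evaluate the status check eight times; in the merged form the input is read once and the shared check runs four times, which is the I/O and CPU saving the text refers to.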
While the MRShare system focuses on sharing the processing between queries that
are executed concurrently, the ReStore system [49,50] has been introduced to
enable queries that are submitted at different times to share the intermediate
results of previously executed jobs and reuse them for jobs submitted later to the
system. In particular, each MapReduce job produces output that is stored i ...