© Jun Xu 2018
Jun XuBlock Trace Analysis and Storage System Optimizationhttps://doi.org/10.1007/978-1-4842-3928-5_8

8. Case Study: Hadoop

Jun Xu1 
(1)
Singapore, Singapore
 

Hadoop is one of the most popular distributed big data platforms in the world. Besides computing power, its storage subsystem capability is also a key factor in its overall performance. In particular, there are many intermediate file exchanges for MapReduce. This chapter presents the block-level workload characteristics of a Hadoop cluster by considering some specific metrics. The analysis techniques presented can help you understand the performance and drive characteristics of Hadoop in production environments. In addition, this chapter also identifies whether SMR drives are suitable ...

Get Block Trace Analysis and Storage System Optimization: A Practical Approach with MATLAB/Python Tools now with O’Reilly online learning.

O’Reilly members experience live online training, plus books, videos, and digital content from 200+ publishers.