21.1 Introduction

Several of the techniques discussed in Part III are computationally intensive. For example, constructing the Lempel–Ziv–Welch (LZW) dictionary and the quantized dictionary (QD) is time-consuming. We therefore need to address the scalability of these algorithms. One possible solution is parallel and distributed computing. Here, we exploit cloud computing on commodity hardware, which offers a distributed, parallel platform. In our approach, we use a Hadoop- and MapReduce-based framework to facilitate parallel computing.

This chapter is organized as follows. First, we discuss Hadoop/MapReduce in Section 21.2. Second, ...
