Data Science and Big Data Analytics: Discovering, Analyzing, Visualizing and Presenting Data by EMC Education Services

10

Advanced Analytics—Technology and Tools: MapReduce and Hadoop

Key Concepts

Hadoop

Hadoop Ecosystem

MapReduce

NoSQL

Chapter 4, “Advanced Analytical Theory and Methods: Clustering,” through Chapter 9, “Advanced Analytical Theory and Methods: Text Analysis,” covered several useful analytical methods to classify, predict, and examine relationships within the data. This chapter and Chapter 11, “Advanced Analytics—Technology and Tools: In-Database Analytics,” address several aspects of collecting, storing, and processing unstructured and structured data, respectively. This chapter presents some key technologies and tools related to the Apache Hadoop software library, “a framework that allows for the distributed processing of large datasets across clusters of computers using simple programming models” [1].

This chapter focuses on how Hadoop stores data in a distributed system and how Hadoop implements a simple programming paradigm known as MapReduce. Although this chapter makes some Java-specific references, the only intended prerequisite knowledge is a basic understanding of programming. Furthermore, the Java-specific details of writing a MapReduce program for Apache Hadoop are beyond the scope of this text. This omission may appear troublesome, but tools in the Hadoop ecosystem, such as Apache Pig and Apache Hive, can often eliminate the need to explicitly code a MapReduce program. Along with other Hadoop-related tools, Pig and Hive are covered in a portion of this chapter dealing ...
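To make the MapReduce paradigm mentioned above more concrete, the following is a minimal single-process sketch in Python of the classic word-count example. It is only an illustration of the map, group-by-key, and reduce steps, not Hadoop's Java API and not code from this text; the function names (map_phase, shuffle, reduce_phase, word_count) are hypothetical and chosen for readability. In an actual Hadoop job, the framework distributes the map and reduce work across a cluster and performs the grouping step itself.

```python
from collections import defaultdict


def map_phase(line):
    """Map step: emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle(pairs):
    """Group all emitted values by key (done by the framework in a real job)."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Reduce step: collapse the list of counts for one word into a total."""
    return key, sum(values)


def word_count(lines):
    """Run map, shuffle, and reduce in sequence on an in-memory list of lines."""
    pairs = [pair for line in lines for pair in map_phase(line)]
    return dict(reduce_phase(key, values) for key, values in shuffle(pairs).items())


if __name__ == "__main__":
    # Hypothetical two-line input, used only to exercise the sketch.
    print(word_count(["big data", "big clusters"]))
```

Because each call to map_phase depends only on its own input line, and each call to reduce_phase depends only on the values for one key, both steps could in principle run in parallel on separate machines, which is the property a distributed framework exploits.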
