HBase: The Definitive Guide, 2nd Edition by Lars George

Chapter 7. Hadoop Integration

Hadoop consists of two major components at heart: the file system (HDFS) and the processing framework (YARN). Earlier chapters discussed how HBase uses HDFS (unless configured otherwise) to keep the stored data safe, relying on the built-in replication of data blocks, transparent checksumming, and access control and security (the latter you will learn about in [Link to Come]). This chapter looks at how HBase also fits into the processing side of Hadoop.

Framework

The primary purpose of Hadoop is to store data in a reliable and scalable manner, and to provide the means to process the stored data efficiently. The latter task is usually handed to YARN, which stands for Yet Another Resource Negotiator and replaced the monolithic MapReduce framework in Hadoop 2.2. MapReduce is still present in Hadoop, but it was split into two parts: a resource management framework named YARN, and a MapReduce application running on top of YARN.

The difference is that before Hadoop 2.2, MapReduce was the only native processing framework in Hadoop. With YARN you can now execute any processing methodology, as long as it can be implemented as a YARN application. MapReduce’s processing architecture has been ported to YARN as MapReduce v2, and effectively runs the same code as it always did. What became apparent over time, though, is that there is a need for more complex processing, one that allows to solve other ...
