CHAPTER 6

Advanced MapReduce Development

Chapter 5 discussed the basics of MapReduce from the perspective of familiar SQL concepts and showed how MapReduce can be used to solve common problems. You also learned how data is read from input files, processed in the Mappers, routed to the Reducers by a Partitioner, and finally processed in the Reducers and written to output files in HDFS.
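As a quick refresher on that flow, the following is a minimal sketch in plain Java that imitates the map, partition, and reduce steps in memory for a word count. It does not use the Hadoop API: the class name `MapReduceFlowSketch` and its methods are hypothetical names chosen for illustration, and the hash-modulo partitioning rule is an assumption made here to stand in for whatever Partitioner a real job would configure.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Illustrative only: an in-memory imitation of the map -> partition -> reduce flow.
// None of these names come from the Hadoop API.
public class MapReduceFlowSketch {

    // Map step: emit a (word, 1) pair for every whitespace-separated token.
    static List<Map.Entry<String, Integer>> map(String line) {
        List<Map.Entry<String, Integer>> out = new ArrayList<>();
        for (String token : line.trim().split("\\s+")) {
            if (!token.isEmpty()) {
                out.add(Map.entry(token.toLowerCase(), 1));
            }
        }
        return out;
    }

    // Partition step (assumed rule): pick a reducer index from the key's hash.
    // Masking with Integer.MAX_VALUE keeps the result non-negative.
    static int partition(String key, int numReducers) {
        return (key.hashCode() & Integer.MAX_VALUE) % numReducers;
    }

    // Reduce step: sum all values collected for one key.
    static int reduce(List<Integer> values) {
        int sum = 0;
        for (int v : values) {
            sum += v;
        }
        return sum;
    }

    // Runs the three steps and returns one sorted output map per reducer,
    // standing in for the one output file each reducer would write.
    static List<Map<String, Integer>> run(List<String> lines, int numReducers) {
        List<TreeMap<String, List<Integer>>> shuffled = new ArrayList<>();
        for (int i = 0; i < numReducers; i++) {
            shuffled.add(new TreeMap<>());
        }
        for (String line : lines) {
            for (Map.Entry<String, Integer> kv : map(line)) {
                shuffled.get(partition(kv.getKey(), numReducers))
                        .computeIfAbsent(kv.getKey(), k -> new ArrayList<>())
                        .add(kv.getValue());
            }
        }
        List<Map<String, Integer>> outputs = new ArrayList<>();
        for (TreeMap<String, List<Integer>> part : shuffled) {
            Map<String, Integer> result = new TreeMap<>();
            for (Map.Entry<String, List<Integer>> e : part.entrySet()) {
                result.put(e.getKey(), reduce(e.getValue()));
            }
            outputs.add(result);
        }
        return outputs;
    }

    public static void main(String[] args) {
        // Two input "lines" routed to two simulated reducers.
        System.out.println(run(List.of("a b a", "b c"), 2));
    }
}
```

Every occurrence of a given key lands in the same partition, so each simulated reducer sees the complete list of values for the keys it owns. That property is what the Sort and Join techniques build on.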

This chapter tackles the Sort and Join features of SQL, which require an introduction to the more complex concepts underlying MapReduce programs. You also learn how a single MapReduce program can write multiple output files. Finally, you learn ...
