Hadoop Application Architectures

Errata for Hadoop Application Architectures


The errata list records errors, and their corrections, that were found after the product was released. If an error was corrected in a later version or reprint, the date of the correction is displayed in the column titled "Date Corrected".

The following errata were submitted by our customers and approved as valid errors by the author or editor.


Version, Location, Description, Submitted By, Date Submitted, Date Corrected
Printed, PDF, ePub, Mobi, Other Digital Version
Figure 2-3, Page 53

The image for single hop is incorrect. The image should have one less hop. Thanks to Aleks Shulman for pointing this out.

Mark Grover
Aug 09, 2015 
Printed, PDF, ePub, Mobi, Other Digital Version
Page ix
4th paragraph, 2nd sentence

"Its gained fine-grained security controls, ..."

should read

"It has gained fine-grained security controls, ..."

Martin Hammer  Nov 06, 2015 
Page 5
Binary data

The section on Binary data says "If the splittable unit of binary data is larger than 64 MB, you may consider putting the data in its own file, without using a container format."

I have a different opinion: I don't think that is always the case. If a single file is large, say 10 GB, we can store it as is, i.e. without a container format like SequenceFile.
But suppose we have many (around 10,000) binary files of 300 MB each (larger than 64 MB). We still benefit from storing them in a container like SequenceFile, because that way we have one large SequenceFile containing the smaller files (where the key is the file name and the value is the content of the file). If we store the files as is, we increase the memory footprint of the NameNode (1 big file vs. 10,000 separate files). Although there is no appreciable gain in disk space, there is a substantial improvement in NameNode memory usage, which I think makes this a major architectural decision.

Anonymous  Apr 25, 2016 
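The NameNode memory argument in the Page 5 entry above can be checked with rough arithmetic. The following is a minimal sketch, not part of the submitted erratum. It assumes a 64 MB block size (taken from the quoted sentence) and the commonly cited rule of thumb of roughly 150 bytes of NameNode heap per namespace object (file or block); the function names are hypothetical and actual memory use varies by Hadoop version and configuration.

```python
import math

# Assumption: rule-of-thumb heap cost per NameNode namespace object
# (one object per file and one per block). Real figures vary.
BYTES_PER_OBJECT = 150


def namenode_objects(num_files, file_size_mb, block_size_mb=64):
    """Return (file_objects, block_objects) for num_files equally sized files."""
    blocks_per_file = math.ceil(file_size_mb / block_size_mb)
    return num_files, num_files * blocks_per_file


def estimated_heap_bytes(num_files, file_size_mb, block_size_mb=64):
    """Estimate NameNode heap used by the files and their blocks."""
    files, blocks = namenode_objects(num_files, file_size_mb, block_size_mb)
    return (files + blocks) * BYTES_PER_OBJECT


if __name__ == "__main__":
    # 10,000 separate files of 300 MB each
    print("separate:", namenode_objects(10_000, 300),
          estimated_heap_bytes(10_000, 300), "bytes")
    # one container file holding the same 3,000,000 MB of data
    print("combined:", namenode_objects(1, 10_000 * 300),
          estimated_heap_bytes(1, 10_000 * 300), "bytes")
```

Under these assumptions the separate files produce 10,000 file objects and 50,000 block objects, while the single container file produces 1 file object and 46,875 block objects. The saving comes from the file objects and the partially filled last block of each file; the block objects remain the larger share in both layouts.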
Page 51
4th paragraph 2nd bullet

"it’s not possible to parallize file transfers"

I think it should read:
"it’s not possible to parallelize file transfers"

Note from the Author or Editor:
I just fixed it in our source files on 6/30.

Chris Lim  Mar 11, 2015 
Printed, PDF, ePub, Mobi, Other Digital Version
Page 133

In the Python section, we refer to Uri Laserson's work as if it were a book from O'Reilly. As it turns out, it is not a book; it is a blog.

The offending line refers to 'Python Frameworks for Hadoop'. Thanks to Kathleen Ting for reporting this.

Mark Grover
Aug 25, 2015 
Printed, PDF, ePub, Mobi, Other Digital Version
Page 285

The diagram says 'Card Swapping device'; it should say 'Card Swiping device'.

Mark Grover
Sep 04, 2015