Large Scale and Big Data
by Sherif Sakr, Mohamed Gaber
June 2014
Content level: Intermediate to advanced
636 pages
23h 13m
English
Auerbach Publications
Content preview from Large Scale and Big Data
the large text data set. The first word of each line in both types of file serves as the
join key. The map program emits the lines of the large and small input files. Each
line of the small file is labeled so that it can be distinguished in the map output.
In the reduce program, the lines are checked to find those with matching keys. If lines from
both files match, a Cartesian product is applied between the two
sets of lines with the same key to generate the output. Depending on the key distribution,
the size of the output data may vary. In the reduce program, assume there are λ lines
from the large file and μ lines from the small file for a given key. The result of the Cartesian product
is λμ lines. Since μ ≤ 50 is very sma
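
The join described above can be sketched in plain Python, without a MapReduce framework. This is a minimal illustration under stated assumptions, not the book's implementation: the names `map_phase`, `shuffle`, `reduce_phase`, and `reduce_side_join` are hypothetical, the shuffle step is simulated with an in-memory dictionary, and lines from both files are tagged (the text only requires labeling the small file) so the reducer can separate them.

```python
from collections import defaultdict
from itertools import product

SMALL_TAG = "S"  # hypothetical label for lines from the small file
LARGE_TAG = "L"  # hypothetical label for lines from the large file


def map_phase(lines, tag):
    """Emit (join key, (tag, line)) pairs; the key is the first word of the line."""
    for line in lines:
        words = line.split()
        if words:  # skip blank lines, which have no join key
            yield words[0], (tag, line)


def shuffle(pairs):
    """Group map output by key, standing in for the framework's shuffle step."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Pair every large-file line with every small-file line that shares the key."""
    large = [line for tag, line in values if tag == LARGE_TAG]
    small = [line for tag, line in values if tag == SMALL_TAG]
    # Cartesian product: len(large) * len(small) output lines for this key,
    # and no output at all if either side is empty.
    return [(key, big, sml) for big, sml in product(large, small)]


def reduce_side_join(large_lines, small_lines):
    """Run the map, shuffle, and reduce steps in sequence on in-memory lists."""
    pairs = list(map_phase(large_lines, LARGE_TAG))
    pairs += list(map_phase(small_lines, SMALL_TAG))
    output = []
    for key, values in shuffle(pairs).items():
        output.extend(reduce_phase(key, values))
    return output
```

For a key with λ matching lines in the large file and μ in the small file, `reduce_phase` returns λμ tuples, so the total output size depends on how the keys are distributed across both inputs.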

Publisher Resources

ISBN: 9781466581500