Chapter 13
Parallelizing Operations
IN THIS CHAPTER
Understanding why simply bigger and faster isn’t always the right solution
Looking inside the storage and computational approaches of Internet companies
Figuring out how using clusters of commodity hardware reduces costs
Reducing complex algorithms into separable parallel operations with MapReduce
Managing immense amounts of data with streaming or sampling strategies has clear advantages (as discussed in Chapter 12). Both strategies help you obtain a result even when your computational power is limited (for instance, when you have only your own computer to work with). However, each approach comes with a cost (a short sketch after this list contrasts the two):
- Streaming: Handles infinite amounts of data. Yet your algorithms run slowly because they process one piece of data at a time, and the speed of the stream sets the pace.
- Sampling: Applies any algorithm on any machine. Yet the result you obtain is imprecise because you have only a probability, not a certainty, of getting the right answer. Most often, ...
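To make the trade-off concrete, here is a minimal Python sketch (an illustration, not code from this book) that estimates the mean of a large data source both ways: a streaming pass that touches every item once, and a reservoir sample that lets any downstream algorithm work on a small, fixed-size subset. The names streaming_mean and reservoir_sample are made up for this example.

```python
import random

def streaming_mean(stream):
    """Streaming approach: read every item exactly once, keeping only
    a running total and a count in memory. Exact, but paced by the stream."""
    total, count = 0.0, 0
    for value in stream:
        total += value
        count += 1
    return total / count

def reservoir_sample(stream, k, seed=None):
    """Sampling approach: keep a uniform random sample of k items
    (reservoir sampling) so any algorithm can run on this small subset."""
    rng = random.Random(seed)
    reservoir = []
    for i, value in enumerate(stream):
        if i < k:
            reservoir.append(value)
        else:
            j = rng.randint(0, i)   # replace an existing item with probability k/(i+1)
            if j < k:
                reservoir[j] = value
    return reservoir

if __name__ == "__main__":
    data = range(1, 1_000_001)                 # stand-in for a large data source
    print(streaming_mean(data))                # exact answer, but reads every item
    sample = reservoir_sample(data, k=1000, seed=42)
    print(sum(sample) / len(sample))           # approximate answer from only 1,000 items
```

The streaming version gives the exact mean but must process all one million values at whatever pace they arrive; the sampled version hands the downstream computation only 1,000 items, so it fits on any machine, but the answer it produces is an estimate that is only probably close to the truth, which is exactly the trade-off described in the list above.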