August 2016
© Thomas W. Dinsmore 2016
Thomas W. Dinsmore, Disruptive Analytics, 10.1007/978-1-4842-1311-7_4
Thomas W. Dinsmore (1)
(1) Newton, Massachusetts, USA
In 2003, Doug Cutting and Mike Cafarella struggled to build a web crawler to search and index the entire Internet. There was too much data for a single machine, so they needed a way to distribute it over multiple machines.
To keep costs low, they wanted to use inexpensive commodity hardware. That meant they would need fault-tolerant software, so that if any one machine failed, the system could continue to operate.
Early in their work, they ruled out using a relational database. Their data included diverse data structures and data types, without a predefined ...