Disruptive Analytics: Charting Your Strategy for Next-Generation Business Analytics by Thomas W. Dinsmore


© Thomas W. Dinsmore 2016

Thomas W. Dinsmore, Disruptive Analytics, 10.1007/978-1-4842-1311-7_4

4. The Hadoop Ecosystem

Disrupting from Below

Thomas W. Dinsmore

(1) Newton, Massachusetts, USA

In 2003, Doug Cutting and Mike Cafarella struggled to build a web crawler to search and index the entire Internet. They needed a way to distribute the data over multiple machines, because there was too much data for a single machine.

To keep costs low, they wanted to use inexpensive commodity hardware. That meant they would need fault-tolerant software, so that if any one machine failed, the system could continue to operate.

Early in their work, they ruled out using a relational database. Their data included diverse data structures and data types, without a predefined ...
