July 2017
Beginner to intermediate
378 pages
10h 26m
English
Hadoop is an open source project that falls under the umbrella of the Apache Software Foundation. As the official project documentation puts it, the Apache Hadoop project "develops open source software for reliable, scalable, distributed computing." Hadoop is available for free in its pure open source form.
Unless you have Hadoop experts on your team, you should opt for one of the managed Hadoop distributions, which come with a level of troubleshooting support and implementation advice. Cloudera and Hortonworks are the two main providers of managed distributions and support. Amazon Web Services (AWS) and Microsoft Azure each offer their own managed Hadoop service: EMR and HDInsight, respectively.
Hadoop is a little difficult to describe at first ...