The Benefits of Deploying Hadoop in a Private Cloud

Abstract

Hadoop is a popular framework for nimble, cost-effective analysis of unstructured data. The global Hadoop market, valued at $1.5 billion in 2012, is estimated to reach $50 billion by 2020.[1] Companies can now deploy a Hadoop cluster in a physical server environment, in a private cloud, or in the public cloud. It remains to be seen which deployment model will predominate during this growth period; however, the security and granular control offered by private clouds may make this the dominant model for medium to large enterprises. Compared with the other deployment models, a private cloud Hadoop cluster offers unique benefits:

  • A cluster can be set up in minutes
  • It can flexibly use a variety of hardware (DAS, SAN, NAS)
  • It is cost effective (lower capital expenses than physical deployment and lower operating expenses than public cloud deployment)
  • Streamlined management tools lower the complexity of initial configuration and maintenance
  • High availability and fault tolerance increase uptime

This report reviews the benefits of running Hadoop on a virtualized or aggregated (container-based) private cloud and provides an overview of best practices to maximize performance.

Introduction

Today, we can collect more data, in more forms, than ever before.[2] Data may be the most valuable intangible asset of our time. The sheer volume (“big data”) and need for flexible, low-latency analysis ...
