Chapter 1. Hadoop and HDInsight in a Heartbeat

This chapter provides an overview of Apache Hadoop and Microsoft's big data strategy, in which Microsoft HDInsight plays an important role. We will cover the following topics:

  • The era of big data
  • Hadoop concepts
  • Hadoop distributions
  • HDInsight overview
  • Hadoop on Windows deployment options

Data is everywhere

We live in a digital era and are constantly connected with friends and family through social media and smartphones. In 2014, over 5,700 tweets were sent and 800 links were shared on Facebook every second, and the digital universe amounted to about 1.7 MB of data per minute for every person on Earth (source: IDC 2014 report). This volume of data sharing and storage is unprecedented and is contributing to what is known as
