O'Reilly logo

HDInsight Essentials - Second Edition by Rajesh Nadipalli

Stay ahead with the world's most comprehensive technology and business learning platform.

With Safari, you learn the way you learn best. Get unlimited access to videos, live online training, learning paths, books, tutorials, and more.

Start Free Trial

No credit card required

Storm

Apache Storm is a scalable, fault-tolerant, distributed, real-time computation system. Storm makes it easy to reliably process streams of data. Storm has many use cases: real-time analytics, online machine learning, continuous computation, ETL, and others. Storm can process over 1 million tuples per second per node. The following are the key features of Storm:

  • Real-time computation
  • Guarantees data will be processed
  • Scalable
  • Fault tolerant

    Note

    At the time this book was authored, Storm is a preview feature in Azure HDInsight.

Storm positioning in Data Lake

Hadoop and MapReduce provide a great batch processing capability. HBase provides the low latency store. Storm provides low latency transformation so that real-time processing can be performed on ...

With Safari, you learn the way you learn best. Get unlimited access to videos, live online training, learning paths, books, interactive tutorials, and more.

Start Free Trial

No credit card required