Storm
Apache Storm is a scalable, fault-tolerant, distributed, real-time computation system that makes it easy to process streams of data reliably. Its use cases include real-time analytics, online machine learning, continuous computation, and ETL. In a published benchmark, Storm processed over one million tuples per second per node. The following are the key features of Storm:
- Real-time computation
- Guaranteed data processing
- Scalable
- Fault tolerant
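The processing guarantee in the list above is commonly described as at-least-once: a tuple that fails or times out can be replayed from its source until it is acknowledged. The following is a minimal sketch of that idea in plain Python. It does not use Storm's API; the names run_at_least_once, make_flaky_processor, and max_attempts are hypothetical, and giving up after a fixed number of attempts is an assumption made only to keep the example finite.

```python
from collections import deque


def run_at_least_once(messages, process, max_attempts=3):
    """Toy model of at-least-once delivery (not Storm's API).

    A message whose processing fails is re-queued and replayed until it
    succeeds or max_attempts is reached (an assumption of this sketch).
    """
    pending = deque((msg, 1) for msg in messages)  # (message, attempt number)
    acked, dropped = [], []
    while pending:
        msg, attempt = pending.popleft()
        try:
            process(msg)
            acked.append(msg)  # "ack": processing succeeded
        except Exception:
            if attempt < max_attempts:
                pending.append((msg, attempt + 1))  # "fail": replay later
            else:
                dropped.append(msg)  # gave up after max_attempts
    return acked, dropped


def make_flaky_processor():
    """Hypothetical processing step that fails the first time it sees 'b'."""
    seen = set()

    def process(msg):
        if msg == "b" and msg not in seen:
            seen.add(msg)
            raise RuntimeError("transient failure")

    return process


acked, dropped = run_at_least_once(["a", "b", "c"], make_flaky_processor())
print(acked, dropped)
```

Because the failed message is replayed after the others, every message is eventually processed, but not necessarily in its original order. A step with side effects may also run more than once for the same message, so such steps are usually written to be idempotent.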
Note
At the time this book was authored, Storm was a preview feature in Azure HDInsight.
Storm positioning in Data Lake
Hadoop and MapReduce provide strong batch processing capability. HBase provides a low-latency store. Storm provides low-latency transformation so that real-time processing can be performed on ...