Configuring the infrastructure
Since Storm will run on YARN, we will first configure YARN and then show how to configure Storm-YARN for deployment on that cluster.
The Hadoop infrastructure
Every machine in the cluster needs a copy of Hadoop, either residing on the machine itself or otherwise accessible to it. First, download Hadoop and unpack the archive. This example uses version 2.1.0-beta.
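The unpacking step can be sketched as a small shell helper. This is a minimal sketch, not part of Hadoop: unpack_hadoop is a hypothetical function name, and it assumes the archive is a gzip-compressed tarball with a single top-level directory (such as hadoop-2.1.0-beta/), which it strips so that the contents land directly in the target directory.

```shell
# Hypothetical helper (not shipped with Hadoop): unpack a Hadoop tarball
# into a target directory.
# Assumption: the archive has one top-level directory, e.g. hadoop-2.1.0-beta/.
unpack_hadoop() {
    archive="$1"   # path to the downloaded .tar.gz file
    target="$2"    # directory that should hold bin/, etc/, and so on

    mkdir -p "$target" || return 1
    # --strip-components=1 drops the top-level directory so that
    # etc/hadoop ends up directly under $target.
    tar -xzf "$archive" -C "$target" --strip-components=1
}
```

For example, unpack_hadoop hadoop-2.1.0-beta.tar.gz /home/user/hadoop would place the configuration files under /home/user/hadoop/etc/hadoop, provided the archive follows the assumed layout.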
Assuming that you have uncompressed the archive into /home/user/hadoop, add the following environment variables on each of the nodes in the cluster:
export HADOOP_PREFIX=/home/user/hadoop
export HADOOP_YARN_HOME=/home/user/hadoop
export HADOOP_CONF_DIR=/home/user/hadoop/etc/hadoop
...
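Because these variables must be set on every node, a quick sanity check can catch a missing or mistyped path before any daemons are started. The following is a minimal sketch: check_hadoop_env is a hypothetical helper, not a Hadoop command, and it covers only the three variables shown above, not any that follow in the elided part of the listing.

```shell
# Hypothetical helper (not part of Hadoop): verify that each variable is set
# and names an existing directory. Returns 0 if all checks pass, 1 otherwise.
check_hadoop_env() {
    status=0
    for name in HADOOP_PREFIX HADOOP_YARN_HOME HADOOP_CONF_DIR; do
        # Indirect lookup: read the value of the variable whose name is in $name.
        eval "value=\${$name:-}"
        if [ -z "$value" ]; then
            echo "$name is not set" >&2
            status=1
        elif [ ! -d "$value" ]; then
            echo "$name points to a missing directory: $value" >&2
            status=1
        fi
    done
    return $status
}
```

Running check_hadoop_env on each node after sourcing the exports reports every problem it finds on standard error rather than stopping at the first one, which makes it easier to fix a node in one pass.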