
Apache Mahout Clustering Designs by Ashish Gupta


Launching the Mahout job on the cluster

Mahout ships a launcher script, mahout, under the bin folder of the installation. Look at the following section of that script, from line 120 onwards:

# CLASSPATH initially contains $MAHOUT_CONF_DIR, or defaults to $MAHOUT_HOME/src/conf
CLASSPATH=${CLASSPATH}:$MAHOUT_CONF_DIR

if [ "$MAHOUT_LOCAL" != "" ]; then
  echo "MAHOUT_LOCAL is set, so we don't add HADOOP_CONF_DIR to classpath."
elif [ -n "$HADOOP_CONF_DIR"  ] ; then
  echo "MAHOUT_LOCAL is not set; adding HADOOP_CONF_DIR to classpath."
  CLASSPATH=${CLASSPATH}:$HADOOP_CONF_DIR
fi

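The script adds HADOOP_CONF_DIR to the classpath only when MAHOUT_LOCAL is empty and HADOOP_CONF_DIR is set. So, to launch the Mahout job (algorithm) on the Hadoop cluster, we set HADOOP_HOME and HADOOP_CONF_DIR and leave MAHOUT_LOCAL unset.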
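To see how the two variables interact, the branch quoted from the script can be tried in isolation. The following is a minimal sketch, not part of Mahout: the function name mahout_classpath and the paths /opt/mahout/conf and /etc/hadoop/conf are hypothetical placeholders, and the sketch starts from the Mahout configuration directory alone rather than from an existing CLASSPATH.

```shell
# Hypothetical helper that mirrors the classpath branch quoted from the script.
# Arguments (illustrative only):
#   $1 = value of MAHOUT_CONF_DIR
#   $2 = value of MAHOUT_LOCAL (empty string means "not set")
#   $3 = value of HADOOP_CONF_DIR
# The body runs in a subshell so its variables do not leak into the caller.
mahout_classpath() (
  cp="$1"
  if [ "$2" != "" ]; then
    # Local mode: the Hadoop configuration directory is not added.
    :
  elif [ -n "$3" ]; then
    # Cluster mode: append the Hadoop configuration directory.
    cp="${cp}:$3"
  fi
  printf '%s\n' "$cp"
)

# MAHOUT_LOCAL empty, HADOOP_CONF_DIR set: Hadoop configuration is appended.
mahout_classpath /opt/mahout/conf "" /etc/hadoop/conf

# MAHOUT_LOCAL set to any non-empty value: Hadoop configuration is skipped.
mahout_classpath /opt/mahout/conf true /etc/hadoop/conf
```

The first call prints a classpath that includes the Hadoop configuration directory, and the second prints only the Mahout configuration directory. Note that any non-empty value of MAHOUT_LOCAL selects local mode, so the variable has to be unset or empty for a cluster run.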

Just before running the algorithm command, set the two environment variables mentioned previously using the export command:

export HADOOP_HOME=<ur ...
