This recipe describes how to run a MapReduce computation in a distributed Hadoop v2 cluster.
Start the Hadoop cluster by following the Setting up HDFS recipe or the Setting up Hadoop ecosystem in a distributed cluster environment using a Hadoop distribution recipe.
Now let's run the WordCount sample in the distributed Hadoop v2 setup:
Upload the wc-input directory in the source repository to the HDFS filesystem. Alternatively, you can upload any other set of text documents as well.
$ hdfs dfs -copyFromLocal wc-input .
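If you do not have the book's source repository at hand, a small local input directory works just as well; the following sketch creates one (the directory name wc-input matches the command above, but the file name and contents are illustrative):

```shell
# Create a small local input directory for the WordCount job.
mkdir -p wc-input
cat > wc-input/doc1.txt <<'EOF'
Hadoop MapReduce counts words
MapReduce distributes the work across the cluster
EOF
```

Any plain-text files placed under wc-input will be picked up by the job once the directory is copied into HDFS.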
$ hadoop jar hcb-c1-samples.jar \
    chapter1.WordCount \
    wc-input wc-output
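Once the job completes, its results can be inspected directly in HDFS. The commands below are a sketch that assumes the job succeeded and used the default text output format, which writes reducer results to files named part-r-NNNNN inside the output directory:

```
# List the job output directory and print the word counts.
$ hdfs dfs -ls wc-output
$ hdfs dfs -cat wc-output/part-r-*
```

Note that Hadoop refuses to overwrite an existing output directory, so delete wc-output (hdfs dfs -rm -r wc-output) before rerunning the job.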