Time for action – WordCount, the Hello World of MapReduce
Many applications, over time, acquire a canonical example that no beginner's guide should be without. For Hadoop, this is WordCount – an example bundled with Hadoop that counts the frequency of words in an input text file.
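Before running the bundled job, it can help to see the idea in miniature. The following is a minimal local sketch of the map, shuffle, and reduce steps behind a word count. It is not the code shipped in the Hadoop examples JAR. The function names (map_phase, shuffle, reduce_phase, word_count) and the sample input are hypothetical, and splitting on whitespace is an assumed tokenization rule.

```python
from collections import defaultdict


def map_phase(line):
    """Emit a (word, 1) pair for every whitespace-separated token in a line."""
    return [(word, 1) for word in line.split()]


def shuffle(pairs):
    """Group the emitted counts by word, mimicking the step between map and reduce."""
    grouped = defaultdict(list)
    for word, count in pairs:
        grouped[word].append(count)
    return grouped


def reduce_phase(word, counts):
    """Sum all the counts collected for a single word."""
    return word, sum(counts)


def word_count(lines):
    """Run the three steps in sequence over an iterable of text lines."""
    pairs = [pair for line in lines for pair in map_phase(line)]
    return dict(reduce_phase(word, counts) for word, counts in shuffle(pairs).items())


if __name__ == "__main__":
    # Hypothetical sample input; the contents of test.txt are not shown in the text.
    sample = ["this is a test", "this is only a test"]
    for word, total in sorted(word_count(sample).items()):
        print(f"{word}\t{total}")
```

In a real job the map and reduce functions run on different machines and the framework performs the grouping; here everything runs in one process purely for illustration.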
- First execute the following commands:
$ hadoop dfs -mkdir data
$ hadoop dfs -cp test.txt data
$ hadoop dfs -ls data
Found 1 items
-rw-r--r-- 1 hadoop supergroup 16 2012-10-26 23:20 /user/hadoop/data/test.txt
- Now execute these commands:
$ hadoop jar hadoop/hadoop-examples-1.0.4.jar wordcount data out
12/10/26 23:22:49 INFO input.FileInputFormat: Total input paths to process : 1
12/10/26 23:22:50 INFO mapred.JobClient: Running job: job_201210262315_0002
12/10/26 23:22:51 INFO ...