Aggregating sources in Accumulo using MapReduce
In this recipe, we will use MapReduce and the AccumuloInputFormat class to count occurrences of each unique source stored in an Accumulo table.
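Before wiring up the real job, it can help to see the counting logic on its own. The following is a minimal plain-Java sketch of that logic, not the recipe's actual job: it uses no Hadoop or Accumulo classes, and the class name SourceCountSketch, its method names, and the sample source names are all hypothetical. The map step emits a (source, 1) pair for every value it reads, and the reduce step sums the pairs for each distinct source.

```java
import java.util.AbstractMap;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Plain-Java sketch of the map and reduce phases of a source count.
// Hypothetical illustration only: no Hadoop or Accumulo classes are used.
final class SourceCountSketch {

    // "Map" phase: emit a (source, 1) pair for every source value read.
    static List<Map.Entry<String, Long>> mapPhase(List<String> sources) {
        List<Map.Entry<String, Long>> pairs = new ArrayList<>();
        for (String source : sources) {
            pairs.add(new AbstractMap.SimpleEntry<>(source, 1L));
        }
        return pairs;
    }

    // "Reduce" phase: sum the counts emitted for each distinct source.
    static Map<String, Long> reducePhase(List<Map.Entry<String, Long>> pairs) {
        Map<String, Long> totals = new TreeMap<>();
        for (Map.Entry<String, Long> pair : pairs) {
            totals.merge(pair.getKey(), pair.getValue(), Long::sum);
        }
        return totals;
    }

    // Runs both phases in sequence, as a single-process stand-in for the job.
    static Map<String, Long> countSources(List<String> sources) {
        return reducePhase(mapPhase(sources));
    }

    public static void main(String[] args) {
        // Hypothetical source names, for illustration only.
        List<String> sample = Arrays.asList("Source A", "Source B", "Source A");
        countSources(sample).forEach(
                (source, count) -> System.out.println(source + "\t" + count));
    }
}
```

In the real job, the input pairs would come from the Accumulo table through AccumuloInputFormat rather than from an in-memory list, and Hadoop would perform the grouping between the two phases.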
This recipe will be easiest to test on a pseudo-distributed Hadoop cluster with Accumulo 1.4.1 and Zookeeper 3.3.3 installed. The shell script in this recipe assumes that Zookeeper is running on the host localhost and on port 2181; you can change this to suit your environment. The Accumulo installation's bin folder needs to be on your environment path.
For this recipe you'll need to create an Accumulo instance named test with user as root and password as
You will need a table named acled to exist in the configured ...