Hadoop MapReduce v2 Cookbook - Second Edition by Thilina Gunarathne

Writing multiple outputs from a MapReduce computation

We can use Hadoop's MultipleOutputs feature to emit more than one output from a MapReduce computation. This is useful when different kinds of records should go to different files, and when a job needs to write a secondary output alongside its main output. MultipleOutputs also lets us specify a different OutputFormat for each output.

How to do it...

The following steps show you how to use the MultipleOutputs feature to output two different datasets from a Hadoop MapReduce computation:

  1. Configure and name the multiple outputs using the Hadoop driver program:
    Job job = Job.getInstance(getConf(), "log-analysis");
    …
    FileOutputFormat.setOutputPath(job, new Path(outputPath));
    ...
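
The driver code in this excerpt is truncated, so the following is only a minimal sketch of how named outputs are commonly registered and written with `org.apache.hadoop.mapreduce.lib.output.MultipleOutputs`. It is not the book's listing. The class name `NamedOutputsSketch`, the output names `responsesizes` and `hitcounts`, the `configure` helper, and the summing logic in the reducer are all hypothetical and chosen purely for illustration. It assumes the Hadoop MapReduce client libraries are on the classpath.

```java
import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
import org.apache.hadoop.mapreduce.lib.output.MultipleOutputs;
import org.apache.hadoop.mapreduce.lib.output.SequenceFileOutputFormat;
import org.apache.hadoop.mapreduce.lib.output.TextOutputFormat;

public class NamedOutputsSketch {

    // Hypothetical output names. MultipleOutputs accepts only letters and digits here.
    static final String SIZES = "responsesizes";
    static final String HITS = "hitcounts";

    /** Driver side: register each named output with its own OutputFormat and key/value types. */
    static Job configure(Configuration conf, String outputPath) throws IOException {
        Job job = Job.getInstance(conf, "log-analysis");
        job.setJarByClass(NamedOutputsSketch.class);
        job.setReducerClass(SplitReducer.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);
        FileOutputFormat.setOutputPath(job, new Path(outputPath));

        // One plain-text output and one SequenceFile output (assumed formats for this sketch).
        MultipleOutputs.addNamedOutput(job, SIZES,
                TextOutputFormat.class, Text.class, IntWritable.class);
        MultipleOutputs.addNamedOutput(job, HITS,
                SequenceFileOutputFormat.class, Text.class, IntWritable.class);
        return job;
    }

    /** Reducer side: write each record to a named output rather than through context.write(). */
    public static class SplitReducer extends Reducer<Text, IntWritable, Text, IntWritable> {
        private MultipleOutputs<Text, IntWritable> outputs;

        @Override
        protected void setup(Context context) {
            outputs = new MultipleOutputs<>(context);
        }

        @Override
        protected void reduce(Text key, Iterable<IntWritable> values, Context context)
                throws IOException, InterruptedException {
            int total = 0;
            int count = 0;
            for (IntWritable v : values) {
                total += v.get();
                count++;
            }
            outputs.write(SIZES, key, new IntWritable(total)); // sum of the values for this key
            outputs.write(HITS, key, new IntWritable(count));  // number of values for this key
        }

        @Override
        protected void cleanup(Context context) throws IOException, InterruptedException {
            outputs.close(); // closes the record writers opened for the named outputs
        }
    }
}
```

With a setup like this, each reducer's named-output files typically appear in the job output directory with the output name as a prefix, for example `responsesizes-r-00000`, next to the usual `part-r-*` files.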
