OpenStack Sahara Essentials by Omar Khedher

Running jobs in Sahara

In the previous chapter, we set up a running Hadoop cluster with one master and three worker nodes on top of OpenStack. Be aware that Sahara can run a job of any type only when the provisioned cluster is in the Active state.

Executing jobs via Horizon

Since we intend to use Swift for input and output data, the first example illustrates how to clean up a simple text file by trimming the surrounding whitespace from each line. The text file looks like the following:

       OpenStack
EDP
    Sahara
      Swift
Jobs
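
To make the intended result concrete, here is a minimal local sketch in Python of the same cleanup applied to the sample lines above. It is only an illustration of the expected output and is not part of the Sahara workflow; the names SAMPLE and neaten are hypothetical and do not come from the book.

```python
# Illustrative only: a local stand-in for the cleanup the job is meant to perform.
# SAMPLE mirrors the text file shown above, including its leading spaces.
SAMPLE = (
    "       OpenStack\n"
    "EDP\n"
    "    Sahara\n"
    "      Swift\n"
    "Jobs\n"
)


def neaten(text: str) -> str:
    """Strip leading and trailing whitespace from every line of text."""
    return "\n".join(line.strip() for line in text.splitlines()) + "\n"


if __name__ == "__main__":
    # Print the cleaned lines, one word per line with no padding.
    print(neaten(SAMPLE), end="")
```

Running the sketch prints the five words flush left, one per line. On the cluster, the same cleanup is done by a Pig job reading from Swift rather than by local Python.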

To do so, we will execute a Pig job in the Sahara cluster and point it at the location of the text file in Swift, named input. The Pig script might look like the following:

I = load '$INPUT' using PigStorage(':') as (cloud: chararray);
O = foreach I generate ...
