September 2016
Our MapReduce program is now ready to run on the Hadoop cluster. Next, we prepare the input data from Furnitica's customer master database. The customer master data contains many details that may not be relevant to our MapReduce job.
A subset of fields available in the master data is as follows:
Assume that we select the customers living in the city where the campaign folders will be sent. This city is the target of the campaign. A single row of our selection is shown in Table 3:
| Field | Value |
| Customer ID | 10023 |
| Age (derived from date of birth) | 55 |
| Income | 75000 |
| Gender (derived from M/F, where 0 is male and 1 is female) | 0 |
Table 3 A selection ...
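The selection step described above could be sketched roughly as follows. This is only an illustration under stated assumptions: the column names (`customer_id`, `city`, `date_of_birth`, `income`, `gender`), the city name `Targetville`, the sample records, and the reference date are all hypothetical, since the actual Furnitica schema is not shown here. The sketch filters on the target city, derives the age from the date of birth, and encodes the gender as 0 or 1, producing comma-separated rows in the field order of Table 3.

```python
import csv
import io
from datetime import date

# Hypothetical extract of the master data; the real schema is an assumption.
MASTER_CSV = """customer_id,name,city,date_of_birth,income,gender
10023,A. Example,Targetville,1961-03-14,75000,M
10024,B. Example,Elsewhere,1980-07-02,52000,F
"""

# Encoding from Table 3: 0 is male, 1 is female.
GENDER_CODE = {"M": 0, "F": 1}


def age_on(birth, today):
    """Return the age in whole years on the given reference date."""
    years = today.year - birth.year
    # Subtract one year if the birthday has not yet occurred this year.
    if (today.month, today.day) < (birth.month, birth.day):
        years -= 1
    return years


def select_customers(master_csv, target_city, today):
    """Keep customers in target_city and emit 'id,age,income,gender' rows."""
    rows = []
    for rec in csv.DictReader(io.StringIO(master_csv)):
        if rec["city"] != target_city:
            continue  # customer lives outside the campaign city
        birth = date.fromisoformat(rec["date_of_birth"])
        fields = [
            rec["customer_id"],
            str(age_on(birth, today)),
            rec["income"],
            str(GENDER_CODE[rec["gender"]]),
        ]
        rows.append(",".join(fields))
    return rows


# Reference date chosen arbitrarily for the example.
lines = select_customers(MASTER_CSV, "Targetville", date(2016, 9, 1))
for line in lines:
    print(line)
```

Dropping the name and city columns at this stage keeps the input file small, so the MapReduce job only reads the fields it needs.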