Let's return to our example of shoppers on the e-commerce store and consider how we might use the Pearson correlation coefficient to select data features. The following example data records each shopper's purchase amount alongside their time spent on site and the amount they had previously spent on purchases:
| Purchase Amount | Time on Site (seconds) | Past Purchase Amount |
|---|---|---|
| $10.00 | 53 | $7.00 |
| $14.00 | 220 | $12.00 |
| $18.00 | 252 | $22.00 |
| $20.00 | 571 | $17.00 |
| $22.00 | 397 | $21.00 |
| $34.00 | 220 | $23.00 |
| $38.00 | 776 | $29.00 |
| $50.00 | 462 | $74.00 |
| $52.00 | 354 | $63.00 |
| $56.00 | 23 | $61.00 |
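
To make the idea concrete, here is a minimal sketch of computing the Pearson correlation coefficient between each candidate feature and the purchase amount for the ten shoppers above. The function name `pearson` and the variable names are illustrative assumptions, not taken from the text; the coefficient itself is the standard definition (covariance divided by the product of the standard deviations).

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length (at least 2 values)")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sums of cross-products and squared deviations from the means.
    # The 1/n factors cancel in the ratio, so they are omitted.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return sxy / sqrt(sxx * syy)


# Data copied from the table above (hypothetical variable names).
purchase_amount = [10.0, 14.0, 18.0, 20.0, 22.0, 34.0, 38.0, 50.0, 52.0, 56.0]
time_on_site = [53, 220, 252, 571, 397, 220, 776, 462, 354, 23]
past_purchase_amount = [7.0, 12.0, 22.0, 17.0, 21.0, 23.0, 29.0, 74.0, 63.0, 61.0]

r_time = pearson(time_on_site, purchase_amount)
r_past = pearson(past_purchase_amount, purchase_amount)

print(f"time on site vs. purchase amount:         r = {r_time:.2f}")
print(f"past purchase amount vs. purchase amount: r = {r_past:.2f}")
```

On this small sample, the coefficient for time on site comes out at roughly 0.10, while the coefficient for past purchase amount is roughly 0.93. Under the assumption that we keep features with a strong linear relationship to the target, past purchase amount looks like the more useful of the two; with only ten rows, though, this is an illustration rather than a conclusion.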
Of course, in a real application of this problem you may have ...