Chapter 4. How can I characterize my customers from the mix of products that they purchase? 79
A simple visual inspection of Figure 4-8 and Figure 4-9, using the Shopper Type
as a guide, seems to show that, apart from the cluster numbers assigned and
the order in which the clusters are arranged in the two output visualizers, the
two sets of cluster results are very similar.
However, on closer inspection we will see that this similarity is somewhat
superficial and that the segments produced are not as similar as this simple
analysis would suggest. So what is happening, and how do we understand what the
two sets of results really mean? We explain this in the next section.
4.6 Interpreting the results
In the previous section we looked at the steps we have to follow to get our mining
results using two different clustering techniques. The sixth stage in our generic
mining method is to interpret the results that we have obtained and determine
how we can map them onto our business. When you are first confronted with the
cluster results, the first question that you are going to ask is “What does it all
mean?” In this section we describe how to read and interpret the results from
the different clustering techniques and, more importantly, how you can compare
the results from different clustering techniques.
4.6.1 How to read and interpret the cluster results?
The cluster techniques that we have used both produce results that can be
displayed graphically, as we have seen. We can also obtain additional visual
information by highlighting individual clusters and even individual variables.
A further level of detail provides the statistical
information that we need to fully interpret what the clusters are telling us about
our customers. Although the visualized results are important in giving us an
overall impression of what is happening, the interpretation always needs to be
backed up with the statistical detail to confirm our understanding. In this section
we look at how this is done.
As we discussed in 4.2, “The data to be used” on page 49, the first thing you
need to understand about the visualized results is that the graphs and charts are
telling you about the characteristics of customers who have been put into the
cluster and how these customers differ from the population as a whole. They do not
tell you directly how the customers in one cluster differ from customers in
another cluster. This is an important distinction and we will return to this issue
later.
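
The distinction between comparing a cluster with the whole population and comparing one cluster with another can be illustrated with a minimal sketch. It is not the method used by the mining tool; the records, the cluster numbers, and the helper name cluster_vs_population are all hypothetical and chosen only to show the idea. For each cluster, the sketch reports the cluster mean of one variable alongside the population mean, which is the kind of comparison the visualized results present.

```python
from statistics import mean

# Hypothetical records: (cluster id, spend on one product group).
# The values are invented purely for illustration.
records = [
    (0, 10.0), (0, 12.0), (0, 14.0),
    (1, 30.0), (1, 34.0),
    (2, 20.0),
]


def cluster_vs_population(records):
    """Return {cluster: (cluster mean, population mean, difference)}."""
    # The population mean is taken over all customers, regardless of cluster.
    population_mean = mean(value for _, value in records)

    # Group the values by the cluster each customer was assigned to.
    by_cluster = {}
    for cluster, value in records:
        by_cluster.setdefault(cluster, []).append(value)

    # Each cluster is compared with the population, not with other clusters.
    return {
        cluster: (mean(values), population_mean, mean(values) - population_mean)
        for cluster, values in by_cluster.items()
    }


profile = cluster_vs_population(records)
for cluster, (c_mean, p_mean, diff) in sorted(profile.items()):
    print(f"cluster {cluster}: mean={c_mean:.1f} "
          f"population={p_mean:.1f} difference={diff:+.1f}")
```

In this sketch, a difference close to zero means only that the cluster resembles the population on that variable. To see how two clusters differ from each other, their means have to be compared explicitly, for example profile[1][0] - profile[0][0].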