7.4 Clustering Concepts
make, marital status, and zip code are all valid choices for variables because
they all directly or indirectly affect the number of claims. On the other hand,
including a variable such as the height or weight of an automobile
may adversely affect the outcome of the categorization because it is not
relevant to the problem. An investigator therefore has many choices
when deciding on an initial strategy. How many variables should
an investigator choose to adequately categorize an object? The answer is
as few as will adequately address the problem, since the inclusion of
irrelevant variables drastically affects the outcome. If a single variable will
bring about a straightforw ...