Classification (or Categorization) and Symbolization
Basic statistics is one way of understanding bunches of numbers. Another way is to place things in categories, based on some pertinent number associated with each thing. For example, we could categorize (i.e., place into separate groups) the students in a class according to their heights. Put everybody less than 4 feet tall in group A. Those 4 feet or more but less than 4 feet, 3 inches belong in group B. Those who are 4 feet, 3 inches or more but less than 4 feet, 6 inches go in group C, and so on. So here’s the general concept: We have k objects and we place each of them in one, and only one, of n categories, where n is less than (or, in a trivial case, equal to) k. It almost goes without saying that the first category holds the smallest numbers, the next category holds the next smallest numbers, and so on.
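The height example can be sketched in a few lines of Python. This is only an illustration, not a prescribed method: heights are assumed to be given in inches, the name height_group is made up for the sketch, and the groups D and E (with the 57-inch boundary) are an assumed continuation of the "and so on" in the text.

```python
from bisect import bisect_right

# Lower bounds, in inches, of groups B, C, D, E. Group A is everything below 48.
# 48, 51 and 54 come from the text (4 ft, 4 ft 3 in, 4 ft 6 in);
# 57 is an assumed continuation of the same 3-inch step.
LOWER_BOUNDS = [48, 51, 54, 57]
LABELS = ["A", "B", "C", "D", "E"]  # E (57 inches and up) is hypothetical


def height_group(height_in):
    """Return the one group label that a height in inches falls into."""
    # bisect_right counts how many lower bounds are <= height_in,
    # so a height of exactly 48 lands in B, not A.
    return LABELS[bisect_right(LOWER_BOUNDS, height_in)]


print(height_group(47))    # A: less than 4 feet
print(height_group(48))    # B: 4 feet or more
print(height_group(52.5))  # C: at least 4 ft 3 in, under 4 ft 6 in
```

Because each boundary belongs to the group above it, every student ends up in exactly one group, which is the "one, and only one" requirement stated above.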
Consider another example: Suppose that we had a set of numbers (which I have put in order to make things simpler):
1 1 2 3 3 4 5 7 8 10 10 11 12 15 19 19 22 23 25
That is, we could place all the numbers from low to high, say, in a text string from left to right.
If we wanted three categories we might partition them as follows by assigning obvious breaks between categories:
1 1 2 3 3 4 5 | 7 8 10 10 11 12 15 | 19 19 22 23 25
Category 1    | Category 2         | Category 3
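The same three-way partition can be reproduced in Python as a rough sketch. The break points are taken from the example (the first category ends at 5, the second at 15); the names classify and upper_bounds are illustrative choices, not part of the original discussion.

```python
from bisect import bisect_left

values = [1, 1, 2, 3, 3, 4, 5, 7, 8, 10, 10, 11, 12, 15, 19, 19, 22, 23, 25]

# Largest value allowed in category 1 and in category 2.
# Anything above the last bound falls into the final category.
upper_bounds = [5, 15]


def classify(values, upper_bounds):
    """Split values into len(upper_bounds) + 1 categories, low to high."""
    groups = [[] for _ in range(len(upper_bounds) + 1)]
    for v in values:
        # bisect_left counts the bounds strictly below v,
        # so a value equal to a bound stays in the lower category.
        groups[bisect_left(upper_bounds, v)].append(v)
    return groups


for number, group in enumerate(classify(values, upper_bounds), start=1):
    print(f"Category {number}: {group}")
```

Changing upper_bounds changes the categories, which is exactly the choice the rest of this discussion is concerned with.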
Here the simplicity ends. The goal is to arrange things so that humans can best understand the nature of whatever is being studied. ...