Appendix 3.A Descriptive Characteristics of Grouped Data
Let us assume that a data set has been summarized in the form of an absolute frequency distribution involving classes of a variable X and absolute class frequencies fj (see Chapter 2). This type of data set appears in Table 3.A.1. Quite often one is faced with analyzing a set of observations presented in this format. (For instance, some data sets contain proprietary information and the owner will only agree to present the data in summary form and not release the individual observation values. Or maybe the data set is just too large to have all of its observations printed in some report or document.) Given that the individual observations lose their identity in the grouping process, can we still find the mean, median, mode, standard deviation, and quantiles of X? We can, but not by using the formulas presented earlier in this chapter. In fact, we can only get “approximations” to the mean, median, and so on. However, as we shall soon see, these approximations are quite good. In what follows, we shall assume that we are dealing with a sample of size n.
Table 3.A.1 Absolute Frequency Distribution.
| Classes of X | fj |
| 20–29 | 3 |
| 30–39 | 7 |
| 40–49 | 8 |
| 50–59 | 12 |
| 60–69 | 10 |
| 70–79 | 6 |
| 80–89 | 4 |
| Total | 50 |
3.A.1 The Arithmetic Mean
To approximate the sample mean $\bar{X}$, let us use the formula

$$\bar{X} \approx \frac{\sum_j m_j f_j}{n},$$

where mj is the class mark (midpoint) of the jth class and ...
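As a rough illustration, the grouped mean for the data of Table 3.A.1 can be worked out with a short Python sketch. It assumes the usual grouped-data approximation, in which every observation in a class is treated as if it sat at the class mark, and it reads the fourth class as 50–59, as the neighbouring classes imply. The names `classes` and `grouped_mean` are ours and do not come from the text.

```python
# Classes of X from Table 3.A.1 as (lower limit, upper limit, absolute frequency f_j).
# Assumption: the fourth class is 50-59, as the neighbouring classes imply.
classes = [
    (20, 29, 3),
    (30, 39, 7),
    (40, 49, 8),
    (50, 59, 12),
    (60, 69, 10),
    (70, 79, 6),
    (80, 89, 4),
]


def grouped_mean(classes):
    """Approximate the sample mean of grouped data as sum(m_j * f_j) / n."""
    # n is the sample size, i.e. the sum of the absolute class frequencies.
    n = sum(f for _, _, f in classes)
    # The class mark m_j is the midpoint of the class limits.
    weighted_sum = sum(((lo + hi) / 2) * f for lo, hi, f in classes)
    return weighted_sum / n


n = sum(f for _, _, f in classes)
print(n)                      # 50
print(grouped_mean(classes))  # 55.1
```

Here the class marks are 24.5, 34.5, ..., 84.5, the weighted sum of class marks is 2755, and the approximate mean is 2755/50 = 55.1.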