Appendix 3.A Descriptive Characteristics of Grouped Data

Let us assume that a data set has been summarized in the form of an absolute frequency distribution involving classes of a variable X and absolute class frequencies fj (see Chapter 2). This type of data set appears in Table 3.A.1. Quite often one is faced with analyzing a set of observations presented in this format. (For instance, some data sets contain proprietary information and the owner will only agree to present the data in summary form and not release the individual observation values. Or maybe the data set is just too large to have all of its observations printed in some report or document.) Given that the individual observations lose their identity in the grouping process, can we still find the mean, median, mode, standard deviation, and quantiles of X? We can, but not by using the formulas presented earlier in this chapter. In fact, we can only get “approximations” to the mean, median, and so on. However, as we shall soon see, these approximations are quite good. In what follows, we shall assume that we are dealing with a sample of size n.

Table 3.A.1 Absolute Frequency Distribution.

Classes of X    fj
20–29            3
30–39            7
40–49            8
50–59           12
60–69           10
70–79            6
80–89            4
Total           50

3.A.1 The Arithmetic Mean

To approximate the sample mean $\bar{X}$, let us use the formula

$$\bar{X} \approx \frac{\sum_{j} m_j f_j}{n} \tag{3.A.1}$$

where mj is the class mark (midpoint) of the jth class and ...
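
As a rough illustration, the sketch below applies the class-mark approximation to the data of Table 3.A.1. It assumes that (3.A.1) is the usual grouped-data mean (the sum of the products mj·fj divided by n), that each class mark is the average of the class limits, and that the fourth class of the table is 50–59. The names `classes`, `freqs`, and `grouped_mean` are illustrative only and do not come from the text.

```python
# Class limits and absolute frequencies taken from Table 3.A.1.
# Assumption: the fourth class is 50-59.
classes = [(20, 29), (30, 39), (40, 49), (50, 59), (60, 69), (70, 79), (80, 89)]
freqs = [3, 7, 8, 12, 10, 6, 4]


def grouped_mean(classes, freqs):
    """Approximate the mean of grouped data using class marks (midpoints)."""
    n = sum(freqs)  # sample size = sum of the class frequencies
    # Class mark m_j: midpoint of the lower and upper class limits.
    marks = [(lower + upper) / 2 for lower, upper in classes]
    # Weight each class mark by its frequency, then divide by n.
    return sum(m * f for m, f in zip(marks, freqs)) / n


n = sum(freqs)
mean_approx = grouped_mean(classes, freqs)
print(f"n = {n}, approximate mean = {mean_approx:.1f}")
```

Under these assumptions the class marks are 24.5, 34.5, ..., 84.5, the weighted sum is 2755, and the approximate mean is 2755/50 = 55.1.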
