11.3 Empirical distributions for grouped data

For grouped data, the empirical distribution as defined previously cannot be constructed. It can, however, be approximated. The strategy is to obtain values of the empirical distribution function wherever possible and then connect those values in some reasonable way. For grouped data, the distribution function is usually approximated by connecting the points with straight lines. For notation, let the group boundaries be c₀ < c₁ < ··· < c_k, where often c₀ = 0 and c_k = ∞. The number of observations falling between c_{j−1} and c_j is denoted n_j, with Σ_{j=1}^{k} n_j = n. For such data, we are able to determine the empirical distribution at each group boundary. That is, F_n(c_j) = (1/n) Σ_{i=1}^{j} n_i. Note that no rule is proposed for observations that fall on a group boundary. There is no correct approach, but whichever approach is used, observations should be assigned to groups consistently. Note that in Data Set C it is not possible to tell how the assignments were made. If we had that knowledge, it would not affect any subsequent calculations.²
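As a minimal sketch of the boundary calculation, the following Python snippet takes the value at each boundary c_j to be the cumulative count through group j divided by the total number of observations n, with the value at c₀ equal to zero. The function name and the boundaries and counts shown are hypothetical illustrations, not values from the text or from Data Set C.

```python
from itertools import accumulate


def empirical_cdf_at_boundaries(counts):
    """Return [F_n(c_0), ..., F_n(c_k)] given group counts n_1, ..., n_k."""
    n = sum(counts)
    if n == 0:
        raise ValueError("at least one observation is required")
    # F_n(c_0) = 0, and F_n(c_j) = (n_1 + ... + n_j) / n for j = 1, ..., k.
    return [0.0] + [running / n for running in accumulate(counts)]


# Hypothetical grouped data (illustrative only).
boundaries = [0, 1000, 5000, 10000, float("inf")]
counts = [20, 50, 20, 10]  # n_j = observations between c_{j-1} and c_j

cdf_values = empirical_cdf_at_boundaries(counts)
for c, f in zip(boundaries, cdf_values):
    print(f"F_n({c}) = {f}")
```

Only the counts enter the calculation. How observations lying exactly on a boundary were assigned is already reflected in the counts themselves.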

Definition 11.8 For grouped data, the distribution function obtained by connecting the values of the empirical distribution function at the group boundaries with straight lines is called the ogive ...


Loss Models: From Data to Decisions, 4th Edition by Stuart A. Klugman, Harry H. Panjer, Gordon E. Willmot
