# 2 Summarizing Statistical Data

In this chapter, we explore some of the procedures available in *R* to summarize statistical data, and we give some examples of writing programs.

## 2.1 Measures of Central Tendency

Measures of central tendency are typical or central points in the data. The most commonly used are the mean and the median.

**Mean:** The mean is the sum of all values divided by the number of cases, excluding the missing values.
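As a minimal sketch of this definition, the vector `x_demo` below is made up for illustration and is not part of the book's data. Note that `mean` in *R* returns `NA` when the data contain missing values unless `na.rm = TRUE` is supplied, so the exclusion of missing values has to be requested explicitly.

```r
# Hypothetical data with one missing value (not from Example 1.1)
x_demo <- c(4, 8, NA, 10)

# By default the missing value propagates to the result
mean(x_demo)                  # [1] NA

# na.rm = TRUE drops the missing value before averaging
mean_demo <- mean(x_demo, na.rm = TRUE)
mean_demo                     # [1] 7.333333

# The same result from the definition: sum of the observed values
# divided by the number of non-missing cases
manual_demo <- sum(x_demo, na.rm = TRUE) / sum(!is.na(x_demo))
manual_demo
```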

To obtain the mean of the data in Example 1.1 stored in *downtime*, write

`mean(downtime)`

`[1] 25.04348`

So the average downtime of all the computers in the laboratory is just over 25 minutes.

Returning to the original data in Exercise 1.1, stored in *marks*, we obtain the mean by writing

`mean(marks)`

which gives

`[1] 57.44`

To obtain the mean marks for females, write

`mean(marks[1:23])`

`[1] 65.86957`

For males,

`mean(marks[24:50])`

`[1] 50.25926`

illustrating that the female average is substantially higher than the male average.
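The index ranges above rely on the females occupying positions 1 to 23 and the males positions 24 to 50. If a grouping variable is available, the group means can also be computed without knowing the positions. The sketch below uses made-up vectors `marks_demo` and `gender_demo`, which are assumptions for illustration and not the data from Exercise 1.1.

```r
# Hypothetical marks and gender labels (not the Exercise 1.1 data)
marks_demo  <- c(70, 62, 81, 45, 53, 49)
gender_demo <- factor(c("F", "F", "F", "M", "M", "M"))

# tapply applies mean to marks_demo separately within each level of gender_demo
group_means <- tapply(marks_demo, gender_demo, mean)
group_means                   # F: 71, M: 49

# Logical indexing gives the same result for a single group
female_mean <- mean(marks_demo[gender_demo == "F"])
female_mean                   # [1] 71
```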

To obtain the mean of the corrected data in Exercise 1.1, recall that the mark of 86 for the 34th student on the list was an error, and that it should have been 46. We changed it with

`marks[34] <- 46`

The new overall average is

`mean(marks)`

`[1] 56.64`

and the new male average is

`mean(marks[24:50])`

`[1] 48.77778`

increasing the gap between the male and female averages even further.
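These corrected averages can be checked by hand: changing one value by a fixed amount shifts the mean by that amount divided by the number of cases. The sketch below uses only the figures quoted in the text, with group sizes of 50 students overall and 27 males (positions 24 to 50). The variable names are chosen here for illustration.

```r
# Size of the correction: the mark of 86 should have been 46
error <- 86 - 46

# Group sizes: all students, and the males in positions 24 to 50
n_all  <- 50
n_male <- length(24:50)       # 27

# Each mean drops by the error divided by the number of cases
new_overall <- 57.44 - error / n_all
new_male    <- 50.25926 - error / n_male

new_overall                   # [1] 56.64
round(new_male, 5)            # [1] 48.77778
```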

If we perform a similar operation for the variables in the examination data given ...
