2 Summarizing Statistical Data
In this chapter, we explore some of the procedures available in R to summarize statistical data, and we give some examples of writing programs.
2.1 Measures of Central Tendency
Measures of central tendency are typical or central points in the data. The most commonly used are the mean and the median.
Mean: The mean is the sum of all values divided by the number of cases, excluding the missing values.
To obtain the mean of the data in Example 1.1 stored in downtime, write
mean(downtime)
[1] 25.04348
So the average downtime of all the computers in the laboratory is just over 25 minutes.
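The definition of the mean can be checked directly in R. The following is a minimal sketch using a small hypothetical vector x (not the downtime data from Example 1.1). It illustrates that mean returns NA when a value is missing unless na.rm = TRUE is supplied, and that the result then agrees with the sum of the non-missing values divided by the number of non-missing cases.

```r
# Hypothetical values for illustration only; not the data from Example 1.1
x <- c(10, 20, NA, 30)

# By default a missing value propagates, so this returns NA
mean(x)

# na.rm = TRUE drops the missing value before averaging
m <- mean(x, na.rm = TRUE)

# The same calculation from the definition: sum of the non-missing
# values divided by the number of non-missing cases
by_hand <- sum(x, na.rm = TRUE) / sum(!is.na(x))

m
by_hand
```

Both m and by_hand are computed from the three non-missing values only.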
Going back to the original data in Exercise 1.1 stored in marks, to obtain the mean, write
mean(marks)
which gives
[1] 57.44
To obtain the mean marks for females, write
mean(marks[1:23])
[1] 65.86957
For males,
mean(marks[24:50])
[1] 50.25926
illustrating that the female average is substantially higher than the male average.
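The index ranges 1:23 and 24:50 work only because the marks are listed with the females first. An alternative, sketched here under stated assumptions, is to hold the group labels in a factor and let tapply compute the mean within each group. The names scores and gender and the values below are hypothetical, chosen only to illustrate the call.

```r
# Hypothetical marks and group labels, for illustration only
scores <- c(70, 60, 80, 50, 40, 45)
gender <- factor(c("F", "F", "F", "M", "M", "M"))

# tapply applies mean to the scores within each level of gender
group_means <- tapply(scores, gender, mean)
group_means
```

For the marks data, and assuming the ordering implied by the index ranges above (23 females followed by 27 males), the labels could be built with factor(rep(c("F", "M"), c(23, 27))) and passed to tapply in the same way.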
To obtain the mean of the corrected data in Exercise 1.1, recall that the mark of 86 for the 34th student on the list was an error, and that it should have been 46. We changed it with
marks[34] <- 46
The new overall average is
mean(marks)
[1] 56.64
and the new male average is
mean(marks[24:50])
[1] 48.77778
increasing the gap between the male and female averages even further.
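These corrected averages can be verified by arithmetic alone. Changing one mark from 86 to 46 lowers the total by 40, so each mean falls by 40 divided by the number of cases it is based on. The sketch below uses only the figures quoted above; the variable names are illustrative.

```r
n_all  <- 50             # students in total (positions 1 to 50)
n_male <- 27             # males occupy positions 24 to 50
correction <- 86 - 46    # amount by which the sum of the marks falls

# Each mean drops by the correction divided by its own number of cases
new_overall <- 57.44 - correction / n_all
new_male    <- 50.25926 - correction / n_male

new_overall
new_male
```

The results agree with the values returned by mean(marks) and mean(marks[24:50]) after the correction, up to the rounding of the printed male average. The female mean is unchanged because the corrected mark belongs to a male student.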
If we perform a similar operation for the variables in the examination data given ...