September 2019
Beginner to intermediate
208 pages
3h 17m
English
Data visualization is an important part of the data science process. Here we learn how to create visualizations in both languages, SAS and R.
The Anscombe dataset shows the importance of data visualization. On statistical examination, its four sets of x and y values appear nearly identical, sharing the summary statistics listed here. But when plotted, the four sets turn out to be very different.
| Property | Value |
| --- | --- |
| Mean of x | 9 |
| Sample variance of x | 11 |
| Mean of y | 7.50 |
| Sample variance of y | 4.125 |
| Correlation between x and y | 0.816 |
| Linear regression line | y = 3.00 + 0.500x |
| Coefficient of determination of the linear regression | 0.67 |
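The values in the table can be checked directly in R. The following is a minimal sketch, assuming the built-in `anscombe` data frame from base R's `datasets` package (columns `x1` to `x4` and `y1` to `y4`); the helper name `summarise_pair` and the result name `quartet_stats` are illustrative and not taken from the chapter.

```r
# Anscombe's quartet ships with base R (datasets package) as `anscombe`,
# with columns x1..x4 and y1..y4.
data(anscombe)

# Summarise one x-y pair; `i` selects the dataset (1 to 4).
# `summarise_pair` is a hypothetical helper name used only for this sketch.
summarise_pair <- function(i) {
  x <- anscombe[[paste0("x", i)]]
  y <- anscombe[[paste0("y", i)]]
  fit <- lm(y ~ x)  # simple linear regression of y on x
  c(
    mean_x    = mean(x),
    var_x     = var(x),   # sample variance (n - 1 denominator)
    mean_y    = mean(y),
    var_y     = var(y),
    cor_xy    = cor(x, y),
    intercept = unname(coef(fit)[1]),
    slope     = unname(coef(fit)[2]),
    r_squared = summary(fit)$r.squared
  )
}

# One row per dataset, one column per statistic.
quartet_stats <- t(sapply(1:4, summarise_pair))
rownames(quartet_stats) <- paste0("set", 1:4)
print(round(quartet_stats, 3))
```

Each row of `quartet_stats` should agree with the table to roughly two or three decimal places, which is what makes the four datasets look interchangeable when only the numbers are examined.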
The graphs of the four datasets, however, are quite different.
Figure 9.1 Anscombe Dataset in R.
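A figure of this kind can be drawn with base R graphics. The sketch below is one possible way to do it, not necessarily how Figure 9.1 was produced; it assumes the built-in `anscombe` data frame, and the colours, axis limits, and panel titles are arbitrary choices made for illustration.

```r
data(anscombe)

# Arrange four panels in a 2 x 2 grid; par() returns the previous settings
# so they can be restored afterwards.
old_par <- par(mfrow = c(2, 2), mar = c(4, 4, 2, 1))

for (i in 1:4) {
  x <- anscombe[[paste0("x", i)]]
  y <- anscombe[[paste0("y", i)]]
  plot(x, y,
       xlim = c(3, 19), ylim = c(3, 13),  # shared axes keep panels comparable
       pch = 19, col = "steelblue",
       xlab = paste0("x", i), ylab = paste0("y", i),
       main = paste("Dataset", i))
  abline(lm(y ~ x), col = "red")  # fitted regression line for this panel
}

# Restore the graphics settings that were in place before plotting.
par(old_par)
```

Using the same axis limits in every panel makes it easier to see that the fitted lines are almost identical while the point patterns are not.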
We are going to create the following graphs in this chapter in both SAS and R: