9 Data Visualization
Data visualization is an important part of the data science process. In this chapter we learn how to create data visualizations in both SAS and R.
9.1 Importance of Data Visualization
The anscombe dataset (Anscombe's quartet) shows why data visualization matters. Its four sets of x and y values have nearly identical summary statistics, yet they look very different when plotted.
Property | Value |
Mean of x | 9 |
Sample variance of x | 11 |
Mean of y | 7.50 |
Sample variance of y | 4.125 |
Correlation between x and y | 0.816 |
Linear regression line | y = 3.00 + 0.500x |
Coefficient of determination of the linear regression | 0.67 |
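The values in the table can be checked directly in R. The following is a minimal sketch, assuming the `anscombe` data frame from base R's datasets package (columns `x1` to `x4` and `y1` to `y4`); the helper name `summarise_pair` and the output column names are illustrative and do not come from the book.

```r
# Load the built-in Anscombe data frame (columns x1..x4, y1..y4).
data(anscombe)

# Hypothetical helper: summary statistics for the i-th (x, y) pair.
summarise_pair <- function(i) {
  x <- anscombe[[paste0("x", i)]]
  y <- anscombe[[paste0("y", i)]]
  fit <- lm(y ~ x)  # simple linear regression of y on x
  data.frame(
    set       = i,
    mean_x    = mean(x),
    var_x     = var(x),                  # sample variance (n - 1 denominator)
    mean_y    = mean(y),
    var_y     = var(y),
    cor_xy    = cor(x, y),               # Pearson correlation
    intercept = unname(coef(fit)[1]),
    slope     = unname(coef(fit)[2]),
    r_squared = summary(fit)$r.squared   # coefficient of determination
  )
}

# One row per dataset, stacked into a single data frame.
anscombe_summary <- do.call(rbind, lapply(1:4, summarise_pair))
print(round(anscombe_summary, 3))
```

Each row should agree with the table to within rounding, which is the point: summary statistics alone cannot tell the four sets apart.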
But the graphs are quite different.
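To see the difference, the four sets can be drawn side by side. This is a rough sketch using base R graphics and the built-in `anscombe` data frame; the function name `plot_anscombe`, the axis limits, and the panel titles are assumptions chosen for illustration, not the book's own code.

```r
# Load the built-in Anscombe data frame (columns x1..x4, y1..y4).
data(anscombe)

# Hypothetical helper: draw the four scatter plots in a 2 x 2 grid,
# each with its fitted regression line, and return the fitted models.
plot_anscombe <- function() {
  old_par <- par(mfrow = c(2, 2), mar = c(4, 4, 2, 1))
  on.exit(par(old_par))  # restore the previous graphics settings on exit
  fits <- lapply(1:4, function(i) {
    x <- anscombe[[paste0("x", i)]]
    y <- anscombe[[paste0("y", i)]]
    fit <- lm(y ~ x)
    # Shared axis limits make the four panels directly comparable.
    plot(x, y, xlim = c(3, 20), ylim = c(2, 14), pch = 19,
         main = paste("Dataset", i),
         xlab = paste0("x", i), ylab = paste0("y", i))
    abline(fit, col = "blue")  # nearly the same line in every panel
    fit
  })
  invisible(fits)
}

fits <- plot_anscombe()
```

The regression line is almost identical in every panel, while the point patterns around it differ.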
We are going to do the following graphs in this chapter for both SAS and R:
- Bar Plot: A bar chart represents data as vertical bars, with the height of each bar proportional to the value of the variable.
- Bar‐Line Plot: A combination of Bar Plots with Line Graphs, with one quantity being represented in a Bar Plot and the other in a Line Graph.
- Box Plot: A plot in which a rectangle is drawn to represent the second and third quartiles, usually with a vertical line inside to indicate ...