Relationships between variables – scatterplots

The real power of data analysis emerges when we study how different variables relate to one another. At the end of the previous section, we related income and gender, that is, a quantitative variable and a categorical one. In this section, we will investigate scatterplots, which represent graphically the relationship between two quantitative variables.

To illustrate how Pandas can be used to explore the relationship between two variables, we will use an important example from the history of astronomy. In 1929, astronomer Edwin Hubble published a landmark paper reporting an approximately linear relationship between the distance and velocity of extragalactic nebulae. ...
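
As a minimal sketch of what a scatterplot looks like in Pandas, the following code plots two quantitative variables against each other. The data is synthetic and generated only for illustration; it is not Hubble's data, and the column names `x` and `y`, the slope, and the noise level are all assumptions made for this example.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so the sketch runs without a display

import numpy as np
import pandas as pd

# Synthetic data (an assumption for illustration, not real measurements):
# y depends roughly linearly on x, with random noise added.
rng = np.random.default_rng(0)
x = rng.uniform(0.0, 10.0, size=30)
y = 2.0 * x + rng.normal(0.0, 1.0, size=30)
df = pd.DataFrame({"x": x, "y": y})

# Each row becomes one point: x on the horizontal axis, y on the vertical axis.
ax = df.plot.scatter(x="x", y="y", title="Synthetic data with a linear trend")

# The Pearson correlation coefficient summarizes how close
# the points lie to a straight line.
r = df["x"].corr(df["y"])
print(f"correlation: {r:.3f}")
```

In an interactive session, the backend line can be dropped and the figure displayed with `matplotlib.pyplot.show()`. Points that cluster around a straight line, together with a correlation close to 1 or -1, suggest an approximately linear relationship between the two variables.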
