Chapter 6. Figures and Plots

Creating figures and plots is an important step in many analytics projects. It usually happens either in the exploratory data analysis (EDA) phase at the beginning of a project or in the reporting phase, where you make your analysis useful to others. Data visualization enables you to see your variables’ distributions, to see the relationships between your variables, and to check your modeling assumptions.

There are several plotting packages for Python, including matplotlib, pandas, ggplot, and seaborn. Because matplotlib is the most established package—and provides some of the underlying plotting concepts and syntax for the pandas and seaborn packages—we’ll cover it first. Then we’ll see some examples of how the other packages either simplify the plotting syntax or provide additional functionality.

matplotlib

matplotlib is a plotting package designed to create publication-quality figures. It has functions for creating common statistical graphs, including bar plots, box plots, line plots, scatter plots, and histograms. It also has add-in toolkits such as basemap and cartopy for mapping and mplot3d for 3D plotting.
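As a rough illustration of a few of the graph types just listed, the following minimal sketch draws a histogram, a scatter plot, and a box plot side by side. The data are randomly generated for illustration only, and the variable names and the output filename `basic_plots.png` are assumptions rather than anything from the book's own examples.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so the script runs without a display
import matplotlib.pyplot as plt
import numpy as np

# Made-up data for illustration only
rng = np.random.default_rng(seed=0)
values = rng.normal(loc=50, scale=10, size=200)
x = np.arange(10)
y = 2 * x + rng.normal(scale=1.5, size=10)

# One figure with three side-by-side axes
fig, (ax_hist, ax_scatter, ax_box) = plt.subplots(1, 3, figsize=(12, 4))

ax_hist.hist(values, bins=20)   # distribution of a single variable
ax_hist.set_title("Histogram")

ax_scatter.scatter(x, y)        # relationship between two variables
ax_scatter.set_title("Scatter plot")

ax_box.boxplot(values)          # median, quartiles, and outliers
ax_box.set_title("Box plot")

fig.tight_layout()
fig.savefig("basic_plots.png")  # hypothetical output filename
```

Bar plots and line plots follow the same pattern with `ax.bar(...)` and `ax.plot(...)`.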

matplotlib provides functions for customizing each component of a figure. For example, it enables you to specify the shape and size of the figure, the limits and scales of the x- and y-axes, the tick marks and labels of the x- and y-axes, the legend, and the title for the figure. You can learn more about customizing figures by perusing the matplotlib ...
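The components mentioned above (figure size, axis limits and scales, tick marks and labels, legend, and title) can each be set explicitly. The sketch below is one possible way to do so, not the book's own example; the sales figures, labels, and the filename `customized_figure.png` are hypothetical.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so the script runs without a display
import matplotlib.pyplot as plt

# Hypothetical data for illustration only
years = [2012, 2013, 2014, 2015]
sales = [120, 340, 900, 2500]

# Shape and size of the figure, in inches
fig, ax = plt.subplots(figsize=(8, 5))
ax.plot(years, sales, marker="o", label="Sales (made-up)")

# Scales and limits of the x- and y-axes
ax.set_yscale("log")
ax.set_xlim(2011, 2016)
ax.set_ylim(100, 10000)

# Tick marks and tick labels on the x-axis
ax.set_xticks(years)
ax.set_xticklabels([str(year) for year in years])

# Axis labels, title, and legend
ax.set_xlabel("Year")
ax.set_ylabel("Units sold (log scale)")
ax.set_title("Customizing the components of a figure")
ax.legend(loc="upper left")

fig.savefig("customized_figure.png", dpi=150)  # hypothetical output filename
```

Working through the `Axes` object (`ax.set_...`) rather than the module-level `plt` functions keeps each setting tied to a specific plot, which helps once a figure contains more than one set of axes.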
