EDA is perhaps one of the most important steps in the machine learning workflow, and pandas makes it extremely easy to visualize data in Python. pandas provides a high-level API for the popular matplotlib library, which makes it easy to construct plots directly from DataFrames.
As an example, let's visualize the Iris dataset using pandas to uncover important insights. Let's plot a scatterplot to visualize how sepal_width is related to sepal_length. We can construct a scatterplot easily using the DataFrame.plot.scatter() method, which is built into all DataFrames:
# Define marker shapes by classimport matplotlib.pyplot as pltmarker_shapes = ['.', '^', '*']# Then, plot the scatterplotax = plt.axes()for i, species ...