EDA is among the first tasks we perform when starting any ML project. As discussed in the section on CRISP-DM, data understanding is an important step: it uncovers insights about the data and helps us better understand the business requirements and context.
In this section, we will work with a real dataset and perform EDA using pandas as our data manipulation library, coupled with seaborn for visualization. Complete code snippets and details for this analysis are available in the Jupyter notebook game_of_thrones_eda.ipynb.
We begin by importing the required libraries and setting up the configuration, as shown in the following snippet:
In [1]: import numpy as np
   ...: import pandas as pd
   ...: from ...
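Since the snippet above is abbreviated, the following is a minimal sketch of what a typical setup for a pandas and seaborn EDA session might look like. It is not the notebook's actual configuration: the display option, the plot style, the toy DataFrame, and the helper `quick_overview` are all assumptions introduced here for illustration.

```python
import numpy as np
import pandas as pd

# seaborn is only needed for plotting; guard the import so the rest of
# the sketch still runs where it is not installed (an assumption, not
# something the original notebook necessarily does).
try:
    import seaborn as sns
    sns.set_style("whitegrid")  # illustrative style choice
except ImportError:
    sns = None

# Show more columns when printing wide DataFrames (illustrative value).
pd.set_option("display.max_columns", 50)

# Hypothetical toy data standing in for the real dataset.
toy_df = pd.DataFrame({
    "name": ["A", "B", "C", "D"],
    "house": ["X", "Y", "X", None],
    "age": [34.0, np.nan, 19.0, 52.0],
})


def quick_overview(df):
    """Return per-column dtype, missing-value count, and distinct-value count."""
    return pd.DataFrame({
        "dtype": df.dtypes.astype(str),
        "missing": df.isna().sum(),
        "unique": df.nunique(),  # NaN/None are not counted as values
    })


overview = quick_overview(toy_df)
print(overview)
```

A per-column overview like this is often a convenient first look at a new dataset, because it shows at a glance which attributes have missing values and which are likely categorical.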