Exploratory data analysis

Let's dive into the dataset to understand the kind of data we are working with. We import the dataset into pandas:

import pandas as pd
df = pd.read_csv('diabetes.csv')
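
Before looking at individual rows, it can help to confirm that the file loaded with the expected shape and column names. The following is a minimal sketch, not part of the original walkthrough: it reads a tiny in-memory CSV whose values are made up and which carries only three of the column names, standing in for diabetes.csv. The names sample_csv and sample_df are hypothetical.

```python
import io

import pandas as pd

# Made-up rows standing in for diabetes.csv. The values are placeholders,
# not real measurements, and only three of the columns are included.
sample_csv = io.StringIO(
    "Pregnancies,Glucose,BloodPressure\n"
    "1,100,70\n"
    "0,120,80\n"
    "3,90,60\n"
)

# read_csv accepts a file-like object as well as a path.
sample_df = pd.read_csv(sample_csv)

# shape is a (rows, columns) tuple; columns holds the header names.
n_rows, n_cols = sample_df.shape
column_names = list(sample_df.columns)

print(n_rows, n_cols)
print(column_names)
print(sample_df.dtypes)  # inferred type of each column
```

With the real file, the same checks can be run on the df returned by pd.read_csv('diabetes.csv').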

Let's take a quick look at the first five rows of the dataset by calling df.head():

print(df.head())

We get the following output:

It looks like there are nine columns in the dataset, which are as follows:

  • Pregnancies: Number of previous pregnancies 
  • Glucose: Plasma glucose concentration
  • BloodPressure: Diastolic blood pressure
  • SkinThickness: Skin fold thickness measured from the triceps
  • Insulin: Blood serum insulin concentration
  • BMI: Body ...

