- Import the required Python libraries:
import pandas as pd
import matplotlib.pyplot as plt
- Let's load a few variables from the dataset into a pandas dataframe and inspect the first five rows:
cols = ['AGE', 'NUMCHLD', 'INCOME', 'WEALTH1', 'MBCRAFT', 'MBGARDEN', 'MBBOOKS', 'MBCOLECT', 'MAGFAML', 'MAGFEM', 'MAGMALE']
data = pd.read_csv('cup98LRN.txt', usecols=cols)
data.head()
After loading the dataset, this is what the output of head() looks like when we run it in a Jupyter Notebook:
- Let's calculate the number of missing values in each variable:
data.isnull().sum()
The number ...
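To try the loading and counting steps without the original file, here is a minimal sketch on a tiny in-memory CSV. The values, the extra STATE column, and the names csv_text and toy are made up for illustration and are not taken from cup98LRN.txt. It also shows the fraction of missing values per variable, which is often easier to compare across columns than raw counts.

```python
import io

import pandas as pd

# Made-up stand-in for the real file; the values are illustrative only.
csv_text = """AGE,NUMCHLD,INCOME,STATE
60,,5,IL
,1,,CA
46,,6,NC
70,2,,CA
"""

# Load only the columns of interest, as in the recipe (STATE is skipped).
cols = ['AGE', 'NUMCHLD', 'INCOME']
toy = pd.read_csv(io.StringIO(csv_text), usecols=cols)

# Number of missing values in each variable.
missing_count = toy.isnull().sum()

# Fraction of missing values in each variable (the mean of a boolean mask).
missing_fraction = toy.isnull().mean()

print(missing_count)
print(missing_fraction)
```

Empty fields in the CSV are read as NaN by default, so isnull() flags them without any extra arguments.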