Let's take a use case and work through it in Python. We are going to use the Titanic data from Kaggle. The data has been split into two groups:
- Training set (train.csv)
- Test set (test.csv)
The data is about the passengers who traveled on the Titanic. It captures their features:
- pclass: Ticket class (1 = 1st, 2 = 2nd, 3 = 3rd)
- gender: Gender
- Age: Age in years
- sibsp: Number of siblings/spouses aboard the Titanic
- parch: Number of parents/children aboard the Titanic
- ticket: Ticket number
- fare: Passenger fare
- cabin: Cabin number
- embarked: Port of embarkation (C = Cherbourg, Q = Queenstown, S = Southampton)
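Before any modeling, it can help to load the two files and check which columns are actually present. The following is a minimal sketch, not a definitive loader. It assumes train.csv and test.csv sit in the current working directory and uses only the standard-library csv module. The helper names load_passengers and describe_port are hypothetical, introduced here for illustration. The lookup tables simply restate the codes from the feature list above.

```python
import csv
from pathlib import Path

# Code tables copied from the feature list above.
PCLASS_LABELS = {1: "1st", 2: "2nd", 3: "3rd"}
EMBARKED_PORTS = {"C": "Cherbourg", "Q": "Queenstown", "S": "Southampton"}


def load_passengers(path):
    """Read one CSV file into a list of dicts, one dict per passenger row."""
    with open(path, newline="", encoding="utf-8") as handle:
        return list(csv.DictReader(handle))


def describe_port(code):
    """Translate an embarkation code into a port name ('unknown' if absent)."""
    return EMBARKED_PORTS.get(code, "unknown")


if __name__ == "__main__":
    # Assumption: both files were downloaded into the working directory.
    for name in ("train.csv", "test.csv"):
        if Path(name).exists():
            rows = load_passengers(name)
            columns = list(rows[0].keys()) if rows else []
            print(f"{name}: {len(rows)} rows, columns: {columns}")
        else:
            print(f"{name}: not found, download it from Kaggle first")
```

Printing the column names first is a cheap way to confirm the exact spelling and capitalization used in the downloaded files before referring to them in later code.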
We need to build a model that predicts whether or not each passenger survived the sinking of the Titanic. Initially, ...