Example of the training and test set

Let's work through a use case in Python, using the Titanic data from Kaggle. The data has been split into two groups:

  • Training set (train.csv)
  • Test set (test.csv)
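As a minimal sketch of reading the two files, the snippet below uses only Python's standard csv module. The helper name load_rows and the assumption that train.csv and test.csv sit in the current working directory are illustrative choices, not something the text specifies.

```python
import csv
from pathlib import Path


def load_rows(path):
    """Read a CSV file into a list of dicts, one dict per passenger row."""
    with open(path, newline="", encoding="utf-8") as handle:
        return list(csv.DictReader(handle))


# Assumed locations: both Kaggle files are in the working directory.
train_path = Path("train.csv")
test_path = Path("test.csv")

# Only load when the files are actually present, so the sketch runs anywhere.
if train_path.exists() and test_path.exists():
    train_rows = load_rows(train_path)
    test_rows = load_rows(test_path)
    print(len(train_rows), "training rows;", len(test_rows), "test rows")
    if train_rows:
        print("columns:", list(train_rows[0].keys()))
```

Every value comes back as a string at this stage; converting columns to numbers is a separate step.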

The data describes the passengers who traveled on the Titanic and captures the following features:

  • pclass: Ticket class (1 = 1st, 2 = 2nd, 3 = 3rd)
  • gender: Gender
  • Age: Age in years
  • sibsp: Number of siblings/spouses aboard the Titanic
  • parch: Number of parents/children aboard the Titanic
  • ticket: Ticket number
  • fare: Passenger fare
  • cabin: Cabin number
  • embarked: Port of embarkation (C = Cherbourg, Q = Queenstown, S = Southampton)
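The feature list above can be mirrored as a typed record, which makes the meaning and type of each column explicit. This is only a sketch: the Passenger class, the from_row helper, and the choice to treat empty strings as missing values are assumptions made for illustration. The raw file's header may also use different capitalization or labels (for example Sex rather than gender), so the keys are lower-cased and both spellings are accepted.

```python
from dataclasses import dataclass
from typing import Optional

# Port codes as described in the feature list.
PORTS = {"C": "Cherbourg", "Q": "Queenstown", "S": "Southampton"}


def _opt_float(text):
    """Convert text to float, treating an empty string as missing (assumption)."""
    text = text.strip()
    return float(text) if text else None


@dataclass
class Passenger:
    pclass: int               # ticket class: 1, 2 or 3
    gender: str
    age: Optional[float]      # age in years, None if missing
    sibsp: int                # siblings/spouses aboard
    parch: int                # parents/children aboard
    ticket: str
    fare: Optional[float]
    cabin: Optional[str]
    embarked: Optional[str]   # full port name, None if unknown

    @classmethod
    def from_row(cls, row):
        # Lower-case the keys so "Pclass" and "pclass" are treated alike.
        r = {k.strip().lower(): (v or "") for k, v in row.items() if k}
        return cls(
            pclass=int(r["pclass"]),
            # Accept either label for this column (assumption about the header).
            gender=r.get("gender", r.get("sex", "")).strip(),
            age=_opt_float(r["age"]),
            sibsp=int(r["sibsp"]),
            parch=int(r["parch"]),
            ticket=r["ticket"].strip(),
            fare=_opt_float(r["fare"]),
            cabin=r["cabin"].strip() or None,
            embarked=PORTS.get(r["embarked"].strip().upper()),
        )
```

A row read with csv.DictReader can be passed straight to Passenger.from_row; a row that lacks one of the listed columns raises a KeyError, which surfaces header mismatches early.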

We need to build a model that predicts whether or not a passenger survived the sinking of the Titanic. Initially, ...
