Example of the training and test set

Let's take a use case and work it out in Python. We are going to use Titanic data from Kaggle. The data has been split into two groups:

  • Training set (train.csv)
  • Test set (test.csv)

The data is about the passengers who traveled on the Titanic. It captures their features:

  • pclass: Ticket class 1 = 1st, 2 = 2nd, 3 = 3rd
  • gender: Gender
  • Age: Age in years
  • sibsp: Number of siblings/spouses aboard the Titanic
  • parchNumber of parents/children aboard the Titanic
  • ticket: Ticket number
  • fare Passenger: Fare
  • cabin: Cabin number
  • embarked: Port of embarkation C = Cherbourg, Q = Queenstown, and S = Southampton

We have got to build the model to predict whether or not they survived the sinking of the Titanic. Initially, ...

Get Machine Learning Quick Reference now with the O’Reilly learning platform.

O’Reilly members experience books, live events, courses curated by job role, and more from O’Reilly and nearly 200 top publishers.