Effective Amazon Machine Learning by Alexis Perrier

Creating a titanic database

We are going to start from scratch and go back to the original Titanic dataset available at https://github.com/alexperrier/packt-aml/blob/master/ch4/original_titanic.csv. Follow these steps to prepare the CSV file:

  1. Open the original_titanic.csv file.
  2. Remove the header row.
  3. Remove the following punctuation characters from inside the field values, leaving the commas that separate fields in place: ,"().
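The steps above can also be scripted rather than done by hand. The following is a minimal sketch, not the author's procedure: it assumes the original file is a standard CSV in which fields containing commas are wrapped in double quotes, and that only the four characters , " ( ) are to be stripped from field values. The function name `clean_titanic_csv` is hypothetical.

```python
import csv
import io

# Characters to strip from inside field values (assumption: , " ( ) only).
PUNCTUATION = ',"()'


def clean_titanic_csv(text: str) -> str:
    """Drop the header row and strip PUNCTUATION from every field."""
    reader = csv.reader(io.StringIO(text))
    next(reader, None)  # step 2: skip the header row
    strip_table = str.maketrans("", "", PUNCTUATION)
    out = io.StringIO()
    for row in reader:
        # Step 3: remove the punctuation from each field value.
        cleaned = [field.translate(strip_table) for field in row]
        # No field contains a comma or a quote any more, so the fields
        # can be joined with plain commas and need no CSV quoting.
        out.write(",".join(cleaned) + "\n")
    return out.getvalue()
```

To apply it to the real data, read original_titanic.csv into a string, pass it to `clean_titanic_csv`, and write the result to titanic_for_athena.csv. The output has one fewer line than the input because the header is gone.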

The file should contain only data, not column names. The original file has 1309 rows, ordered by pclass and alphabetically by name. The resulting file is available at https://github.com/alexperrier/packt-aml/blob/master/ch4/titanic_for_athena.csv.

Next, create a new athena_data folder in the S3 bucket and upload the titanic_for_athena.csv file to it. Now go to the Athena console. ...
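The upload to S3 can be done in the console, or scripted. The sketch below assumes the boto3 library and configured AWS credentials; the helper name `upload_for_athena` and the bucket name in the usage comment are hypothetical. S3 has no real folders, so the athena_data "folder" is simply a prefix on the object key.

```python
import os


def upload_for_athena(s3_client, local_path: str, bucket: str,
                      folder: str = "athena_data") -> str:
    """Upload local_path to s3://bucket/folder/<file name>; return the key."""
    # The "folder" is only a key prefix; S3 creates it implicitly.
    key = f"{folder}/{os.path.basename(local_path)}"
    # upload_file(Filename, Bucket, Key) is the boto3 S3 client call.
    s3_client.upload_file(local_path, bucket, key)
    return key


# Usage (requires boto3 and AWS credentials; the bucket name is made up):
#   import boto3
#   upload_for_athena(boto3.client("s3"), "titanic_for_athena.csv",
#                     "my-aml-bucket")
```

Passing the client in as an argument keeps the helper easy to try out with a stand-in object before pointing it at a real bucket.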
