Effective Amazon Machine Learning by Alexis Perrier

Creating Datasources from Redshift

In this chapter, we will use SQL queries to handle non-linear datasets. Creating datasources from Redshift or RDS lets us apply SQL-based feature engineering upstream, before the datasource is created. We took a similar approach in Chapter 4, Loading and Preparing the Dataset, where we used the AWS Athena service to apply preliminary transformations to the data before creating the datasource. This let us expand the Titanic dataset with new features, such as the Deck number, and let us replace the Fare with its log and fill in missing values for the Age variable. The SQL transformations were simple, but they allowed us to expand the original dataset in a very flexible way. ...
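The kind of upstream SQL transformation described above can be sketched as follows. This is a minimal illustration, not the book's actual query: it uses an in-memory SQLite database as a stand-in for Athena or Redshift, the table and column names (`titanic`, `passenger_id`, `cabin`, `fare`, `age`) are assumptions, and the three rows are invented for the example. It also assumes the deck is the first character of the cabin value, fills missing ages with the mean age, and takes the log of `fare + 1` so that a zero fare does not fail.

```python
import math
import sqlite3

# Toy rows invented for illustration only (not the real Titanic data):
# (passenger_id, cabin, fare, age)
ROWS = [
    (1, "C85", 71.28, 38.0),
    (2, None, 7.25, None),
    (3, "E46", 51.86, 54.0),
]


def engineer_features(rows):
    """Apply SQL feature engineering and return the transformed rows."""
    conn = sqlite3.connect(":memory:")
    # Register a natural-log helper under a hypothetical name (py_ln),
    # since not every SQLite build ships math functions.
    conn.create_function(
        "py_ln", 1, lambda x: None if x is None else math.log(x)
    )
    conn.execute(
        "CREATE TABLE titanic "
        "(passenger_id INTEGER, cabin TEXT, fare REAL, age REAL)"
    )
    conn.executemany("INSERT INTO titanic VALUES (?, ?, ?, ?)", rows)

    query = """
        SELECT passenger_id,
               -- Deck: first character of the cabin, 'U' when unknown
               COALESCE(SUBSTR(cabin, 1, 1), 'U') AS deck,
               -- Log of the fare, shifted by 1 to tolerate a zero fare
               py_ln(fare + 1) AS log_fare,
               -- Missing ages replaced by the mean of the known ages
               COALESCE(age, (SELECT AVG(age) FROM titanic)) AS age
        FROM titanic
        ORDER BY passenger_id
    """
    result = conn.execute(query).fetchall()
    conn.close()
    return result


for row in engineer_features(ROWS):
    print(row)
```

The output of such a query, rather than the raw table, is what would then be used to create the datasource.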