Coupling variables

The Cartesian product transformation combines two categorical or text variables into one. Consider, for instance, a dataset of books that records the title and genre of each book. The title of a book is likely to be correlated with its genre, and creating a new title_genre variable would make that relation explicit.

Take the following four books with their titles and genres. Coupling each word in the title with the genre of the book attaches extra information to that word, information that the model can then use. The title_genre column in the following table illustrates this:

Title                      Genre   title_genre
All the Birds in the Sky   scifi   {all_scifi, birds_scifi, sky_scifi}
Robots and Empire ...
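
The coupling shown in the first row of the table can be sketched in a few lines of Python. This is only an illustration of the idea, not the way the service computes it: the couple function and the STOP_WORDS list are hypothetical names, and the stop words were chosen on the assumption that common words such as "the" and "in" are dropped before coupling, as the table row suggests.

```python
import re

# Hypothetical stop-word list, chosen only so the sketch reproduces the table row.
STOP_WORDS = {"a", "an", "and", "in", "of", "the"}


def couple(title, genre, stop_words=STOP_WORDS):
    """Pair every distinct non-stop word of the title with the genre."""
    # Lowercase the title and split it into alphanumeric tokens.
    tokens = re.findall(r"[a-z0-9]+", title.lower())
    kept = []
    for token in tokens:
        # Skip stop words and words already seen, preserving first-seen order.
        if token not in stop_words and token not in kept:
            kept.append(token)
    # Each remaining word is joined to the genre with an underscore.
    return [f"{token}_{genre}" for token in kept]


print(couple("All the Birds in the Sky", "scifi"))
# ['all_scifi', 'birds_scifi', 'sky_scifi']
```

Each coupled token, such as birds_scifi, is distinct from the same word coupled with another genre, which is how the new variable carries information that neither the title nor the genre carries alone.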
