Combining multiple variables – feature interaction

Among all the features of the click log data, some are very weak signals on their own. For example, gender by itself doesn't tell you much about whether someone will click an ad, and the device model by itself doesn't provide much information either. However, by combining multiple features, we can create a stronger synthesized signal. Feature interaction is introduced for this purpose. For numerical features, it usually generates new features by multiplying two or more of them together. We can also define whatever combination rules we want. For example, we can generate an additional feature, income/person, by dividing one original feature, household income, by another, household size.
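The income/person idea can be sketched in a few lines of plain Python. This is only an illustration: the records, the key names (`household_income`, `household_size`, `income_per_person`), and the helper functions are all hypothetical and do not come from the click log data. The second helper shows the more common multiplicative interaction.

```python
import math

# Hypothetical household records; the values are made up for illustration.
households = [
    {"household_income": 300, "household_size": 2},
    {"household_income": 100, "household_size": 1},
    {"household_income": 400, "household_size": 4},
    {"household_income": 300, "household_size": 5},
]


def add_income_per_person(record):
    """Return a copy of the record with a derived income/person feature."""
    enriched = dict(record)
    enriched["income_per_person"] = (
        record["household_income"] / record["household_size"]
    )
    return enriched


def add_product_interaction(record, names, new_name):
    """Return a copy of the record with the product of the named features."""
    enriched = dict(record)
    enriched[new_name] = math.prod(record[name] for name in names)
    return enriched


# Custom rule: divide income by household size.
with_ratio = [add_income_per_person(r) for r in households]

# Multiplicative rule: multiply the two original features together.
with_product = [
    add_product_interaction(
        r, ["household_income", "household_size"], "income_x_size"
    )
    for r in households
]

for row in with_ratio:
    print(row)
```

Both helpers return new dictionaries rather than modifying the input, so the original features remain available alongside the interaction feature.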

For categorical features, ...
