Scaling features
Many learning algorithms work better when features take similar ranges of values. In the previous section, we used two features: a binary-valued feature representing the person's sex and a continuous-valued feature representing the person's height in centimeters. Consider a dataset in which we have a man who is 170 cm tall and a woman who is 160 cm tall.
Which instance is closer to a query instance representing a man who is 164 cm tall? For our weight prediction problem, we would probably expect the query to be closer to the male instance; a 6 cm difference in height matters less for predicting weight than the difference between sexes. If we represent the height in millimeters, however, the query instance is closer to the 1600 mm tall female. If we represent ...
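The effect of the unit choice can be checked with a short sketch. It rests on assumptions that the text above does not state: distance is taken to be Euclidean, sex is encoded as 1 for male and 0 for female, and the helper names `euclidean` and `nearest_label` are hypothetical. Meters are included only as a contrasting unit chosen for this illustration.

```python
import math

def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def nearest_label(height_scale):
    """Return the label of the training instance nearest the query.

    Heights are given in centimeters and multiplied by height_scale
    to express them in another unit. Each instance is (sex, height),
    with the assumed encoding sex = 1 for male and 0 for female.
    """
    training = {
        "male": (1, 170 * height_scale),
        "female": (0, 160 * height_scale),
    }
    query = (1, 164 * height_scale)  # a 164 cm tall man
    return min(training, key=lambda label: euclidean(training[label], query))

nearest_mm = nearest_label(10)    # heights in millimeters
nearest_m = nearest_label(0.01)   # heights in meters (illustrative unit)
print("millimeters:", nearest_mm)
print("meters:", nearest_m)
```

In millimeters the height difference dominates the distance, so the nearest instance is the female; in meters the sex feature dominates, so the nearest instance is the male. The same three people produce different neighbors purely because of the unit in which one feature is expressed.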