Data binarization
One of the most basic transformations of raw count data is binarization, which assigns the value 1 to every count greater than 0 and the value 0 to all remaining cases. To see why binarization is useful, consider a predictive model whose goal is to predict user preferences from video visualizations. We could assess each user's preferences simply by counting how many times that user visualized each video; the problem, however, is that the order of magnitude of these counts varies with the habits of the individual user.
Therefore, the absolute value of the visualizations—that is, the ...
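As a minimal sketch of the rule described above (1 for every count greater than 0, 0 otherwise), the following NumPy snippet binarizes a small matrix of counts. The array `view_counts`, its values, and the helper `binarize` are hypothetical names and data chosen for illustration; they do not come from the text.

```python
import numpy as np

# Hypothetical visualization counts: one row per user, one column per video.
view_counts = np.array([
    [0, 12, 3],
    [250, 0, 1],
    [0, 0, 40],
])


def binarize(counts):
    """Return 1 where a count is greater than 0, and 0 everywhere else."""
    return (np.asarray(counts) > 0).astype(int)


binary_views = binarize(view_counts)
print(binary_views)
```

After binarization, a user with 250 visualizations of a video and a user with a single visualization are both encoded as 1, so the magnitude of each user's habits no longer affects the feature. If scikit-learn is available, `sklearn.preprocessing.Binarizer` with its default `threshold=0.0` should apply the same rule.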