Biclustering

In some scenarios, the dataset can be structured as a matrix in which the rows represent one category and the columns represent another. For example, suppose we have a set of feature vectors representing the preference (or rating) that each user expressed for a group of items. In this example, we can randomly create such a matrix, forcing about 50% of the ratings to be null (this is realistic, considering that a user rarely rates every available item):

import numpy as np

nb_users = 100
nb_products = 150
max_rating = 10

up_matrix = np.random.randint(0, max_rating + 1, size=(nb_users, nb_products))
mask_matrix = np.random.randint(0, 2, size=(nb_users, nb_products))

up_matrix *= mask_matrix

In this case, we are assuming that 0 means ...
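The sparsity of the generated matrix can be checked directly. The following is a minimal sketch that rebuilds the matrix with the same parameters and measures the fraction of entries equal to 0. The fixed seed and the name null_fraction are assumptions added here for reproducibility and illustration; they are not part of the original example. Because np.random.randint(0, max_rating + 1) can itself return 0, slightly more than half of the entries are expected to be 0 (1 - 0.5 * 10/11, roughly 54.5%).

```python
import numpy as np

# Assumption: a fixed seed, used here only to make the check reproducible
np.random.seed(1000)

nb_users = 100
nb_products = 150
max_rating = 10

# Random ratings in [0, max_rating], one row per user, one column per product
up_matrix = np.random.randint(0, max_rating + 1, size=(nb_users, nb_products))

# Binary mask: each entry is kept (1) or zeroed out (0) with equal probability
mask_matrix = np.random.randint(0, 2, size=(nb_users, nb_products))
up_matrix *= mask_matrix

# Fraction of entries equal to 0 (hypothetical helper name)
null_fraction = np.mean(up_matrix == 0)
print("Fraction of zero entries: {:.3f}".format(null_fraction))
```

If an exact proportion of null entries is required, the mask could instead be built by shuffling a fixed number of zeros and ones, but the random mask above is sufficient for a synthetic example.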
