Like most of the machine learning models we have encountered so far, k-means clustering requires numerical vectors as input. The same feature extraction and transformation approaches that we have seen for classification and regression are applicable for clustering.
Because k-means, like least squares regression, minimizes a squared error function as its optimization objective, it is sensitive to outliers and to features with large variance.
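To make this sensitivity concrete, here is a minimal sketch in plain Python. The feature names (age, income) and all of the values are made up for illustration and do not come from any dataset discussed here. It shows how a large-scale feature dominates the squared distance, and how a single outlier pulls a cluster mean toward itself.

```python
def squared_distance(a, b):
    """Squared Euclidean distance: the per-point term of the k-means objective."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def mean(values):
    """Arithmetic mean, which is how k-means computes a center in one dimension."""
    return sum(values) / len(values)


# Hypothetical points: (age in years, income in dollars).
p = (25.0, 50_000.0)
q = (60.0, 50_500.0)  # very different age, similar income
r = (26.0, 52_000.0)  # similar age, moderately different income

d_pq = squared_distance(p, q)  # 35**2 + 500**2  = 251225.0
d_pr = squared_distance(p, r)  # 1**2 + 2000**2  = 4000001.0

# The income term swamps the age term, so q looks "closer" to p than r does,
# even though q differs from p by 35 years of age.
print(d_pq, d_pr)

# A single outlier drags the mean of a cluster away from the bulk of the data.
center_without_outlier = mean([1.0, 2.0, 3.0])        # 2.0
center_with_outlier = mean([1.0, 2.0, 3.0, 100.0])    # 26.5
print(center_without_outlier, center_with_outlier)
```

In this toy case the age difference contributes well under 1% of the distance between `p` and `q`, which is why features on very different scales are usually rescaled before clustering.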
Since outliers can cause a lot of problems, clustering itself can be leveraged to detect them.
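One common way to do this is to flag points that lie unusually far from their nearest cluster center. The following is a rough sketch rather than a definitive method: it assumes the cluster centers have already been fitted, and the function names, the sample points, and the cutoff rule (three times the median distance) are all illustrative assumptions.

```python
import math
from statistics import median


def nearest_center_distance(point, centers):
    """Euclidean distance from a point to its closest cluster center."""
    return min(math.dist(point, c) for c in centers)


def flag_outliers(points, centers, factor=3.0):
    """Return points whose distance to the nearest center exceeds
    factor * median distance. The cutoff rule is an assumption, not a standard."""
    distances = [nearest_center_distance(p, centers) for p in points]
    cutoff = factor * median(distances)
    return [p for p, d in zip(points, distances) if d > cutoff]


# Hypothetical, already-fitted centers and a small set of 2-D points.
centers = [(0.0, 0.0), (10.0, 10.0)]
points = [
    (0.5, -0.5), (-1.0, 0.0), (0.0, 1.0),    # near the first center
    (10.5, 9.5), (9.0, 10.0), (10.0, 11.0),  # near the second center
    (5.0, 30.0),                             # far from both centers
]

outliers = flag_outliers(points, centers)
print(outliers)  # [(5.0, 30.0)]
```

Note that if the outliers were present when the centers were fitted, the centers themselves may already be shifted, so the threshold usually needs tuning for the data at hand.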
As in the regression and classification cases, the input data can be normalized or standardized to mitigate these effects, which might improve accuracy. In some cases, however, it might be desirable ...