R Data Analysis Cookbook - Second Edition by Kuntal Ganguly

Introduction

Data analysts often encounter large, unlabeled datasets with many dimensions (features) and seek to reduce the complexity of the data by applying clustering or principal component analysis (PCA). Clustering is a data analysis technique used to discover groups of similar objects (close in terms of distance) or patterns in a dataset. Unlike supervised learning techniques (such as classification and regression), cluster analysis does not use labeled data; instead, it uses the similarity between observations, measured on their features, to group them into clusters. There are two standard clustering strategies: partitioning methods and hierarchical clustering.
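The two strategies named above can be sketched in a few lines of base R. This is a minimal illustration, not a recipe from the book: the built-in iris dataset, the choice of three clusters, and the complete-linkage method are all assumptions made for the example.

```r
# Illustrative sketch: built-in iris data with the species label dropped,
# so the input is unlabeled (the dataset choice is an assumption).
features <- scale(iris[, 1:4])   # standardize so no feature dominates the distance

# Partitioning method: k-means with k = 3 (k is picked here for illustration).
set.seed(42)                     # k-means starts from random centers
km <- kmeans(features, centers = 3, nstart = 25)

# Hierarchical clustering: build a tree from pairwise Euclidean distances,
# then cut it into the same number of groups.
distances <- dist(features, method = "euclidean")
tree <- hclust(distances, method = "complete")
hc_groups <- cutree(tree, k = 3)

# Compare the cluster sizes produced by the two strategies.
print(table(km$cluster))
print(table(hc_groups))
```

Both approaches rely only on distances between observations. The partitioning method needs the number of clusters up front, while the hierarchical tree can be cut at any level after it is built.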

PCA is a dimensionality reduction technique that transforms m-dimensional input space to n-dimensional ...
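As a rough sketch of dimensionality reduction with PCA, the following uses `prcomp` from base R to project four features onto two principal components. The iris dataset and the choice of keeping two components are assumptions made purely for illustration.

```r
# Illustrative sketch: four numeric features from the built-in iris data
# (the dataset and the number of retained components are assumptions).
features <- iris[, 1:4]

# Center and scale each feature, then compute the principal components.
pca <- prcomp(features, center = TRUE, scale. = TRUE)

# Keep the scores on the first two components: 4 dimensions reduced to 2.
reduced <- pca$x[, 1:2]

# Share of total variance captured by each component.
variance_share <- pca$sdev^2 / sum(pca$sdev^2)

print(dim(reduced))
print(round(cumsum(variance_share), 3))
```

The cumulative variance share is one common way to judge how many components are worth keeping.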
