8 Dimension Reduction Techniques in Distributional Semantics: An Application Specific Review

Pooja Kherwa1*, Jyoti Khurana2, Rahul Budhraj1, Sakshi Gill1, Shreyansh Sharma1 and Sonia Rathee1

1Department of Computer Science, Maharaja Surajmal Institute of Technology, New Delhi, India

2Department of Information Technology, Maharaja Surajmal Institute of Technology, New Delhi, India

Abstract

In recent years, datasets have grown very large and complex, and working with data that contains a huge number of features has become difficult and tedious. This is where dimensionality reduction comes into play. Dimensionality reduction is a pre-processing step in fields such as machine learning, data mining, and statistics, and it is effective at removing irrelevant and highly redundant data. In this paper, the authors performed an extensive literature survey, aiming to provide an application-based understanding of various dimensionality reduction techniques and to serve as a guide for choosing the right dimensionality reduction approach for better performance in different applications. The authors also performed detailed experiments on two different datasets to compare various linear and non-linear dimensionality reduction techniques and assess their effectiveness. PCA, a linear dimensionality reduction technique, outperformed all other techniques used in the experiments. In fact, almost all the linear dimensionality ...
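To make the idea of a linear reduction concrete, the following is a minimal sketch of PCA's first step: finding the single direction of greatest variance and projecting the data onto it. It is not the authors' experimental setup. The toy data and all function names (`center`, `covariance`, `first_component`, `project`) are assumptions made for illustration, and power iteration stands in for the full eigen-decomposition that a library implementation would typically use.

```python
import math
import random


def center(rows):
    """Subtract the per-feature mean from every row."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    return [[r[j] - means[j] for j in range(d)] for r in rows]


def covariance(centered):
    """Sample covariance matrix (d x d) of already-centered rows."""
    n, d = len(centered), len(centered[0])
    return [
        [sum(r[i] * r[j] for r in centered) / (n - 1) for j in range(d)]
        for i in range(d)
    ]


def first_component(cov, iters=500, seed=0):
    """Approximate the top eigenvector of cov by power iteration."""
    rng = random.Random(seed)
    d = len(cov)
    v = [rng.random() + 0.1 for _ in range(d)]  # arbitrary positive start
    for _ in range(iters):
        w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]  # renormalize to unit length
    return v


def project(centered, v):
    """Project each centered row onto the unit vector v (3 features -> 1)."""
    return [sum(r[j] * v[j] for j in range(len(v))) for r in centered]


# Hypothetical toy data: the second feature is roughly twice the first,
# and the third is nearly constant, so one direction carries most variance.
data = [
    [2.0, 4.1, 1.00],
    [1.0, 2.0, 1.10],
    [3.0, 5.9, 0.90],
    [4.0, 8.0, 1.00],
    [5.0, 10.1, 1.05],
]

centered = center(data)
cov = covariance(centered)
pc1 = first_component(cov)
scores = project(centered, pc1)

# Fraction of the total variance retained by the one-dimensional projection.
total_var = sum(cov[i][i] for i in range(len(cov)))
pc1_var = sum(s * s for s in scores) / (len(scores) - 1)
explained = pc1_var / total_var
print("first component:", pc1)
print("explained variance ratio:", explained)
```

Because the three toy features are strongly correlated, a single component retains almost all of the variance here. On real data the retained fraction depends on how much linear redundancy the features contain, which is one reason comparisons between linear and non-linear techniques are dataset-specific.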
