25 Integrating TF-IDF Features to Divide Amazon Product Reviews into Positive and Negative Groups

Ankit More1*, Abhishek Mishra2, Prakash Maravi3, Prathamesh Muzumdar4 and Abhishek Sharma2

1Department of Computer Science Engineering, Shri Vaishnav Vidyapeeth Vishwavidyalaya, Indore, Madhya Pradesh, India

2Shri Vaishnav Vidyapeeth Vishwavidyalaya, Indore, Madhya Pradesh, India

3CSE Department, Sati Polytechnic College, Vidisha (M.P.), India

4Department of Management, Suresh Gyan Vihar University, Jaipur, Rajasthan, India

Abstract

Many real-world applications benefit from text mining and related techniques, such as customer management in business intelligence systems, medical data retrieval, and fraud detection through language analysis. These applications use natural language processing and data mining to process the data and extract the necessary patterns. In this work, text mining algorithms are used to classify product reviews on Amazon. Such analysis can help consumers decide what to buy and, by providing feedback on the product, may also help its producer. The dataset of Amazon product reviews used here comes from Kaggle. The dataset is first preprocessed to remove stop words and special characters from the text. Next, several feature selection methods are applied. Possible keywords are selected from the reviews using the Part of Speech Tagging ...
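The preprocessing and TF-IDF weighting named in the abstract and title can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the chapter: the three sample reviews, the tiny stop-word list, and the helper names `preprocess` and `tfidf` are all hypothetical. It uses one common TF-IDF variant (term frequency normalized by document length, with idf = log(N / df)), which may differ from the weighting the authors apply.

```python
import math
import re
from collections import Counter

# Tiny illustrative stop-word list (an assumption; a real pipeline
# would use a much fuller list).
STOP_WORDS = {"a", "an", "the", "is", "it", "this", "and", "i", "to", "of"}


def preprocess(text):
    """Lowercase, replace special characters with spaces, drop stop words."""
    text = re.sub(r"[^a-z\s]", " ", text.lower())
    return [tok for tok in text.split() if tok not in STOP_WORDS]


def tfidf(docs):
    """Return one {term: weight} dict per tokenized document.

    Weight = (count / document length) * log(N / document frequency).
    """
    n = len(docs)
    # Document frequency: in how many documents each term appears.
    df = Counter(term for doc in docs for term in set(doc))
    weights = []
    for doc in docs:
        tf = Counter(doc)
        weights.append({
            term: (count / len(doc)) * math.log(n / df[term])
            for term, count in tf.items()
        })
    return weights


# Hypothetical sample reviews, not drawn from the Kaggle dataset.
reviews = [
    "This product is great, I love it!",
    "Terrible product... it broke after a day.",
    "Great value and great quality.",
]
tokens = [preprocess(r) for r in reviews]
vectors = tfidf(tokens)
```

Terms that occur in fewer reviews receive higher weights: in the first review, "love" appears only once across the sample and so outweighs "product", which appears in two of the three reviews. The resulting weight dictionaries could then serve as input features for a positive/negative classifier.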
