7 Feature Engineering and Selection Approach Over Malicious Image

P.M. Kavitha1* and B. Muruganantham2

1 Department of Computer Applications, SRM Institute of Science and Technology, Ramapuram, Chennai, India

2 Department of Computer Science and Engineering, SRM Institute of Science and Technology, Kattankulathur, Chennai, India

Abstract

Feature engineering transforms raw data into features that represent the underlying problem more effectively. Predictive models built on these features capture the problem better and, in turn, achieve higher accuracy on the data. Feature engineering uses domain knowledge to turn the input data into a form that machine learning algorithms can use. Various feature extraction techniques based on color, texture, shape, position, and edge are applied to a grayscale image to analyze the malware and its features. Because the malware image is a grayscale image, these feature extraction methods can be applied to the malware. An irrelevant or only partially relevant feature may degrade model performance. When a model is first designed, feature selection and data cleaning are therefore the first steps to attend to. The feature selection methods in common practice are Univariate Selection, Feature Importance, and Heatmap. Applying these methods can improve a model's results.

Keywords: Heatmap, malware variant, handling outliers, Binning, GAN
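
As a rough illustration of the univariate selection idea named in the abstract, the sketch below scores each feature independently against the class label and keeps the top-scoring ones. It is not the authors' implementation: the scoring rule (absolute Pearson correlation), the helper names pearson and select_k_best, the feature names, and the toy values are all assumptions made for this example.

```python
import math


def pearson(xs, ys):
    """Pearson correlation between two equal-length numeric sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant column carries no signal about the label
    return cov / math.sqrt(var_x * var_y)


def select_k_best(rows, labels, k):
    """Score every feature on its own and return the indices of the k best.

    The score is |Pearson correlation| with the label, which is only one of
    several possible univariate scores (an assumption of this sketch).
    """
    n_features = len(rows[0])
    scores = []
    for j in range(n_features):
        column = [row[j] for row in rows]
        scores.append(abs(pearson(column, labels)))
    ranked = sorted(range(n_features), key=lambda j: scores[j], reverse=True)
    return sorted(ranked[:k]), scores


# Hypothetical toy data; each row is [mean_intensity, edge_density, constant_pad]
# computed from a grayscale image. The values are made up for illustration.
rows = [
    [0.10, 0.80, 1.0],
    [0.20, 0.10, 1.0],
    [0.80, 0.70, 1.0],
    [0.90, 0.20, 1.0],
]
labels = [0, 0, 1, 1]  # 0 = benign, 1 = malicious (made-up labels)

selected, scores = select_k_best(rows, labels, k=1)
print("selected feature indices:", selected)
```

In this toy data only the first column varies with the label, so it is the one retained, while the constant column receives a score of zero. This mirrors the abstract's point that irrelevant features are candidates for removal.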

7.1 Introduction

For the construction of all the machine learning and deep learning models, the data is the most important and compromising ...
